probability distribution method: Topics by Science.gov

Sample records for probability distribution method

Digital simulation of two-dimensional random fields with arbitrary power spectra and non-Gaussian probability distribution functions.

PubMed

Yura, Harold T; Hanson, Steen G

2012-04-01

Methods for simulation of two-dimensional signals with arbitrary power spectral densities and signal amplitude probability density functions are disclosed. The method relies on initially transforming a white noise sample set of random Gaussian distributed numbers into a corresponding set with the desired spectral distribution, after which this colored Gaussian probability distribution is transformed via an inverse transform into the desired probability distribution. In most cases the method provides satisfactory results and can thus be considered an engineering approach. Several illustrative examples with relevance for optics are given.
Comparison of the different probability distributions for earthquake hazard assessment in the North Anatolian Fault Zone

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yilmaz, Şeyda, E-mail: seydayilmaz@ktu.edu.tr; Bayrak, Erdem, E-mail: erdmbyrk@gmail.com; Bayrak, Yusuf, E-mail: bayrak@ktu.edu.tr

In this study we examined and compared three probabilistic distribution methods to determine the most suitable model for the probabilistic assessment of earthquake hazards. We analyzed a reliable homogeneous earthquake catalogue for the period 1900-2015 for magnitude M ≥ 6.0 and estimated the probabilistic seismic hazard in the North Anatolian Fault zone (39°-41° N 30°-40° E) using three distribution methods, namely the Weibull distribution, the Frechet distribution and the three-parameter Weibull distribution. The suitability of the distribution parameters was evaluated with the Kolmogorov-Smirnov (K-S) goodness-of-fit test. We also compared the estimated cumulative probability and the conditional probabilities of occurrence of earthquakes for different elapsed times using these three distribution methods. We used Easyfit and Matlab software to calculate these distribution parameters and plotted the conditional probability curves. We concluded that the Weibull distribution method was more suitable than the other distribution methods in this region.
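The conditional probabilities compared in the abstract above can be illustrated with the two-parameter Weibull model. This is a minimal sketch, not the authors' Easyfit/Matlab workflow: the function names are hypothetical, and the shape and scale values are placeholders rather than fitted parameters from the study.

```python
import math


def weibull_cdf(t, shape, scale):
    """Two-parameter Weibull CDF, F(t) = 1 - exp(-(t / scale) ** shape)."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def conditional_probability(elapsed, window, shape, scale):
    """P(event within `window` years | no event during the last `elapsed` years)."""
    survive_now = 1.0 - weibull_cdf(elapsed, shape, scale)
    survive_later = 1.0 - weibull_cdf(elapsed + window, shape, scale)
    return (survive_now - survive_later) / survive_now


# Hypothetical parameters (NOT taken from the study): shape 2, scale 20 years.
for elapsed in (0, 5, 10, 20):
    p = conditional_probability(elapsed, 10, shape=2.0, scale=20.0)
    print(f"elapsed {elapsed:2d} yr -> P(event in next 10 yr) = {p:.3f}")
```

With a shape parameter above 1 the conditional probability grows with elapsed time; with shape equal to 1 the model reduces to the memoryless exponential case.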
Multinomial mixture model with heterogeneous classification probabilities

USGS Publications Warehouse

Holland, M.D.; Gray, B.R.

2011-01-01

Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct classification probability estimates when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. © 2010 Springer Science+Business Media, LLC.
A method to deconvolve stellar rotational velocities II. The probability distribution function via Tikhonov regularization

NASA Astrophysics Data System (ADS)

Christen, Alejandra; Escarate, Pedro; Curé, Michel; Rial, Diego F.; Cassetti, Julia

2016-10-01

Aims: Knowing the distribution of stellar rotational velocities is essential for understanding stellar evolution. Because we measure the projected rotational speed v sin i, we need to solve an ill-posed problem given by a Fredholm integral of the first kind to recover the "true" rotational velocity distribution. Methods: After discretization of the Fredholm integral we apply the Tikhonov regularization method to obtain directly the probability distribution function for stellar rotational velocities. We propose a simple and straightforward procedure to determine the Tikhonov parameter. We applied Monte Carlo simulations to prove that the Tikhonov method is a consistent estimator and asymptotically unbiased. Results: This method is applied to a sample of cluster stars. We obtain confidence intervals using a bootstrap method. Our results are in close agreement with those obtained using the Lucy method for recovering the probability density distribution of rotational velocities. Furthermore, Lucy estimation lies inside our confidence interval. Conclusions: Tikhonov regularization is a highly robust method that deconvolves the rotational velocity probability density function from a sample of v sin i data directly without the need for any convergence criteria.
Lognormal Approximations of Fault Tree Uncertainty Distributions.

PubMed

El-Shanawany, Ashraf Ben; Ardron, Keith H; Walker, Simon P

2018-01-26

Fault trees are used in reliability modeling to create logical models of fault combinations that can lead to undesirable events. The output of a fault tree analysis (the top event probability) is expressed in terms of the failure probabilities of basic events that are input to the model. Typically, the basic event probabilities are not known exactly, but are modeled as probability distributions: therefore, the top event probability is also represented as an uncertainty distribution. Monte Carlo methods are generally used for evaluating the uncertainty distribution, but such calculations are computationally intensive and do not readily reveal the dominant contributors to the uncertainty. In this article, a closed-form approximation for the fault tree top event uncertainty distribution is developed, which is applicable when the uncertainties in the basic events of the model are lognormally distributed. The results of the approximate method are compared with results from two sampling-based methods: namely, the Monte Carlo method and the Wilks method based on order statistics. It is shown that the closed-form expression can provide a reasonable approximation to results obtained by Monte Carlo sampling, without incurring the computational expense. The Wilks method is found to be a useful means of providing an upper bound for the percentiles of the uncertainty distribution while being computationally inexpensive compared with full Monte Carlo sampling. The lognormal approximation method and Wilks's method appear attractive, practical alternatives for the evaluation of uncertainty in the output of fault trees and similar multilinear models. © 2018 Society for Risk Analysis.
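Two of the ingredients named in the abstract above can be sketched briefly. This is not the article's closed-form expression for a full fault tree; it covers only the special case of a single AND gate (a product of independent lognormal basic events, which is exactly lognormal) and the standard first-order, one-sided Wilks sample-size formula. All function names and numerical values are illustrative assumptions.

```python
import math
from statistics import NormalDist


def and_gate_percentile(mus, sigmas, q):
    """q-th quantile of a product of independent lognormal basic events.

    The product of lognormals is itself lognormal, with log-mean sum(mus)
    and log-variance sum(sigma ** 2).
    """
    mu = sum(mus)
    sigma = math.sqrt(sum(s * s for s in sigmas))
    return math.exp(mu + NormalDist().inv_cdf(q) * sigma)


def wilks_sample_size(quantile=0.95, confidence=0.95):
    """Smallest n for which the sample maximum bounds `quantile` with `confidence`.

    Solves 1 - quantile ** n >= confidence for n.
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(quantile))


# Hypothetical basic events: medians 1e-3 and 2e-2, log-standard deviations 0.8 and 0.5.
mus = [math.log(1e-3), math.log(2e-2)]
sigmas = [0.8, 0.5]
median = and_gate_percentile(mus, sigmas, 0.5)
p95 = and_gate_percentile(mus, sigmas, 0.95)
runs = wilks_sample_size()  # model runs needed for a 95%/95% upper bound
print(median, p95, runs)
```

Sums of such products (OR gates over cut sets) are not exactly lognormal, which is where an approximation of the kind the article develops becomes necessary.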
Constructing inverse probability weights for continuous exposures: a comparison of methods.

PubMed

Naimi, Ashley I; Moodie, Erica E M; Auger, Nathalie; Kaufman, Jay S

2014-03-01

Inverse probability-weighted marginal structural models with binary exposures are common in epidemiology. Constructing inverse probability weights for a continuous exposure can be complicated by the presence of outliers, and the need to identify a parametric form for the exposure and account for nonconstant exposure variance. We explored the performance of various methods to construct inverse probability weights for continuous exposures using Monte Carlo simulation. We generated two continuous exposures and binary outcomes using data sampled from a large empirical cohort. The first exposure followed a normal distribution with homoscedastic variance. The second exposure followed a contaminated Poisson distribution, with heteroscedastic variance equal to the conditional mean. We assessed six methods to construct inverse probability weights using: a normal distribution, a normal distribution with heteroscedastic variance, a truncated normal distribution with heteroscedastic variance, a gamma distribution, a t distribution (1, 3, and 5 degrees of freedom), and a quantile binning approach (based on 10, 15, and 20 exposure categories). We estimated the marginal odds ratio for a single-unit increase in each simulated exposure in a regression model weighted by the inverse probability weights constructed using each approach, and then computed the bias and mean squared error for each method. For the homoscedastic exposure, the standard normal, gamma, and quantile binning approaches performed best. For the heteroscedastic exposure, the quantile binning, gamma, and heteroscedastic normal approaches performed best. Our results suggest that the quantile binning approach is a simple and versatile way to construct inverse probability weights for continuous exposures.
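The simplest of the six approaches listed above, a homoscedastic normal model for the exposure, can be sketched as follows. This is an illustration under stated assumptions, not the authors' simulation: a single confounder, a linear exposure model, stabilized weights of the form f(A) / f(A | L), and hypothetical function names and data.

```python
import random
from statistics import NormalDist, fmean, pstdev


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b, residual_sd)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return a, b, pstdev(residuals)


def stabilized_weights(exposure, confounder):
    """Stabilized inverse probability weights under a normal exposure model."""
    marginal = NormalDist(fmean(exposure), pstdev(exposure))  # numerator f(A)
    a, b, sd = fit_line(confounder, exposure)                 # denominator f(A | L)
    return [
        marginal.pdf(e) / NormalDist(a + b * c, sd).pdf(e)
        for e, c in zip(exposure, confounder)
    ]


# Simulated (hypothetical) data: the exposure depends linearly on one confounder.
rng = random.Random(0)
confounder = [rng.gauss(0.0, 1.0) for _ in range(5000)]
exposure = [0.5 * c + rng.gauss(0.0, 1.0) for c in confounder]
weights = stabilized_weights(exposure, confounder)
print(f"mean stabilized weight: {fmean(weights):.3f}")
```

A mean stabilized weight close to 1 is a common sanity check; the other approaches in the abstract change only how the two densities are modelled.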
Fast Reliability Assessing Method for Distribution Network with Distributed Renewable Energy Generation

NASA Astrophysics Data System (ADS)

Chen, Fan; Huang, Shaoxiong; Ding, Jinjin; Gao, Bo; Xie, Yuguang; Wang, Xiaoming

2018-01-01

This paper proposes a fast reliability assessment method for distribution grids with distributed renewable energy generation. First, the Weibull distribution and the Beta distribution are used to describe the probability distribution characteristics of wind speed and solar irradiance respectively, and models of the wind farm, solar park and local load are built for reliability assessment. Then, based on power system production cost simulation, probability discretization and linearized power flow, an optimal power flow problem with the objective of minimizing the cost of conventional power generation is solved. The reliability of the distribution grid is thus assessed quickly and accurately. The Loss Of Load Probability (LOLP) and Expected Energy Not Supplied (EENS) are selected as the reliability indices. A simulation of the IEEE RBTS BUS6 system in MATLAB indicates that the proposed method calculates the reliability indices much faster than the Monte Carlo method while maintaining accuracy.
The Estimation of Tree Posterior Probabilities Using Conditional Clade Probability Distributions

PubMed Central

Larget, Bret

2013-01-01

In this article I introduce the idea of conditional independence of separated subtrees as a principle by which to estimate the posterior probability of trees using conditional clade probability distributions rather than simple sample relative frequencies. I describe an algorithm for these calculations and software which implements these ideas. I show that these alternative calculations are very similar to simple sample relative frequencies for high probability trees but are substantially more accurate for relatively low probability trees. The method allows the posterior probability of unsampled trees to be calculated when these trees contain only clades that are in other sampled trees. Furthermore, the method can be used to estimate the total probability of the set of sampled trees which provides a measure of the thoroughness of a posterior sample. [Bayesian phylogenetics; conditional clade distributions; improved accuracy; posterior probabilities of trees.] PMID:23479066
Optimal methods for fitting probability distributions to propagule retention time in studies of zoochorous dispersal.

PubMed

Viana, Duarte S; Santamaría, Luis; Figuerola, Jordi

2016-02-01

Propagule retention time is a key factor in determining propagule dispersal distance and the shape of "seed shadows". Propagules dispersed by animal vectors are either ingested and retained in the gut until defecation or attached externally to the body until detachment. Retention time is a continuous variable, but it is commonly measured at discrete time points, according to pre-established sampling time-intervals. Although parametric continuous distributions have been widely fitted to these interval-censored data, the performance of different fitting methods has not been evaluated. To investigate the performance of five different fitting methods, we fitted parametric probability distributions to typical discretized retention-time data with known distribution using as data-points either the lower, mid or upper bounds of sampling intervals, as well as the cumulative distribution of observed values (using either maximum likelihood or non-linear least squares for parameter estimation); then compared the estimated and original distributions to assess the accuracy of each method. We also assessed the robustness of these methods to variations in the sampling procedure (sample size and length of sampling time-intervals). Fittings to the cumulative distribution performed better for all types of parametric distributions (lognormal, gamma and Weibull distributions) and were more robust to variations in sample size and sampling time-intervals. These estimated distributions had negligible deviations of up to 0.045 in cumulative probability of retention times (according to the Kolmogorov-Smirnov statistic) in relation to original distributions from which propagule retention time was simulated, supporting the overall accuracy of this fitting method. In contrast, fitting the sampling-interval bounds resulted in greater deviations that ranged from 0.058 to 0.273 in cumulative probability of retention times, which may introduce considerable biases in parameter estimates. 
We recommend the use of cumulative probability to fit parametric probability distributions to propagule retention time, specifically using maximum likelihood for parameter estimation. Furthermore, the experimental design for an optimal characterization of unimodal propagule retention time should contemplate at least 500 recovered propagules and sampling time-intervals not larger than the time peak of propagule retrieval, except in the tail of the distribution where broader sampling time-intervals may also produce accurate fits.
Predicting the probability of slip in gait: methodology and distribution study.

PubMed

Gragg, Jared; Yang, James

2016-01-01

The likelihood of a slip is related to the available and required friction for a certain activity, here gait. Classical slip and fall analysis presumed that a walking surface was safe if the difference between the mean available and required friction coefficients exceeded a certain threshold. Previous research was dedicated to reformulating the classical slip and fall theory to include the stochastic variation of the available and required friction when predicting the probability of slip in gait. However, when predicting the probability of a slip, previous researchers have either ignored the variation in the required friction or assumed the available and required friction to be normally distributed. Also, there are no published results that actually give the probability of slip for various combinations of required and available frictions. This study proposes a modification to the equation for predicting the probability of slip, reducing the previous equation from a double-integral to a more convenient single-integral form. Also, a simple numerical integration technique is provided to predict the probability of slip in gait: the trapezoidal method. The effect of the random variable distributions on the probability of slip is also studied. It is shown that both the required and available friction distributions cannot automatically be assumed as being normally distributed. The proposed methods allow for any combination of distributions for the available and required friction, and numerical results are compared to analytical solutions for an error analysis. The trapezoidal method is shown to be highly accurate and efficient. The probability of slip is also shown to be sensitive to the input distributions of the required and available friction. Lastly, a critical value for the probability of slip is proposed based on the number of steps taken by an average person in a single day.
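A single-integral form evaluated with the trapezoidal method can be sketched as below. The abstract does not give the authors' exact equation, so this assumes the standard form P(slip) = ∫ f_required(x) · F_available(x) dx, the probability that the available friction falls below the required friction. The function name and friction parameters are hypothetical, and normal distributions are used here only because they give an analytical value for checking.

```python
import math
from statistics import NormalDist


def slip_probability(required_pdf, available_cdf, lower, upper, steps=2000):
    """Trapezoidal estimate of the integral of required_pdf(x) * available_cdf(x)."""
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lower + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # end points count half
        total += weight * required_pdf(x) * available_cdf(x)
    return total * h


# Hypothetical friction coefficients (not taken from the study).
required = NormalDist(mu=0.20, sigma=0.03)
available = NormalDist(mu=0.35, sigma=0.05)

numeric = slip_probability(required.pdf, available.cdf, 0.0, 0.6)

# For two independent normals the result is known in closed form.
analytic = NormalDist().cdf(
    (required.mean - available.mean) / math.hypot(required.stdev, available.stdev)
)
print(numeric, analytic)
```

Because the function takes any density and any cumulative distribution function as callables, other distribution pairs can be substituted without changing the integration routine.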
Entropy Methods For Univariate Distributions in Decision Analysis

NASA Astrophysics Data System (ADS)

Abbas, Ali E.

2003-03-01

One of the most important steps in decision analysis practice is the elicitation of the decision-maker's belief about an uncertainty of interest in the form of a representative probability distribution. However, the probability elicitation process is a task that involves many cognitive and motivational biases. Alternatively, the decision-maker may provide other information about the distribution of interest, such as its moments, and the maximum entropy method can be used to obtain a full distribution subject to the given moment constraints. In practice however, decision makers cannot readily provide moments for the distribution, and are much more comfortable providing information about the fractiles of the distribution of interest or bounds on its cumulative probabilities. In this paper we present a graphical method to determine the maximum entropy distribution between upper and lower probability bounds and provide an interpretation for the shape of the maximum entropy distribution subject to fractile constraints, (FMED). We also discuss the problems with the FMED in that it is discontinuous and flat over each fractile interval. We present a heuristic approximation to a distribution if in addition to its fractiles, we also know it is continuous and work through full examples to illustrate the approach.
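The property noted above, that the FMED is flat over each fractile interval and discontinuous at the fractiles, can be made concrete with a short sketch. It assumes a bounded support whose end points carry cumulative probabilities 0 and 1; the function names and the fractile values are hypothetical.

```python
def fmed_density(points, cum_probs):
    """Piecewise-constant density heights, one per fractile interval."""
    heights = []
    for i in range(1, len(points)):
        mass = cum_probs[i] - cum_probs[i - 1]
        heights.append(mass / (points[i] - points[i - 1]))
    return heights


def fmed_cdf(x, points, cum_probs):
    """Piecewise-linear CDF implied by the flat density on each interval."""
    if x <= points[0]:
        return 0.0
    if x >= points[-1]:
        return 1.0
    for i in range(1, len(points)):
        if x <= points[i]:
            frac = (x - points[i - 1]) / (points[i] - points[i - 1])
            return cum_probs[i - 1] + frac * (cum_probs[i] - cum_probs[i - 1])


# Hypothetical elicited fractiles: support [0, 50], quartile at 10, 75th percentile at 20.
points = [0.0, 10.0, 20.0, 50.0]
cum_probs = [0.0, 0.25, 0.75, 1.0]
heights = fmed_density(points, cum_probs)
print(heights)
print(fmed_cdf(15.0, points, cum_probs))
```

The jumps between successive heights are the discontinuities that the heuristic continuous approximation mentioned in the abstract is meant to remove.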
A discussion on the origin of quantum probabilities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holik, Federico, E-mail: olentiev2@gmail.com; Departamento de Matemática - Ciclo Básico Común, Universidad de Buenos Aires - Pabellón III, Ciudad Universitaria, Buenos Aires; Sáenz, Manuel

We study the origin of quantum probabilities as arising from non-Boolean propositional-operational structures. We apply the method developed by Cox to non-distributive lattices and develop an alternative formulation of non-Kolmogorovian probability measures for quantum mechanics. By generalizing the method presented in previous works, we outline a general framework for the deduction of probabilities in general propositional structures represented by lattices (including the non-distributive case). Highlights: • Several recent works use a derivation similar to that of R.T. Cox to obtain quantum probabilities. • We apply Cox’s method to the lattice of subspaces of the Hilbert space. • We obtain a derivation of quantum probabilities which includes mixed states. • The method presented in this work is susceptible to generalization. • It includes quantum mechanics and classical mechanics as particular cases.
A probability space for quantum models

NASA Astrophysics Data System (ADS)

Lemmens, L. F.

2017-06-01

A probability space contains a set of outcomes, a collection of events formed by subsets of the set of outcomes and probabilities defined for all events. A reformulation in terms of propositions allows the maximum entropy method to be used to assign the probabilities, taking some constraints into account. The construction of a probability space for quantum models is determined by the choice of propositions, the choice of constraints and the probability assignment by the maximum entropy method. This approach shows how typical quantum distributions such as Maxwell-Boltzmann, Fermi-Dirac and Bose-Einstein are partly related to well-known classical distributions. The relation between the conditional probability density, given some averages as constraints, and the appropriate ensemble is elucidated.
Methods to elicit probability distributions from experts: a systematic review of reported practice in health technology assessment.

PubMed

Grigore, Bogdan; Peters, Jaime; Hyde, Christopher; Stein, Ken

2013-11-01

Elicitation is a technique that can be used to obtain probability distributions from experts about unknown quantities. We conducted a methodology review of reports where probability distributions had been elicited from experts to be used in model-based health technology assessments. Databases including MEDLINE, EMBASE and the CRD database were searched from inception to April 2013. Reference lists were checked and citation mapping was also used. Studies describing their approach to the elicitation of probability distributions were included. Data was abstracted on pre-defined aspects of the elicitation technique. Reports were critically appraised on their consideration of the validity, reliability and feasibility of the elicitation exercise. Fourteen articles were included. Across these studies, the most marked features were heterogeneity in elicitation approach and failure to report key aspects of the elicitation method. The most frequently used approaches to elicitation were the histogram technique and the bisection method. Only three papers explicitly considered the validity, reliability and feasibility of the elicitation exercises. Judged by the studies identified in the review, reports of expert elicitation are insufficient in detail and this impacts on the perceived usability of expert-elicited probability distributions. In this context, the wider credibility of elicitation will only be improved by better reporting and greater standardisation of approach. Until then, the advantage of eliciting probability distributions from experts may be lost.
Improved first-order uncertainty method for water-quality modeling

USGS Publications Warehouse

Melching, C.S.; Anmangandla, S.

1992-01-01

Uncertainties are unavoidable in water-quality modeling and subsequent management decisions. Monte Carlo simulation and first-order uncertainty analysis (involving linearization at central values of the uncertain variables) have been frequently used to estimate probability distributions for water-quality model output due to their simplicity. Each method has its drawbacks: the drawback of Monte Carlo simulation is mainly computational time, while those of first-order analysis are mainly questions of accuracy and representativeness, especially for nonlinear systems and extreme conditions. An improved (advanced) first-order method is presented, where the linearization point varies to match the output level whose exceedance probability is sought. The advanced first-order method is tested on the Streeter-Phelps equation to estimate the probability distribution of critical dissolved-oxygen deficit and critical dissolved oxygen using two hypothetical examples from the literature. The advanced first-order method provides a close approximation of the exceedance probability for the Streeter-Phelps model output estimated by Monte Carlo simulation using less computer time - by two orders of magnitude - regardless of the probability distributions assumed for the uncertain model parameters.
Polynomial probability distribution estimation using the method of moments

PubMed Central

Mattsson, Lars; Rydén, Jesper

2017-01-01

We suggest a procedure for estimating Nth degree polynomial approximations to unknown (or known) probability density functions (PDFs) based on N statistical moments from each distribution. The procedure is based on the method of moments and is set up algorithmically to aid applicability and to ensure rigor in use. In order to show applicability, polynomial PDF approximations are obtained for the distribution families Normal, Log-Normal, Weibull as well as for a bimodal Weibull distribution and a data set of anonymized household electricity use. The results are compared with results for traditional PDF series expansion methods of Gram–Charlier type. It is concluded that this procedure is a comparatively simple procedure that could be used when traditional distribution families are not applicable or when polynomial expansions of probability distributions might be considered useful approximations. In particular this approach is practical for calculating convolutions of distributions, since such operations become integrals of polynomial expressions. Finally, in order to show an advanced applicability of the method, it is shown to be useful for approximating solutions to the Smoluchowski equation. PMID:28394949
Polynomial probability distribution estimation using the method of moments.

PubMed

Munkhammar, Joakim; Mattsson, Lars; Rydén, Jesper

2017-01-01

We suggest a procedure for estimating Nth degree polynomial approximations to unknown (or known) probability density functions (PDFs) based on N statistical moments from each distribution. The procedure is based on the method of moments and is set up algorithmically to aid applicability and to ensure rigor in use. In order to show applicability, polynomial PDF approximations are obtained for the distribution families Normal, Log-Normal, Weibull as well as for a bimodal Weibull distribution and a data set of anonymized household electricity use. The results are compared with results for traditional PDF series expansion methods of Gram-Charlier type. It is concluded that this procedure is a comparatively simple procedure that could be used when traditional distribution families are not applicable or when polynomial expansions of probability distributions might be considered useful approximations. In particular this approach is practical for calculating convolutions of distributions, since such operations become integrals of polynomial expressions. Finally, in order to show an advanced applicability of the method, it is shown to be useful for approximating solutions to the Smoluchowski equation.
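The moment-matching idea can be sketched as a small linear system. This is an illustration under stated assumptions rather than the authors' algorithm: the polynomial p(x) = c_0 + c_1 x + ... + c_N x^N is assumed to live on a bounded interval [a, b], and its coefficients are chosen so that it integrates to 1 and reproduces the first N raw moments. The function names are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def polynomial_pdf_coefficients(moments, a, b):
    """Coefficients c_0..c_N of p(x) = sum(c_k * x ** k) on [a, b].

    `moments` holds the raw moments m_1..m_N; the normalization m_0 = 1 is added here.
    Row j of the system states: integral of x ** j * p(x) over [a, b] equals m_j.
    """
    m = [1.0] + list(moments)
    size = len(m)
    system = [
        [(b ** (j + k + 1) - a ** (j + k + 1)) / (j + k + 1) for k in range(size)]
        for j in range(size)
    ]
    return solve(system, m)


# Check on a known case: p(x) = 2x on [0, 1] has raw moments m_1 = 2/3 and m_2 = 1/2.
coeffs = polynomial_pdf_coefficients([2.0 / 3.0, 1.0 / 2.0], 0.0, 1.0)
print(coeffs)
```

Nothing in this system forces the polynomial to stay non-negative, and the matrix becomes ill-conditioned as N grows, so the sketch is only suitable for low degrees.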
The estimation of tree posterior probabilities using conditional clade probability distributions.

PubMed

Larget, Bret

2013-07-01

In this article I introduce the idea of conditional independence of separated subtrees as a principle by which to estimate the posterior probability of trees using conditional clade probability distributions rather than simple sample relative frequencies. I describe an algorithm for these calculations and software which implements these ideas. I show that these alternative calculations are very similar to simple sample relative frequencies for high probability trees but are substantially more accurate for relatively low probability trees. The method allows the posterior probability of unsampled trees to be calculated when these trees contain only clades that are in other sampled trees. Furthermore, the method can be used to estimate the total probability of the set of sampled trees which provides a measure of the thoroughness of a posterior sample.
A novel method for correcting scanline-observational bias of discontinuity orientation

PubMed Central

Huang, Lei; Tang, Huiming; Tan, Qinwen; Wang, Dingjian; Wang, Liangqing; Ez Eldin, Mutasim A. M.; Li, Changdong; Wu, Qiong

2016-01-01

Scanline observation is known to introduce an angular bias into the probability distribution of orientation in three-dimensional space. In this paper, numerical solutions expressing the functional relationship between the scanline-observational distribution (in one-dimensional space) and the inherent distribution (in three-dimensional space) are derived using probability theory and calculus under the independence hypothesis of dip direction and dip angle. Based on these solutions, a novel method for obtaining the inherent distribution (also for correcting the bias) is proposed, an approach which includes two procedures: 1) Correcting the cumulative probabilities of orientation according to the solutions, and 2) Determining the distribution of the corrected orientations using approximation methods such as the one-sample Kolmogorov-Smirnov test. The inherent distribution corrected by the proposed method can be used for discrete fracture network (DFN) modelling, which is applied to such areas as rockmass stability evaluation, rockmass permeability analysis, rockmass quality calculation and other related fields. To maximize the correction capacity of the proposed method, the observed sample size is suggested through effectiveness tests for different distribution types, dispersions and sample sizes. The performance of the proposed method and the comparison of its correction capacity with existing methods are illustrated with two case studies. PMID:26961249
Qualitative fusion technique based on information poor system and its application to factor analysis for vibration of rolling bearings

NASA Astrophysics Data System (ADS)

Xia, Xintao; Wang, Zhongyu

2008-10-01

For some methods of statistical stability analysis of a system, it is difficult to handle unknown probability distributions and small samples. A novel method is therefore proposed in this paper to resolve these problems. The method is independent of the probability distribution and is useful for small-sample systems. After rearrangement of the original data series, the order difference and two polynomial membership functions are introduced to estimate the true value, the lower bound and the upper bound of the system using fuzzy-set theory. The empirical distribution function is then investigated to ensure a confidence level above 95%, and the degree of similarity is presented to evaluate the stability of the system. Computer simulation cases investigate stable systems with various probability distributions, unstable systems with linear systematic errors and periodic systematic errors, and some mixed systems. The method of analysis for system stability is validated.

Statistical tests for whether a given set of independent, identically distributed draws comes from a specified probability density.

PubMed

Tygert, Mark

2010-09-21

We discuss several tests for determining whether a given set of independent and identically distributed (i.i.d.) draws does not come from a specified probability density function. The most commonly used are Kolmogorov-Smirnov tests, particularly Kuiper's variant, which focus on discrepancies between the cumulative distribution function for the specified probability density and the empirical cumulative distribution function for the given set of i.i.d. draws. Unfortunately, variations in the probability density function often get smoothed over in the cumulative distribution function, making it difficult to detect discrepancies in regions where the probability density is small in comparison with its values in surrounding regions. We discuss tests without this deficiency, complementing the classical methods. The tests of the present paper are based on the plain fact that it is unlikely to draw a random number whose probability is small, provided that the draw is taken from the same distribution used in calculating the probability (thus, if we draw a random number whose probability is small, then we can be confident that we did not draw the number from the same distribution used in calculating the probability).
Classic maximum entropy recovery of the average joint distribution of apparent FRET efficiency and fluorescence photons for single-molecule burst measurements.

PubMed

DeVore, Matthew S; Gull, Stephen F; Johnson, Carey K

2012-04-05

We describe a method for analysis of single-molecule Förster resonance energy transfer (FRET) burst measurements using classic maximum entropy. Classic maximum entropy determines the Bayesian inference for the joint probability describing the total fluorescence photons and the apparent FRET efficiency. The method was tested with simulated data and then with DNA labeled with fluorescent dyes. The most probable joint distribution can be marginalized to obtain both the overall distribution of fluorescence photons and the apparent FRET efficiency distribution. This method proves to be ideal for determining the distance distribution of FRET-labeled biomolecules, and it successfully predicts the shape of the recovered distributions.
Digital simulation of an arbitrary stationary stochastic process by spectral representation.

PubMed

Yura, Harold T; Hanson, Steen G

2011-04-01

In this paper we present a straightforward, efficient, and computationally fast method for creating a large number of discrete samples with an arbitrary given probability density function and a specified spectral content. The method relies on initially transforming a white noise sample set of random Gaussian distributed numbers into a corresponding set with the desired spectral distribution, after which this colored Gaussian probability distribution is transformed via an inverse transform into the desired probability distribution. In contrast to previous work, where the analyses were limited to autoregressive and/or iterative techniques to obtain satisfactory results, we find that a single application of the inverse transform method yields satisfactory results for a wide class of arbitrary probability distributions. Although a single application of the inverse transform technique does not conserve the power spectra exactly, it yields highly accurate numerical results for a wide range of probability distributions and target power spectra that are sufficient for system simulation purposes and can thus be regarded as an accurate engineering approximation, which can be used for a wide range of practical applications. A sufficiency condition is presented regarding the range of parameter values where a single application of the inverse transform method yields satisfactory agreement between the simulated and target power spectra, and a series of examples relevant for the optics community are presented and discussed. Outside this parameter range the agreement gracefully degrades but does not distort in shape. Although we demonstrate the method here focusing on stationary random processes, we see no reason why the method could not be extended to simulate non-stationary random processes. © 2011 Optical Society of America
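The two-step recipe in this abstract (color white Gaussian noise in the frequency domain, then map it through an inverse transform) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the Lorentzian target spectrum, the exponential target amplitude law, and all function names are hypothetical, and a plain O(n^2) DFT is used only to keep the sketch dependency-free.

```python
import cmath
import math
import random
from statistics import NormalDist


def dft(x, inverse=False):
    """Plain O(n^2) discrete Fourier transform (kept dependency-free)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def simulate(n, psd, inverse_cdf, seed=0):
    rng = random.Random(seed)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Step 1: colour the noise by scaling each frequency bin by sqrt(PSD).
    # Using |frequency| keeps the filter symmetric, so the output is real.
    freqs = [min(k, n - k) / n for k in range(n)]
    shaped = [z * math.sqrt(psd(f)) for z, f in zip(dft(white), freqs)]
    coloured = [z.real for z in dft(shaped, inverse=True)]
    # Rescale to zero mean and unit variance before applying the Gaussian CDF.
    mean = sum(coloured) / n
    std = math.sqrt(sum((c - mean) ** 2 for c in coloured) / n)
    gauss = NormalDist()
    samples = []
    for c in coloured:
        # Step 2: Gaussian -> uniform(0, 1) -> target amplitude distribution.
        u = min(max(gauss.cdf((c - mean) / std), 1e-12), 1.0 - 1e-12)
        samples.append(inverse_cdf(u))
    return samples


def lorentzian_psd(f):
    """Hypothetical low-pass target spectrum."""
    return 1.0 / (1.0 + (f / 0.05) ** 2)


def exponential_inverse_cdf(u):
    """Hypothetical target amplitude law: exponential with mean 1."""
    return -math.log(1.0 - u)


samples = simulate(128, lorentzian_psd, exponential_inverse_cdf)
```

As the abstract notes, the nonlinear second step does not preserve the power spectrum exactly, so the spectrum of `samples` only approximates the target.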
Burst wait time simulation of CALIBAN reactor at delayed super-critical state

DOE Office of Scientific and Technical Information (OSTI.GOV)

Humbert, P.; Authier, N.; Richard, B.

2012-07-01

In the past, the super prompt critical wait time probability distribution was measured on the CALIBAN fast burst reactor [4]. Afterwards, these experiments were simulated with a very good agreement by solving the non-extinction probability equation [5]. Recently, the burst wait time probability distribution has been measured at CEA-Valduc on CALIBAN at different delayed super-critical states [6]. However, in the delayed super-critical case the non-extinction probability does not give access to the wait time distribution. In this case it is necessary to compute the time dependent evolution of the full neutron count number probability distribution. In this paper we present the point model deterministic method used to calculate the probability distribution of the wait time before a prescribed count level taking into account prompt neutrons and delayed neutron precursors. This method is based on the solution of the time dependent adjoint Kolmogorov master equations for the number of detections using the generating function methodology [8,9,10] and inverse discrete Fourier transforms. The obtained results are then compared to the measurements and Monte-Carlo calculations based on the algorithm presented in [7]. (authors)
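The last step mentioned above, recovering a count distribution from its generating function with an inverse discrete Fourier transform, can be sketched in isolation. The example does not solve the adjoint Kolmogorov equations. It assumes a Poisson generating function purely as a stand-in whose answer is known, and the function names are hypothetical.

```python
import cmath
import math


def pmf_from_pgf(pgf, size):
    """Recover P(N = 0..size-1) from G(z) = sum_n P(N = n) z**n.

    G is sampled at the `size`-th roots of unity and an inverse DFT is applied.
    Probability mass at counts >= size aliases back, so `size` must be large
    enough for that mass to be negligible.
    """
    values = [pgf(cmath.exp(2j * math.pi * k / size)) for k in range(size)]
    pmf = []
    for n in range(size):
        s = sum(values[k] * cmath.exp(-2j * math.pi * k * n / size) for k in range(size))
        pmf.append((s / size).real)
    return pmf


def poisson_pgf(z, rate=2.0):
    """Stand-in generating function (Poisson), used only to check the inversion."""
    return cmath.exp(rate * (z - 1.0))


pmf = pmf_from_pgf(poisson_pgf, 32)
```

With a count distribution in hand at each time step, the probability of having reached a prescribed count level by that time follows by summing the upper tail.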
Metocean design parameter estimation for fixed platform based on copula functions

NASA Astrophysics Data System (ADS)

Zhai, Jinjin; Yin, Qilin; Dong, Sheng

2017-08-01

Considering the dependent relationship among wave height, wind speed, and current velocity, we construct novel trivariate joint probability distributions via Archimedean copula functions. A total of 30 years of wave height, wind speed, and current velocity data in the Bohai Sea are hindcast and sampled for a case study. Four kinds of distributions, namely, Gumbel distribution, lognormal distribution, Weibull distribution, and Pearson Type III distribution, are candidate models for marginal distributions of wave height, wind speed, and current velocity. The Pearson Type III distribution is selected as the optimal model. Bivariate and trivariate probability distributions of these environmental conditions are established based on four bivariate and trivariate Archimedean copulas, namely, Clayton, Frank, Gumbel-Hougaard, and Ali-Mikhail-Haq copulas. These joint probability models can maximize marginal information and the dependence among the three variables. The design return values of these three variables can be obtained by three methods: univariate probability, conditional probability, and joint probability. The joint return periods of different load combinations are estimated by the proposed models. Platform responses (including base shear, overturning moment, and deck displacement) are further calculated. For the same return period, the design values of wave height, wind speed, and current velocity obtained by the conditional and joint probability models are much smaller than those by univariate probability. Considering the dependence among variables, the multivariate probability distributions provide design parameters close to the actual sea state for ocean platform design.
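To make the notion of a joint return period concrete, here is a minimal bivariate sketch using the Gumbel-Hougaard copula, one of the four families named above. The study itself works with trivariate models. The dependence parameter `theta`, the marginal non-exceedance probabilities, and the function names are illustrative assumptions, and the "either exceeds" and "both exceed" definitions are the common textbook ones rather than necessarily those used by the authors.

```python
import math


def gumbel_hougaard(u, v, theta):
    """Gumbel-Hougaard copula C(u, v); theta >= 1, with theta = 1 meaning independence."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def joint_return_periods(u, v, theta, mu=1.0):
    """Return periods for 'either variable exceeds' and 'both variables exceed'.

    u and v are marginal non-exceedance probabilities; mu is the mean time
    between events (1.0 for annual maxima).
    """
    c = gumbel_hougaard(u, v, theta)
    t_or = mu / (1.0 - c)
    t_and = mu / (1.0 - u - v + c)
    return t_or, t_and


# Two variables each at their 100-year marginal level, with assumed theta = 2.
t_or, t_and = joint_return_periods(0.99, 0.99, theta=2.0)
```

The marginal 100-year level sits between the two joint return periods, which is why design values read off a joint model at a fixed return period differ from the univariate ones.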
An efficient distribution method for nonlinear transport problems in highly heterogeneous stochastic porous media

NASA Astrophysics Data System (ADS)

Ibrahima, Fayadhoi; Meyer, Daniel; Tchelepi, Hamdi

2016-04-01

Because geophysical data are inexorably sparse and incomplete, stochastic treatments of simulated responses are crucial to explore possible scenarios and assess risks in subsurface problems. In particular, nonlinear two-phase flows in porous media are essential, yet challenging, in reservoir simulation and hydrology. Adding highly heterogeneous and uncertain input, such as the permeability and porosity fields, transforms the estimation of the flow response into a tough stochastic problem for which computationally expensive Monte Carlo (MC) simulations remain the preferred option. We propose an alternative approach to evaluate the probability distribution of the (water) saturation for the stochastic Buckley-Leverett problem when the probability distributions of the permeability and porosity fields are available. We give a computationally efficient and numerically accurate method to estimate the one-point probability density (PDF) and cumulative distribution functions (CDF) of the (water) saturation. The distribution method draws inspiration from a Lagrangian approach of the stochastic transport problem and expresses the saturation PDF and CDF essentially in terms of a deterministic mapping and the distribution and statistics of scalar random fields. In a large class of applications these random fields can be estimated at low computational costs (few MC runs), thus making the distribution method attractive. Even though the method relies on a key assumption of fixed streamlines, we show that it performs well for high input variances, which is the case of interest. Once the saturation distribution is determined, any one-point statistics thereof can be obtained, especially the saturation average and standard deviation. Moreover, the probability of rare events and saturation quantiles (e.g. P10, P50 and P90) can be efficiently derived from the distribution method.
These statistics can then be used for risk assessment, as well as data assimilation and uncertainty reduction in the prior knowledge of input distributions. We provide various examples and comparisons with MC simulations to illustrate the performance of the method.
A comparison of numerical solutions of partial differential equations with probabilistic and possibilistic parameters for the quantification of uncertainty in subsurface solute transport.

PubMed

Zhang, Kejiang; Achari, Gopal; Li, Hua

2009-11-03

Traditionally, uncertainty in parameters is represented by probability distributions and incorporated into groundwater flow and contaminant transport models. With the advent of newer uncertainty theories, it is now understood that stochastic methods cannot properly represent non-random uncertainties. In the groundwater flow and contaminant transport equations, uncertainty in some parameters may be random, whereas that in others may be non-random. The objective of this paper is to develop a fuzzy-stochastic partial differential equation (FSPDE) model to simulate conditions where both random and non-random uncertainties are involved in groundwater flow and solute transport. Three potential solution techniques are proposed and compared: (a) transforming a probability distribution to a possibility distribution (Method I), so that the FSPDE becomes a fuzzy partial differential equation (FPDE); (b) transforming a possibility distribution to a probability distribution (Method II), so that the FSPDE becomes a stochastic partial differential equation (SPDE); and (c) combining Monte Carlo methods with FPDE solution techniques (Method III). The effects of these three methods on the predictive results are investigated using two case studies. The results show that the predictions obtained from Method II are a specific case of those obtained from Method I. When an exact probabilistic result is needed, Method II is suggested. As the loss or gain of information during a probability-possibility (or vice versa) transformation cannot be quantified, its influence on the predictive results is not known. Thus, Method III should probably be preferred for risk assessments.
The exact probability distribution of the rank product statistics for replicated experiments.

PubMed

Eisinga, Rob; Breitling, Rainer; Heskes, Tom

2013-03-18

The rank product method is a widely accepted technique for detecting differentially regulated genes in replicated microarray experiments. To approximate the sampling distribution of the rank product statistic, the original publication proposed a permutation approach, whereas recently an alternative approximation based on the continuous gamma distribution was suggested. However, both approximations are imperfect for estimating small tail probabilities. In this paper we relate the rank product statistic to number theory and provide a derivation of its exact probability distribution and the true tail probabilities. Copyright © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
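For very small problems the exact null distribution can be obtained by brute-force counting, which makes the quantity in this abstract concrete. The sketch below is not the number-theoretic derivation of the paper. It assumes that, under the null hypothesis, the rank of a gene in each of k replicates is independent and uniform on 1..n, and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def rank_product_pmf(n_genes, k_replicates):
    """Exact null distribution of the product of k independent uniform ranks in 1..n."""
    counts = Counter({1: 1})
    for _ in range(k_replicates):
        nxt = Counter()
        for product, ways in counts.items():
            for rank in range(1, n_genes + 1):
                nxt[product * rank] += ways
        counts = nxt
    total = n_genes ** k_replicates
    return {p: Fraction(w, total) for p, w in sorted(counts.items())}


def tail_probability(pmf, observed):
    """P(rank product <= observed): small products indicate consistently high ranking."""
    return sum(prob for product, prob in pmf.items() if product <= observed)


pmf = rank_product_pmf(3, 2)
print(tail_probability(pmf, 2))
```

The counting grows quickly with n and k, which is the practical reason approximations such as permutation sampling or the gamma distribution were used in the first place.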
Classic Maximum Entropy Recovery of the Average Joint Distribution of Apparent FRET Efficiency and Fluorescence Photons for Single-molecule Burst Measurements

PubMed Central

DeVore, Matthew S.; Gull, Stephen F.; Johnson, Carey K.

2012-01-01

We describe a method for analysis of single-molecule Förster resonance energy transfer (FRET) burst measurements using classic maximum entropy. Classic maximum entropy determines the Bayesian inference for the joint probability describing the total fluorescence photons and the apparent FRET efficiency. The method was tested with simulated data and then with DNA labeled with fluorescent dyes. The most probable joint distribution can be marginalized to obtain both the overall distribution of fluorescence photons and the apparent FRET efficiency distribution. This method proves to be ideal for determining the distance distribution of FRET-labeled biomolecules, and it successfully predicts the shape of the recovered distributions. PMID:22338694
Modeling highway travel time distribution with conditional probability models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oliveira Neto, Francisco Moraes; Chin, Shih-Miao; Hwang, Ho-Ling

Under the sponsorship of the Federal Highway Administration's Office of Freight Management and Operations, the American Transportation Research Institute (ATRI) has developed performance measures through the Freight Performance Measures (FPM) initiative. Under this program, travel speed information is derived from data collected using wireless based global positioning systems. These telemetric data systems are subscribed to and used by the trucking industry as an operations management tool. More than one telemetric operator submits their data dumps to ATRI on a regular basis. Each data transmission contains truck location, its travel time, and a clock time/date stamp. Data from the FPM program provide a unique opportunity for studying the upstream-downstream speed distributions at different locations, as well as different times of the day and days of the week. This research is focused on the stochastic nature of successive link travel speed data on the continental United States Interstates network. Specifically, a method to estimate route probability distributions of travel time is proposed. This method uses the concepts of convolution of probability distributions and bivariate, link-to-link, conditional probability to estimate the expected distributions for the route travel time. The major contribution of this study is the consideration of speed correlation between upstream and downstream contiguous Interstate segments through conditional probability. The established conditional probability distributions, between successive segments, can be used to provide travel time reliability measures. This study also suggests an adaptive method for calculating and updating route travel time distribution as new data or information is added. This methodology can be useful to estimate performance measures as required by the recent Moving Ahead for Progress in the 21st Century Act (MAP 21).
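The idea of combining convolution with link-to-link conditional probability can be sketched for two contiguous segments with discretized travel times. The probability tables below are invented for illustration and the function name is hypothetical. With independent links the inner table would be the same for every upstream value and the loop would reduce to an ordinary convolution.

```python
from collections import defaultdict


def route_time_distribution(p_first, p_second_given_first):
    """Distribution of T1 + T2 when T2 depends on T1 through a conditional table."""
    route = defaultdict(float)
    for t1, p1 in p_first.items():
        for t2, p2 in p_second_given_first[t1].items():
            route[t1 + t2] += p1 * p2
    return dict(sorted(route.items()))


# Hypothetical travel times in minutes: a slow upstream link makes a slow
# downstream link more likely (positive speed correlation).
upstream = {10: 0.7, 15: 0.3}
downstream_given_upstream = {
    10: {8: 0.9, 12: 0.1},
    15: {8: 0.2, 12: 0.8},
}
route = route_time_distribution(upstream, downstream_given_upstream)
print(route)
```

Longer routes can be handled by repeating the step link by link, provided each link is conditioned on the travel time of the link immediately upstream.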
An innovative method for offshore wind farm site selection based on the interval number with probability distribution

NASA Astrophysics Data System (ADS)

Wu, Yunna; Chen, Kaifeng; Xu, Hu; Xu, Chuanbo; Zhang, Haobo; Yang, Meng

2017-12-01

There is insufficient research relating to offshore wind farm site selection in China. The current methods for site selection have some defects. First, information loss is caused by two aspects: the implicit assumption that the probability distribution on the interval number is uniform; and ignoring the value of decision makers' (DMs') common opinion on the criteria information evaluation. Secondly, the difference in DMs' utility function has failed to receive attention. An innovative method is proposed in this article to solve these drawbacks. First, a new form of interval number and its weighted operator are proposed to reflect the uncertainty and reduce information loss. Secondly, a new stochastic dominance degree is proposed to quantify the interval number with a probability distribution. Thirdly, a two-stage method integrating the weighted operator with stochastic dominance degree is proposed to evaluate the alternatives. Finally, a case from China proves the effectiveness of this method.
Probability distributions for multimeric systems.

PubMed

Albert, Jaroslav; Rooman, Marianne

2016-01-01

We propose a fast and accurate method of obtaining the equilibrium mono-modal joint probability distributions for multimeric systems. The method necessitates only two assumptions: the copy number of all species of molecule may be treated as continuous; and, the probability density functions (pdf) are well-approximated by multivariate skew normal distributions (MSND). Starting from the master equation, we convert the problem into a set of equations for the statistical moments which are then expressed in terms of the parameters intrinsic to the MSND. Using an optimization package on Mathematica, we minimize a Euclidean distance function comprising a sum of the squared differences between the left and the right hand sides of these equations. Comparison of results obtained via our method with those rendered by the Gillespie algorithm demonstrates our method to be highly accurate as well as efficient.
Estimation of distributional parameters for censored trace level water quality data: 1. Estimation techniques

USGS Publications Warehouse

Gilliom, Robert J.; Helsel, Dennis R.

1986-01-01

A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.
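The log-probability regression method described above can be sketched for the simplest case of a single detection limit. This is a reading of the abstract rather than the authors' code: the Blom plotting positions used to assign z scores are an assumed choice, the data are invented, and the function name is hypothetical.

```python
import math
from statistics import NormalDist, mean


def log_probability_regression(uncensored, n_censored):
    """Fill in censored values by extrapolating a lognormal probability-plot line.

    Returns the full sample (filled-in censored values followed by the sorted
    uncensored values), from which the mean, standard deviation, median and
    interquartile range can then be computed.
    """
    n = len(uncensored) + n_censored
    norm = NormalDist()
    # Blom plotting positions (an assumed choice) give each rank a z score.
    z = [norm.inv_cdf((rank - 0.375) / (n + 0.25)) for rank in range(1, n + 1)]
    z_cens, z_unc = z[:n_censored], z[n_censored:]
    logs = [math.log(x) for x in sorted(uncensored)]
    # Least-squares line: log(concentration) = intercept + slope * z.
    zbar, lbar = mean(z_unc), mean(logs)
    sxy = sum((a - zbar) * (b - lbar) for a, b in zip(z_unc, logs))
    sxx = sum((a - zbar) ** 2 for a in z_unc)
    slope = sxy / sxx
    intercept = lbar - slope * zbar
    # Extrapolate the line to the lowest ranks, which the censored values occupy.
    filled = [math.exp(intercept + slope * a) for a in z_cens]
    return filled + sorted(uncensored)


# Hypothetical sample: five detected concentrations and three non-detects.
data = log_probability_regression([2.0, 3.0, 5.0, 8.0, 13.0], n_censored=3)
```

Only the censored ranks are replaced by model values. The detected concentrations enter the summary statistics unchanged.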
Estimation of distributional parameters for censored trace level water quality data. 1. Estimation Techniques

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gilliom, R.J.; Helsel, D.R.

1986-02-01

A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.
Estimation of distributional parameters for censored trace-level water-quality data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gilliom, R.J.; Helsel, D.R.

1984-01-01

A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water-sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best-performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least-squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification. 6 figs., 6 tabs.
The maximum entropy method of moments and Bayesian probability theory

NASA Astrophysics Data System (ADS)

Bretthorst, G. Larry

2013-08-01

The problem of density estimation occurs in many disciplines. For example, in MRI it is often necessary to classify the types of tissues in an image. To perform this classification one must first identify the characteristics of the tissues to be classified. These characteristics might be the intensity of a T1 weighted image and in MRI many other types of characteristic weightings (classifiers) may be generated. In a given tissue type there is no single intensity that characterizes the tissue, rather there is a distribution of intensities. Often this distribution can be characterized by a Gaussian, but just as often it is much more complicated. Either way, estimating the distribution of intensities is an inference problem. In the case of a Gaussian distribution, one must estimate the mean and standard deviation. However, in the non-Gaussian case the shape of the density function itself must be inferred. Three common techniques for estimating density functions are binned histograms [1, 2], kernel density estimation [3, 4], and the maximum entropy method of moments [5, 6]. In the introduction, the maximum entropy method of moments will be reviewed. Some of its problems and conditions under which it fails will be discussed. Then in later sections, the functional form of the maximum entropy method of moments probability distribution will be incorporated into Bayesian probability theory. It will be shown that Bayesian probability theory solves all of the problems with the maximum entropy method of moments. One gets posterior probabilities for the Lagrange multipliers, and, finally, one can put error bars on the resulting estimated density function.
Geodesic Monte Carlo on Embedded Manifolds

PubMed Central

Byrne, Simon; Girolami, Mark

2013-01-01

Markov chain Monte Carlo methods explicitly defined on the manifold of probability distributions have recently been established. These methods are constructed from diffusions across the manifold and the solution of the equations describing geodesic flows in the Hamilton–Jacobi representation. This paper takes the differential geometric basis of Markov chain Monte Carlo further by considering methods to simulate from probability distributions that themselves are defined on a manifold, with common examples being classes of distributions describing directional statistics. Proposal mechanisms are developed based on the geodesic flows over the manifolds of support for the distributions, and illustrative examples are provided for the hypersphere and Stiefel manifold of orthonormal matrices. PMID:25309024
Moment and maximum likelihood estimators for Weibull distributions under length- and area-biased sampling

Treesearch

Jeffrey H. Gove

2003-01-01

Many of the most popular sampling schemes used in forestry are probability proportional to size methods. These methods are also referred to as size biased because sampling is actually from a weighted form of the underlying population distribution. Length- and area-biased sampling are special cases of size-biased sampling where the probability weighting comes from a...
Regional probability distribution of the annual reference evapotranspiration and its effective parameters in Iran

NASA Astrophysics Data System (ADS)

Khanmohammadi, Neda; Rezaie, Hossein; Montaseri, Majid; Behmanesh, Javad

2017-10-01

The reference evapotranspiration (ET0) plays an important role in water management plans in arid or semi-arid countries such as Iran. For this reason, the regional analysis of this parameter is important. However, the ET0 process is affected by several meteorological parameters such as wind speed, solar radiation, temperature and relative humidity. Therefore, the effect of the distribution type of the effective meteorological variables on the ET0 distribution was analyzed. For this purpose, the regional probability distributions of the annual ET0 and its effective parameters were selected. The data used in this research were recorded at 30 synoptic stations of Iran during 1960-2014. Using the probability plot correlation coefficient (PPCC) test and the L-moment method, five common distributions were compared and the best distribution was selected. The results of the PPCC test and the L-moment diagram indicated that the Pearson type III distribution was the best probability distribution for fitting annual ET0 and its four effective parameters. The RMSE results showed that the ability of the PPCC test and the L-moment method for regional analysis of reference evapotranspiration and its effective parameters was similar. The results also showed that the distribution type of the parameters which affect ET0 values can affect the distribution of reference evapotranspiration.
Study on probability distributions for evolution in modified extremal optimization

NASA Astrophysics Data System (ADS)

Zeng, Guo-Qiang; Lu, Yong-Zai; Mao, Wei-Jie; Chu, Jian

2010-05-01

It is widely believed that the power law is a proper probability distribution for driving evolution in τ-EO (extremal optimization), a general-purpose stochastic local-search approach inspired by self-organized criticality, and in its applications to some NP-hard problems, e.g., graph partitioning, graph coloring, spin glass, etc. In this study, we find that exponential distributions or hybrid ones (e.g., power laws with exponential cutoff), which are popular in network science research, may replace the original power laws in a modified τ-EO method called self-organized algorithm (SOA), and provide better performance than other statistical-physics-oriented methods, such as simulated annealing, τ-EO and SOA, in experimental results on random Euclidean traveling salesman problems (TSP) and non-uniform instances. From the perspective of optimization, our results appear to demonstrate that the power law is not the only proper probability distribution for evolution in EO-like methods, at least for TSP; the exponential and hybrid distributions may be other choices.
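In τ-EO the components of a solution are ranked from worst to best and the rank to be updated is drawn from a distribution over ranks. The sketch below compares the three families named in the abstract. The parameter names `tau` and `mu`, their default values, and the function names are assumptions made for illustration, not values reported by the study.

```python
import math
import random


def rank_probabilities(n, kind, tau=1.5, mu=0.2):
    """Selection probabilities over ranks 1..n (rank 1 is the worst component)."""
    ranks = range(1, n + 1)
    if kind == "power":
        weights = [k ** (-tau) for k in ranks]
    elif kind == "exponential":
        weights = [math.exp(-mu * k) for k in ranks]
    elif kind == "hybrid":
        # Power law with exponential cutoff.
        weights = [(k ** (-tau)) * math.exp(-mu * k) for k in ranks]
    else:
        raise ValueError(f"unknown distribution kind: {kind}")
    total = sum(weights)
    return [w / total for w in weights]


def pick_rank(probabilities, rng):
    """Draw the rank of the component to be updated in the next move."""
    return rng.choices(range(1, len(probabilities) + 1), weights=probabilities)[0]


rng = random.Random(0)
chosen = pick_rank(rank_probabilities(10, "hybrid"), rng)
```

All three families favor the worst-ranked components. They differ in how much probability is left for well-ranked ones, which controls how often the search disturbs parts of the solution that already look good.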

Count data, detection probabilities, and the demography, dynamics, distribution, and decline of amphibians.

PubMed

Schmidt, Benedikt R

2003-08-01

The evidence for amphibian population declines is based on count data that were not adjusted for detection probabilities. Such data are not reliable even when collected using standard methods. The formula C = Np (where C is a count, N the true parameter value, and p is a detection probability) relates count data to demography, population size, or distributions. With unadjusted count data, one assumes a linear relationship between C and N and that p is constant. These assumptions are unlikely to be met in studies of amphibian populations. Amphibian population data should be based on methods that account for detection probabilities.
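A short worked example of the relation C = Np shows why unadjusted counts can mislead. The numbers are invented and the function name is hypothetical. The point is only that a change in detection probability p alone produces an apparent decline in C while N stays constant.

```python
def adjusted_abundance(count, detection_probability):
    """Estimate N from C = N * p, given a count C and a detection probability p."""
    if not 0.0 < detection_probability <= 1.0:
        raise ValueError("detection probability must be in (0, 1]")
    return count / detection_probability


# Hypothetical surveys of the same population (N = 100) in two years.
year_1 = adjusted_abundance(50, 0.5)    # 50 animals counted, p = 0.5
year_2 = adjusted_abundance(25, 0.25)   # 25 animals counted, p = 0.25
print(year_1, year_2)
```

The raw counts halve between the two surveys, yet the adjusted estimates are identical, so the apparent decline reflects detectability and not population size.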
Random Partition Distribution Indexed by Pairwise Information

PubMed Central

Dahl, David B.; Day, Ryan; Tsai, Jerry W.

2017-01-01

We propose a random partition distribution indexed by pairwise similarity information such that partitions compatible with the similarities are given more probability. The use of pairwise similarities, in the form of distances, is common in some clustering algorithms (e.g., hierarchical clustering), but we show how to use this type of information to define a prior partition distribution for flexible Bayesian modeling. A defining feature of the distribution is that it allocates probability among partitions within a given number of subsets, but it does not shift probability among sets of partitions with different numbers of subsets. Our distribution places more probability on partitions that group similar items yet keeps the total probability of partitions with a given number of subsets constant. The distribution of the number of subsets (and its moments) is available in closed-form and is not a function of the similarities. Our formulation has an explicit probability mass function (with a tractable normalizing constant) so the full suite of MCMC methods may be used for posterior inference. We compare our distribution with several existing partition distributions, showing that our formulation has attractive properties. We provide three demonstrations to highlight the features and relative performance of our distribution. PMID:29276318
Application of at-site peak-streamflow frequency analyses for very low annual exceedance probabilities

USGS Publications Warehouse

Asquith, William H.; Kiang, Julie E.; Cohn, Timothy A.

2017-07-17

The U.S. Geological Survey (USGS), in cooperation with the U.S. Nuclear Regulatory Commission, has investigated statistical methods for probabilistic flood hazard assessment to provide guidance on very low annual exceedance probability (AEP) estimation of peak-streamflow frequency and the quantification of corresponding uncertainties using streamgage-specific data. The term “very low AEP” implies exceptionally rare events defined as those having AEPs less than about 0.001 (or 1 × 10–3 in scientific notation or for brevity 10–3). Such low AEPs are of great interest to those involved with peak-streamflow frequency analyses for critical infrastructure, such as nuclear power plants. Flood frequency analyses at streamgages are most commonly based on annual instantaneous peak streamflow data and a probability distribution fit to these data. The fitted distribution provides a means to extrapolate to very low AEPs. Within the United States, the Pearson type III probability distribution, when fit to the base-10 logarithms of streamflow, is widely used, but other distribution choices exist. The USGS-PeakFQ software, implementing the Pearson type III within the Federal agency guidelines of Bulletin 17B (method of moments) and updates to the expected moments algorithm (EMA), was specially adapted for an “Extended Output” user option to provide estimates at selected AEPs from 10–3 to 10–6. Parameter estimation methods, in addition to product moments and EMA, include L-moments, maximum likelihood, and maximum product of spacings (maximum spacing estimation). This study comprehensively investigates multiple distributions and parameter estimation methods for two USGS streamgages (01400500 Raritan River at Manville, New Jersey, and 01638500 Potomac River at Point of Rocks, Maryland). 
The results of this study specifically involve the four methods for parameter estimation and up to nine probability distributions, including the generalized extreme value, generalized log-normal, generalized Pareto, and Weibull. Uncertainties in streamflow estimates for corresponding AEP are depicted and quantified as two primary forms: quantile (aleatoric [random sampling] uncertainty) and distribution-choice (epistemic [model] uncertainty). Sampling uncertainties of a given distribution are relatively straightforward to compute from analytical or Monte Carlo-based approaches. Distribution-choice uncertainty stems from choices of potentially applicable probability distributions for which divergence among the choices increases as AEP decreases. Conventional goodness-of-fit statistics, such as Cramér-von Mises, and L-moment ratio diagrams are demonstrated in order to hone distribution choice. The results generally show that distribution choice uncertainty is larger than sampling uncertainty for very low AEP values.
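As a rough illustration of the kind of fit described above, the following sketch fits a Pearson type III distribution to the base-10 logarithms of annual peaks by the method of moments and reads off a quantile for a chosen AEP. It is not the PeakFQ or EMA procedure: the frequency factor uses the Wilson-Hilferty approximation (which degrades for strong skew and very low AEPs), there is no regional skew weighting or low-outlier handling, and the function name and the sample record are hypothetical.

```python
import math
from statistics import NormalDist, mean


def lp3_quantile(peaks, aep):
    """Approximate the flow whose annual exceedance probability is `aep`.

    Fits a Pearson type III distribution to log10(peaks) by the method of
    moments and uses the Wilson-Hilferty approximation for the frequency
    factor. Illustrative only; not the Bulletin 17B/EMA algorithm.
    """
    logs = [math.log10(q) for q in peaks]
    n = len(logs)
    m = mean(logs)
    s = math.sqrt(sum((x - m) ** 2 for x in logs) / (n - 1))
    # Sample skew coefficient of the logarithms.
    g = n * sum((x - m) ** 3 for x in logs) / ((n - 1) * (n - 2) * s ** 3)
    # Standard normal deviate for the non-exceedance probability 1 - aep.
    z = NormalDist().inv_cdf(1.0 - aep)
    if abs(g) < 1e-6:
        k = z  # zero skew: Pearson III reduces to the normal case
    else:
        k = (2.0 / g) * ((1.0 + g * z / 6.0 - g * g / 36.0) ** 3 - 1.0)
    return 10.0 ** (m + k * s)


# Hypothetical annual peak record (arbitrary flow units).
example_peaks = [310.0, 450.0, 520.0, 610.0, 700.0, 820.0, 990.0, 1300.0, 1750.0, 2600.0]
for aep in (1e-2, 1e-3):
    print(aep, round(lp3_quantile(example_peaks, aep), 1))
```

Repeating the calculation with a different distribution family and comparing the quantiles at the same AEP gives a crude view of the distribution-choice uncertainty that the abstract reports as dominant at very low AEPs.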
Precipitation intensity probability distribution modelling for hydrological and construction design purposes

NASA Astrophysics Data System (ADS)

Koshinchanov, Georgy; Dimitrov, Dobri

2008-11-01

The characteristics of rainfall intensity are important for many purposes, including the design of sewage and drainage systems and the tuning of flood warning procedures. The relevant estimates are usually statistical estimates of the intensity of precipitation realized over a certain period of time (e.g. 5 or 10 min) with different return periods (e.g. 20 or 100 years). The traditional approach to evaluating these precipitation intensities is to process the pluviometer's records and fit a probability distribution to samples of intensities valid for certain locations or regions. Those estimates then become part of the state regulations to be used for various economic activities. Two problems occur with this approach: 1. Due to various factors the climate conditions change, and the precipitation intensity estimates need regular updating; 2. Because the extremes of the probability distribution are of particular importance in practice, the distribution-fitting methodology needs to pay specific attention to those parts of the distribution. The aim of this paper is to review the existing methodologies for processing intensive rainfalls and to refresh some of the statistical estimates for the studied areas. The methodologies used in Bulgaria for analyzing intensive rainfalls and producing the relevant statistical estimates are: the method of the maximum intensity, used in the National Institute of Meteorology and Hydrology to process and decode the pluviometer's records, followed by distribution fitting for each precipitation duration period; the same method, but with separate modeling of the probability distribution for the middle and high probability quantiles; a method similar to the first one, but with an intensity threshold of 0.36 mm/min; another method, proposed by the Russian hydrologist G. A. Aleksiev for regionalization of estimates over some territory, improved and adapted by S. Gerasimov for Bulgaria; and a method that considers only the intensive rainfalls (if any) during the day with the maximal annual daily precipitation total for a given year. Conclusions are drawn on the relevance and adequacy of the applied methods.
Multiobjective fuzzy stochastic linear programming problems with inexact probability distribution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hamadameen, Abdulqader Othman; Zainuddin, Zaitul Marlizawati

This study deals with multiobjective fuzzy stochastic linear programming problems with uncertain probability distributions, which are defined as fuzzy assertions by ambiguous experts. The problem formulation is presented, together with two solution strategies: the fuzzy transformation via a ranking function, and the stochastic transformation when the α-cut technique and linguistic hedges are used in the uncertain probability distribution. A development of Sen’s method is employed to find a compromise solution, supported by an illustrative numerical example.
Proposal of a method for evaluating tsunami risk using response-surface methodology

NASA Astrophysics Data System (ADS)

Fukutani, Y.

2017-12-01

Information on probabilistic tsunami inundation hazards is needed to define and evaluate tsunami risk. Several methods for calculating these hazards have been proposed (e.g. Løvholt et al. (2012), Thio (2012), Fukutani et al. (2014), Goda et al. (2015)). However, these methods are inefficient and computationally expensive because they require multiple tsunami numerical simulations, and they therefore lack versatility. In this study, we proposed a simpler method for tsunami risk evaluation using response-surface methodology. Kotani et al. (2016) proposed an evaluation method for the probabilistic distribution of tsunami wave-height using a response-surface methodology. We expanded their study and developed a probabilistic distribution of tsunami inundation depth. We set the depth (x1) and the slip (x2) of an earthquake fault as explanatory variables and tsunami inundation depth (y) as the objective variable. Subsequently, tsunami risk could be evaluated by conducting a Monte Carlo simulation, assuming that the generation probability of an earthquake follows a Poisson distribution, the probability distribution of tsunami inundation depth follows the distribution derived from a response-surface, and the damage probability of a target follows a log-normal distribution. We applied the proposed method to a wood building located on the coast of Tokyo Bay. We implemented a regression analysis based on the results of 25 tsunami numerical calculations and developed a response-surface, which was defined as y = a·x1 + b·x2 + c (a = 0.2615, b = 3.1763, c = −1.1802). We assumed appropriate probabilistic distributions for earthquake generation, inundation height, and vulnerability. Based on these probabilistic distributions, we conducted Monte Carlo simulations of 1,000,000 years. We clarified that the expected damage probability of the studied wood building is 22.5%, assuming that an earthquake occurs.
The proposed method is therefore a useful and simple way to evaluate tsunami risk using a response-surface and Monte Carlo simulation without conducting multiple tsunami numerical simulations.
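The workflow in this abstract can be sketched in a few lines. The response-surface coefficients below are the ones quoted in the abstract; everything else is an assumption made only for illustration: the sampling ranges for fault depth and slip, the log-normal fragility parameters (`median`, `beta`), and the function names. The sketch conditions on an earthquake having occurred, so the Poisson occurrence model is not simulated, and it is not expected to reproduce the 22.5% figure.

```python
import math
import random
from statistics import NormalDist

# Response-surface coefficients quoted in the abstract: y = a*x1 + b*x2 + c.
A, B, C = 0.2615, 3.1763, -1.1802


def inundation_depth(fault_depth, slip):
    """Evaluate the linear response surface for inundation depth."""
    return A * fault_depth + B * slip + C


def damage_probability(depth, median=2.0, beta=0.5):
    """Log-normal fragility curve; median and beta are placeholder values."""
    if depth <= 0.0:
        return 0.0
    return NormalDist().cdf(math.log(depth / median) / beta)


def expected_damage_given_event(n_events, rng):
    """Average damage probability over sampled fault parameters."""
    total = 0.0
    for _ in range(n_events):
        x1 = rng.uniform(5.0, 20.0)  # assumed range of fault depth
        x2 = rng.uniform(0.1, 2.0)   # assumed range of fault slip
        total += damage_probability(inundation_depth(x1, x2))
    return total / n_events


print(expected_damage_given_event(10_000, random.Random(0)))
```

Because each sample costs only one evaluation of the fitted surface, very long synthetic records become affordable once the initial set of numerical simulations has been used for the regression.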
Probability distribution functions for unit hydrographs with optimization using genetic algorithm

NASA Astrophysics Data System (ADS)

Ghorbani, Mohammad Ali; Singh, Vijay P.; Sivakumar, Bellie; H. Kashani, Mahsa; Atre, Atul Arvind; Asadi, Hakimeh

2017-05-01

A unit hydrograph (UH) of a watershed may be viewed as the unit pulse response function of a linear system. In recent years, the use of probability distribution functions (pdfs) for determining a UH has received much attention. In this study, a nonlinear optimization model is developed to transmute a UH into a pdf. The potential of six popular pdfs, namely two-parameter gamma, two-parameter Gumbel, two-parameter log-normal, two-parameter normal, three-parameter Pearson distribution, and two-parameter Weibull, is tested on data from the Lighvan catchment in Iran. The probability distribution parameters are determined using the nonlinear least squares optimization method in two ways: (1) optimization by programming in Mathematica; and (2) optimization by applying a genetic algorithm. The results are compared with those obtained by the traditional linear least squares method. The results show comparable capability and performance of the two nonlinear methods. The gamma and Pearson distributions are the most successful models in preserving the rising and recession limbs of the unit hydrographs. The log-normal distribution has a high ability in predicting both the peak flow and time to peak of the unit hydrograph. The nonlinear optimization method does not outperform the linear least squares method in determining the UH (especially for excess rainfall of one pulse), but is comparable.
An efficient distribution method for nonlinear transport problems in stochastic porous media

NASA Astrophysics Data System (ADS)

Ibrahima, F.; Tchelepi, H.; Meyer, D. W.

2015-12-01

Because geophysical data are inexorably sparse and incomplete, stochastic treatments of simulated responses are convenient to explore possible scenarios and assess risks in subsurface problems. In particular, understanding how uncertainties propagate in porous media with nonlinear two-phase flow is essential, yet challenging, in reservoir simulation and hydrology. We give a computationally efficient and numerically accurate method to estimate the one-point probability density (PDF) and cumulative distribution functions (CDF) of the water saturation for the stochastic Buckley-Leverett problem when the probability distributions of the permeability and porosity fields are available. The method draws inspiration from the streamline approach and expresses the distributions of interest essentially in terms of an analytically derived mapping and the distribution of the time of flight. In a large class of applications the latter can be estimated at low computational costs (even via conventional Monte Carlo). Once the water saturation distribution is determined, any one-point statistics thereof can be obtained, especially its average and standard deviation. Moreover, rarely available in other approaches, yet crucial information such as the probability of rare events and saturation quantiles (e.g. P10, P50 and P90) can be derived from the method. We provide various examples and comparisons with Monte Carlo simulations to illustrate the performance of the method.
Using type IV Pearson distribution to calculate the probabilities of underrun and overrun of lists of multiple cases.

PubMed

Wang, Jihan; Yang, Kai

2014-07-01

An efficient operating room needs both little underutilised and overutilised time to achieve optimal cost efficiency. The probabilities of underrun and overrun of lists of cases can be estimated by a well defined duration distribution of the lists. To propose a method of predicting the probabilities of underrun and overrun of lists of cases using Type IV Pearson distribution to support case scheduling. Six years of data were collected. The first 5 years of data were used to fit distributions and estimate parameters. The data from the last year were used as testing data to validate the proposed methods. The percentiles of the duration distribution of lists of cases were calculated by Type IV Pearson distribution and t-distribution. Monte Carlo simulation was conducted to verify the accuracy of percentiles defined by the proposed methods. Operating rooms in John D. Dingell VA Medical Center, United States, from January 2005 to December 2011. Differences between the proportion of lists of cases that were completed within the percentiles of the proposed duration distribution of the lists and the corresponding percentiles. Compared with the t-distribution, the proposed new distribution is 8.31% (0.38) more accurate on average and 14.16% (0.19) more accurate in calculating the probabilities at the 10th and 90th percentiles of the distribution, which is a major concern of operating room schedulers. The absolute deviations between the percentiles defined by Type IV Pearson distribution and those from Monte Carlo simulation varied from 0.20 min (0.01) to 0.43 min (0.03). Operating room schedulers can rely on the most recent 10 cases with the same combination of surgeon and procedure(s) for distribution parameter estimation to plan lists of cases. Values are mean (SEM). The proposed Type IV Pearson distribution is more accurate than t-distribution to estimate the probabilities of underrun and overrun of lists of cases. 
However, as not all the individual case durations followed log-normal distributions, there was some deviation from the true duration distribution of the lists.
Robust approaches to quantification of margin and uncertainty for sparse data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hund, Lauren; Schroeder, Benjamin B.; Rumsey, Kelin

Characterizing the tails of probability distributions plays a key role in quantification of margins and uncertainties (QMU), where the goal is characterization of low probability, high consequence events based on continuous measures of performance. When data are collected using physical experimentation, probability distributions are typically fit using statistical methods based on the collected data, and these parametric distributional assumptions are often used to extrapolate about the extreme tail behavior of the underlying probability distribution. In this project, we characterize the risk associated with such tail extrapolation. Specifically, we conducted a scaling study to demonstrate the large magnitude of the risk; then, we developed new methods for communicating risk associated with tail extrapolation from unvalidated statistical models; lastly, we proposed a Bayesian data-integration framework to mitigate tail extrapolation risk through integrating additional information. We conclude that decision-making using QMU is a complex process that cannot be achieved using statistical analyses alone.
Detecting background changes in environments with dynamic foreground by separating probability distribution function mixtures using Pearson's method of moments

NASA Astrophysics Data System (ADS)

Jenkins, Colleen; Jordan, Jay; Carlson, Jeff

2007-02-01

This paper presents parameter estimation techniques useful for detecting background changes in a video sequence with extreme foreground activity. A specific application of interest is automated detection of the covert placement of threats (e.g., a briefcase bomb) inside crowded public facilities. We propose that a histogram of pixel intensity acquired from a fixed mounted camera over time for a series of images will be a mixture of two Gaussian functions: the foreground probability distribution function and background probability distribution function. We will use Pearson's Method of Moments to separate the two probability distribution functions. The background function can then be "remembered" and changes in the background can be detected. Subsequent comparisons of background estimates are used to detect changes. Changes are flagged to alert security forces to the presence and location of potential threats. Results are presented that indicate the significant potential for robust parameter estimation techniques as applied to video surveillance.
An evaluation of procedures to estimate monthly precipitation probabilities

NASA Astrophysics Data System (ADS)

Legates, David R.

1991-01-01

Many frequency distributions have been used to evaluate monthly precipitation probabilities. Eight of these distributions (including Pearson type III, extreme value, and transform normal probability density functions) are comparatively examined to determine their ability to represent accurately variations in monthly precipitation totals for global hydroclimatological analyses. Results indicate that a modified version of the Box-Cox transform-normal distribution more adequately describes the 'true' precipitation distribution than does any of the other methods. This assessment was made using a cross-validation procedure for a global network of 253 stations for which at least 100 years of monthly precipitation totals were available.
Flood Frequency Curves - Use of information on the likelihood of extreme floods

NASA Astrophysics Data System (ADS)

Faber, B.

2011-12-01

Investment in the infrastructure that reduces flood risk for flood-prone communities must incorporate information on the magnitude and frequency of flooding in that area. Traditionally, that information has been a probability distribution of annual maximum streamflows developed from the historical gaged record at a stream site. Practice in the United States fits a log-Pearson type III distribution to the annual maximum flows of an unimpaired streamflow record, using the method of moments to estimate distribution parameters. The procedure makes the assumptions that annual peak streamflow events are (1) independent, (2) identically distributed, and (3) form a representative sample of the overall probability distribution. Each of these assumptions can be challenged. We rarely have enough data to form a representative sample, and therefore must compute and display the uncertainty in the estimated flood distribution. But is there a wet/dry cycle that makes precipitation less than independent between successive years? Are the peak flows caused by different types of events from different statistical populations? How do changes in the watershed or climate over time (non-stationarity) affect the probability distribution of floods? Potential approaches to avoid these assumptions vary from estimating trend and shift and removing them from early data (and so forming a homogeneous data set), to methods that estimate statistical parameters that vary with time. A further issue in estimating a probability distribution of flood magnitude (the flood frequency curve) is whether a purely statistical approach can accurately capture the range and frequency of floods that are of interest. A meteorologically-based analysis produces "probable maximum precipitation" (PMP) and subsequently a "probable maximum flood" (PMF) that attempts to describe an upper bound on flood magnitude in a particular watershed.
This analysis can help constrain the upper tail of the probability distribution, well beyond the range of gaged data or even historical or paleo-flood data, which can be very important in risk analyses performed for flood risk management and dam and levee safety studies.
Use of Bayesian Inference in Crystallographic Structure Refinement via Full Diffraction Profile Analysis

PubMed Central

Fancher, Chris M.; Han, Zhen; Levin, Igor; Page, Katharine; Reich, Brian J.; Smith, Ralph C.; Wilson, Alyson G.; Jones, Jacob L.

2016-01-01

A Bayesian inference method for refining crystallographic structures is presented. The distribution of model parameters is stochastically sampled using Markov chain Monte Carlo. Posterior probability distributions are constructed for all model parameters to properly quantify uncertainty by appropriately modeling the heteroskedasticity and correlation of the error structure. The proposed method is demonstrated by analyzing a National Institute of Standards and Technology silicon standard reference material. The results obtained by Bayesian inference are compared with those determined by Rietveld refinement. Posterior probability distributions of model parameters provide both estimates and uncertainties. The new method better estimates the true uncertainties in the model as compared to the Rietveld method. PMID:27550221
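To make the sampling step concrete, here is a minimal random-walk Metropolis sketch for a single parameter. It is a generic illustration and not the refinement code of the paper: the model (a constant mean with Gaussian noise of known, constant width and a flat prior) is an assumption chosen for brevity, so it does not model the heteroskedastic, correlated errors the abstract emphasises, and the data values and function names are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def log_posterior(mu, data, sigma=1.0):
    """Log posterior up to a constant: flat prior, Gaussian likelihood."""
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)


def metropolis(data, n_steps=20000, step=0.5, start=0.0, seed=0):
    """Random-walk Metropolis chain for the mean parameter mu."""
    rng = random.Random(seed)
    mu = start
    lp = log_posterior(mu, data)
    chain = []
    for _ in range(n_steps):
        proposal = mu + rng.gauss(0.0, step)
        lp_prop = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            mu, lp = proposal, lp_prop
        chain.append(mu)
    return chain


# Hypothetical measurements.
observations = [4.8, 5.1, 5.3, 4.9, 5.0, 5.2, 4.7, 5.0, 5.1, 4.9]
samples = metropolis(observations)[2000:]  # discard burn-in
print(mean(samples), stdev(samples))
```

The spread of the retained samples is the uncertainty estimate: the posterior distribution supplies both the parameter value and its uncertainty, which is the point the abstract makes when comparing against Rietveld refinement.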
A comparison of two methods for expert elicitation in health technology assessments.

PubMed

Grigore, Bogdan; Peters, Jaime; Hyde, Christopher; Stein, Ken

2016-07-26

When data needed to inform parameters in decision models are lacking, formal elicitation of expert judgement can be used to characterise parameter uncertainty. Although numerous methods for eliciting expert opinion as probability distributions exist, there is little research to suggest whether one method is more useful than any other method. This study had three objectives: (i) to obtain subjective probability distributions characterising parameter uncertainty in the context of a health technology assessment; (ii) to compare two elicitation methods by eliciting the same parameters in different ways; (iii) to collect subjective preferences of the experts for the different elicitation methods used. Twenty-seven clinical experts were invited to participate in an elicitation exercise to inform a published model-based cost-effectiveness analysis of alternative treatments for prostate cancer. Participants were individually asked to express their judgements as probability distributions using two different methods - the histogram and hybrid elicitation methods - presented in a random order. Individual distributions were mathematically aggregated across experts with and without weighting. The resulting combined distributions were used in the probabilistic analysis of the decision model and mean incremental cost-effectiveness ratios and the expected values of perfect information (EVPI) were calculated for each method, and compared with the original cost-effectiveness analysis. Scores on the ease of use of the two methods and the extent to which the probability distributions obtained from each method accurately reflected the expert's opinion were also recorded. Six experts completed the task. Mean ICERs from the probabilistic analysis ranged between £162,600-£175,500 per quality-adjusted life year (QALY) depending on the elicitation and weighting methods used. 
Compared to having no information, use of expert opinion decreased decision uncertainty: the EVPI value at the £30,000 per QALY threshold decreased by 74-86 % from the original cost-effectiveness analysis. Experts indicated that the histogram method was easier to use, but attributed a perception of more accuracy to the hybrid method. Inclusion of expert elicitation can decrease decision uncertainty. Here, choice of method did not affect the overall cost-effectiveness conclusions, but researchers intending to use expert elicitation need to be aware of the impact different methods could have.
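The mathematical aggregation step mentioned in this abstract can be illustrated with a linear opinion pool, in which each expert's histogram is averaged bin by bin, optionally with weights. The abstract does not state which pooling rule or weights were used, so the rule, the function name, and the numbers below are assumptions for illustration only.

```python
def linear_pool(expert_histograms, weights=None):
    """Combine per-expert bin probabilities by a (weighted) average.

    expert_histograms: list of lists, one probability per bin per expert.
    weights: optional list of non-negative expert weights (equal if None).
    """
    if weights is None:
        weights = [1.0] * len(expert_histograms)
    total = sum(weights)
    n_bins = len(expert_histograms[0])
    return [
        sum(w * h[b] for w, h in zip(weights, expert_histograms)) / total
        for b in range(n_bins)
    ]


# Two hypothetical experts, three bins each.
experts = [[0.2, 0.5, 0.3], [0.4, 0.4, 0.2]]
print(linear_pool(experts))              # equal weighting
print(linear_pool(experts, [3.0, 1.0]))  # first expert weighted more heavily
```

Dividing by the sum of the weights keeps the pooled histogram normalised whenever each expert's histogram sums to one.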
Probability Distributome: A Web Computational Infrastructure for Exploring the Properties, Interrelations, and Applications of Probability Distributions.

PubMed

Dinov, Ivo D; Siegrist, Kyle; Pearl, Dennis K; Kalinin, Alexandr; Christou, Nicolas

2016-06-01

Probability distributions are useful for modeling, simulation, analysis, and inference on varieties of natural processes and physical phenomena. There are uncountably many probability distributions. However, a few dozen families of distributions are commonly defined and are frequently used in practice for problem solving, experimental applications, and theoretical studies. In this paper, we present a new computational and graphical infrastructure, the Distributome , which facilitates the discovery, exploration and application of diverse spectra of probability distributions. The extensible Distributome infrastructure provides interfaces for (human and machine) traversal, search, and navigation of all common probability distributions. It also enables distribution modeling, applications, investigation of inter-distribution relations, as well as their analytical representations and computational utilization. The entire Distributome framework is designed and implemented as an open-source, community-built, and Internet-accessible infrastructure. It is portable, extensible and compatible with HTML5 and Web2.0 standards (http://Distributome.org). We demonstrate two types of applications of the probability Distributome resources: computational research and science education. The Distributome tools may be employed to address five complementary computational modeling applications (simulation, data-analysis and inference, model-fitting, examination of the analytical, mathematical and computational properties of specific probability distributions, and exploration of the inter-distributional relations). Many high school and college science, technology, engineering and mathematics (STEM) courses may be enriched by the use of modern pedagogical approaches and technology-enhanced methods. 
The Distributome resources provide enhancements for blended STEM education by improving student motivation, augmenting the classical curriculum with interactive webapps, and overhauling the learning assessment protocols.
Probability Distributome: A Web Computational Infrastructure for Exploring the Properties, Interrelations, and Applications of Probability Distributions

PubMed Central

Dinov, Ivo D.; Siegrist, Kyle; Pearl, Dennis K.; Kalinin, Alexandr; Christou, Nicolas

2015-01-01

Probability distributions are useful for modeling, simulation, analysis, and inference on varieties of natural processes and physical phenomena. There are uncountably many probability distributions. However, a few dozen families of distributions are commonly defined and are frequently used in practice for problem solving, experimental applications, and theoretical studies. In this paper, we present a new computational and graphical infrastructure, the Distributome, which facilitates the discovery, exploration and application of diverse spectra of probability distributions. The extensible Distributome infrastructure provides interfaces for (human and machine) traversal, search, and navigation of all common probability distributions. It also enables distribution modeling, applications, investigation of inter-distribution relations, as well as their analytical representations and computational utilization. The entire Distributome framework is designed and implemented as an open-source, community-built, and Internet-accessible infrastructure. It is portable, extensible and compatible with HTML5 and Web2.0 standards (http://Distributome.org). We demonstrate two types of applications of the probability Distributome resources: computational research and science education. The Distributome tools may be employed to address five complementary computational modeling applications (simulation, data-analysis and inference, model-fitting, examination of the analytical, mathematical and computational properties of specific probability distributions, and exploration of the inter-distributional relations). Many high school and college science, technology, engineering and mathematics (STEM) courses may be enriched by the use of modern pedagogical approaches and technology-enhanced methods. 
The Distributome resources provide enhancements for blended STEM education by improving student motivation, augmenting the classical curriculum with interactive webapps, and overhauling the learning assessment protocols. PMID:27158191
Maximum-likelihood fitting of data dominated by Poisson statistical uncertainties

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stoneking, M.R.; Den Hartog, D.J.

1996-06-01

The fitting of data by χ²-minimization is valid only when the uncertainties in the data are normally distributed. When analyzing spectroscopic or particle counting data at very low signal level (e.g., a Thomson scattering diagnostic), the uncertainties are distributed with a Poisson distribution. The authors have developed a maximum-likelihood method for fitting data that correctly treats the Poisson statistical character of the uncertainties. This method maximizes the total probability that the observed data are drawn from the assumed fit function using the Poisson probability function to determine the probability for each data point. The algorithm also returns uncertainty estimates for the fit parameters. They compare this method with a χ²-minimization routine applied to both simulated and real data. Differences in the returned fits are greater at low signal level (less than approximately 20 counts per measurement). The maximum-likelihood method is found to be more accurate and robust, returning a narrower distribution of values for the fit parameters with fewer outliers.
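A small sketch can show why the two approaches diverge at low counts. It fits only the amplitude of a model with a known shape, maximising the Poisson likelihood by a golden-section search and comparing the result with a chi-square fit that uses the observed counts as variances. This is an illustration under simplifying assumptions, not the authors' algorithm: it has one free parameter, returns no uncertainty estimates, and the function names, the variance floor of 1 for empty bins, and the count data are all hypothetical.

```python
import math


def poisson_nll(counts, model):
    """Negative Poisson log-likelihood (the constant ln n! term is dropped)."""
    return sum(m - n * math.log(m) for n, m in zip(counts, model))


def golden_min(f, lo, hi, tol=1e-9):
    """Golden-section search for the minimum of a unimodal function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)


def fit_amplitude_ml(counts, shape, lo=1e-6, hi=1e3):
    """Amplitude a that maximises the Poisson likelihood of a * shape."""
    return golden_min(lambda a: poisson_nll(counts, [a * s for s in shape]), lo, hi)


def fit_amplitude_chi2(counts, shape):
    """Closed-form chi-square amplitude with variances max(n_i, 1)."""
    num = sum(s * n / max(n, 1) for s, n in zip(shape, counts))
    den = sum(s * s / max(n, 1) for s, n in zip(shape, counts))
    return num / den


# Hypothetical low-count data with a flat expected shape.
counts = [3, 0, 2, 5, 1, 4]
shape = [1.0] * len(counts)
print(fit_amplitude_ml(counts, shape), fit_amplitude_chi2(counts, shape))
```

For this one-parameter model the maximum-likelihood amplitude equals the total counts divided by the summed shape, whereas the count-weighted chi-square estimate is pulled toward the low-count bins, which is consistent with the differences at low signal level that the abstract reports.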
SU-F-T-450: The Investigation of Radiotherapy Quality Assurance and Automatic Treatment Planning Based On the Kernel Density Estimation Method

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fan, J; Fan, J; Hu, W

Purpose: To develop a fast automatic algorithm based on two-dimensional kernel density estimation (2D KDE) to predict the dose-volume histogram (DVH), which can be employed for the investigation of radiotherapy quality assurance and automatic treatment planning. Methods: We propose a machine learning method that uses previous treatment plans to predict the DVH. The key to the approach is the framing of the DVH in a probabilistic setting. The training consists of estimating, from the patients in the training set, the joint probability distribution of the dose and the predictive features. The joint distribution provides an estimation of the conditional probability of the dose given the values of the predictive features. For the new patient, the prediction consists of estimating the distribution of the predictive features and marginalizing the conditional probability from the training over this. Integrating the resulting probability distribution for the dose yields an estimation of the DVH. The 2D KDE is implemented to predict the joint probability distribution of the training set and the distribution of the predictive features for the new patient. Two variables, the signed minimal distance from each OAR (organ at risk) voxel to the target boundary and its opening angle with respect to the origin of the voxel coordinate, are considered as the predictive features to represent the OAR-target spatial relationship. The feasibility of our method has been demonstrated with rectum, breast and head-and-neck cancer cases by comparing the predicted DVHs with the planned ones. Results: Consistent results were found between these two DVHs for each cancer, and the average of the relative point-wise differences is about 5%, within the clinically acceptable extent. Conclusion: According to the results of this study, our method can be used to predict a clinically acceptable DVH and is able to evaluate the quality and consistency of treatment planning.
q-Gaussian distributions and multiplicative stochastic processes for analysis of multiple financial time series

NASA Astrophysics Data System (ADS)

Sato, Aki-Hiro

2010-12-01

This study considers q-Gaussian distributions and stochastic differential equations with both multiplicative and additive noises. In the M-dimensional case a q-Gaussian distribution can be theoretically derived as a stationary probability distribution of the multiplicative stochastic differential equation with both mutually independent multiplicative and additive noises. By using the proposed stochastic differential equation a method to evaluate a default probability under a given risk buffer is proposed.

Time-dependent landslide probability mapping

USGS Publications Warehouse

Campbell, Russell H.; Bernknopf, Richard L.

1993-01-01

Case studies where the time of failure is known for rainfall-triggered debris flows can be used to estimate the parameters of a hazard model in which the probability of failure is a function of time. As an example, a time-dependent function for the conditional probability of a soil slip is estimated from independent variables representing hillside morphology, approximations of material properties, and the duration and rate of rainfall. If probabilities are calculated in a GIS (geographic information system) environment, the spatial distribution of the result for any given hour can be displayed on a map. Although the probability levels in this example are uncalibrated, the method offers a potential for evaluating different physical models and different earth-science variables by comparing the map distribution of predicted probabilities with inventory maps for different areas and different storms. If linked with spatial and temporal socio-economic variables, this method could be used for short-term risk assessment.
Generalized Wishart Mixtures for Unsupervised Classification of PolSAR Data

NASA Astrophysics Data System (ADS)

Li, Lan; Chen, Erxue; Li, Zengyuan

2013-01-01

This paper presents an unsupervised clustering algorithm based upon the expectation maximization (EM) algorithm for finite mixture modelling, using the complex Wishart probability density function (PDF) for the probabilities. The mixture model makes it possible to represent heterogeneous thematic classes that cannot be fitted well by the unimodal Wishart distribution. To make the calculation fast and robust, we use the recently proposed generalized gamma distribution (GΓD) for the single polarization intensity data to make the initial partition. Then we use the Wishart probability density function for the corresponding sample covariance matrix to calculate the posterior class probabilities for each pixel. The posterior class probabilities are used for the prior probability estimates of each class and as weights for all class parameter updates. The proposed method is evaluated and compared with the Wishart H-Alpha-A classification. Preliminary results show that the proposed method has better performance.
Nuclear risk analysis of the Ulysses mission

NASA Astrophysics Data System (ADS)

Bartram, Bart W.; Vaughan, Frank R.; Englehart, Richard W.

An account is given of the method used to quantify the risks accruing to the use of a radioisotope thermoelectric generator fueled by Pu-238 dioxide aboard the Space Shuttle-launched Ulysses mission. After using a Monte Carlo technique to develop probability distributions for the radiological consequences of a range of accident scenarios throughout the mission, factors affecting those consequences are identified in conjunction with their probability distributions. The functional relationship among all the factors is then established, and probability distributions for all factor effects are combined by means of a Monte Carlo technique.
A new method for estimating the usual intake of episodically-consumed foods with application to their distribution

PubMed Central

Midthune, Douglas; Dodd, Kevin W.; Freedman, Laurence S.; Krebs-Smith, Susan M.; Subar, Amy F.; Guenther, Patricia M.; Carroll, Raymond J.; Kipnis, Victor

2007-01-01

Objective: We propose a new statistical method that uses information from two 24-hour recalls (24HRs) to estimate usual intake of episodically-consumed foods. Statistical Analyses Performed: The method developed at the National Cancer Institute (NCI) accommodates the large number of non-consumption days that arise with foods by separating the probability of consumption from the consumption-day amount, using a two-part model. Covariates, such as sex, age, race, or information from a food frequency questionnaire (FFQ), may supplement the information from two or more 24HRs using correlated mixed model regression. The model allows for correlation between the probability of consuming a food on a single day and the consumption-day amount. Percentiles of the distribution of usual intake are computed from the estimated model parameters. Results: The Eating at America's Table Study (EATS) data are used to illustrate the method to estimate the distribution of usual intake for whole grains and dark green vegetables for men and women and the distribution of usual intakes of whole grains by educational level among men. A simulation study indicates that the NCI method leads to substantial improvement over existing methods for estimating the distribution of usual intake of foods. Applications/Conclusions: The NCI method provides distinct advantages over previously proposed methods by accounting for the correlation between probability of consumption and amount consumed and by incorporating covariate information. Researchers interested in estimating the distribution of usual intakes of foods for a population or subpopulation are advised to work with a statistician and incorporate the NCI method in analyses. PMID:17000190
Fourier Method for Calculating Fission Chain Neutron Multiplicity Distributions

DOE PAGES

Chambers, David H.; Chandrasekaran, Hema; Walston, Sean E.

2017-03-27

Here, a new way of utilizing the fast Fourier transform is developed to compute the probability distribution for a fission chain to create n neutrons. We then extend this technique to compute the probability distributions for detecting n neutrons. Lastly, our technique can be used for fission chains initiated by either a single neutron inducing a fission or by the spontaneous fission of another isotope.
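As a generic illustration of how a Fourier transform can recover a counting distribution (not the authors' algorithm), the sketch below samples a probability generating function (PGF) on the unit circle and applies an inverse discrete Fourier transform. The function name pmf_from_pgf is hypothetical, the binomial PGF in the usage note is only a stand-in for a fission-chain PGF, and a plain O(N^2) transform is used where an FFT would normally be substituted.

```python
import cmath

def pmf_from_pgf(pgf, n_points):
    """Recover P(N = n) for n = 0..n_points-1 from a PGF G(z) = sum_n p_n z^n.

    Assumes the distribution has negligible mass at n >= n_points;
    otherwise the result is aliased.
    """
    # Sample the PGF at the n_points-th roots of unity.
    samples = [pgf(cmath.exp(2j * cmath.pi * k / n_points)) for k in range(n_points)]
    pmf = []
    for n in range(n_points):
        # Inverse DFT: p_n = (1/N) * sum_k G(w^k) * w^(-k n).
        total = sum(
            samples[k] * cmath.exp(-2j * cmath.pi * k * n / n_points)
            for k in range(n_points)
        )
        pmf.append((total / n_points).real)
    return pmf
```

For example, pmf_from_pgf(lambda z: (0.5 + 0.5 * z) ** 4, 8) recovers the binomial probabilities for four fair trials, padded with zeros.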
Imprecise Probability Methods for Weapons UQ

DOE Office of Scientific and Technical Information (OSTI.GOV)

Picard, Richard Roy; Vander Wiel, Scott Alan

Building on recent work in uncertainty quantification, we examine the use of imprecise probability methods to better characterize expert knowledge and to improve on misleading aspects of Bayesian analysis with informative prior distributions. Quantitative approaches to incorporate uncertainties in weapons certification are subject to rigorous external peer review, and in this regard, certain imprecise probability methods are well established in the literature and attractive. These methods are illustrated using experimental data from LANL detonator impact testing.
Exact probability distribution functions for Parrondo's games

NASA Astrophysics Data System (ADS)

Zadourian, Rubina; Saakian, David B.; Klümper, Andreas

2016-12-01

We study the discrete time dynamics of Brownian ratchet models and Parrondo's games. Using the Fourier transform, we calculate the exact probability distribution functions for both the capital dependent and history dependent Parrondo's games. In certain cases we find strong oscillations near the maximum of the probability distribution with two limiting distributions for odd and even number of rounds of the game. Indications of such oscillations first appeared in the analysis of real financial data, but now we have found this phenomenon in model systems and a theoretical understanding of the phenomenon. The method of our work can be applied to Brownian ratchets, molecular motors, and portfolio optimization.
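The odd/even structure of the distribution is easy to see by propagating the probabilities directly. The sketch below does this for a capital-dependent game; it uses step-by-step propagation rather than the Fourier approach of the paper, and the rules (game A wins with probability 1/2 - eps; game B wins with 1/10 - eps when the capital is divisible by 3 and 3/4 - eps otherwise; strict A, B alternation) are the commonly quoted textbook choice, assumed here rather than taken from the abstract.

```python
def parrondo_distribution(n_rounds, eps=0.005, start=0):
    """Exact capital distribution after playing games A, B, A, B, ... in turn.

    Returns a dict mapping capital -> probability. Rules and the value of
    eps are assumptions (textbook capital-dependent Parrondo game).
    """
    dist = {start: 1.0}
    for r in range(n_rounds):
        new = {}
        for capital, prob in dist.items():
            if r % 2 == 0:            # game A: a slightly unfair coin
                p_win = 0.5 - eps
            elif capital % 3 == 0:    # game B, "bad" coin
                p_win = 0.1 - eps
            else:                     # game B, "good" coin
                p_win = 0.75 - eps
            new[capital + 1] = new.get(capital + 1, 0.0) + prob * p_win
            new[capital - 1] = new.get(capital - 1, 0.0) + prob * (1.0 - p_win)
        dist = new
    return dist
```

Because each round changes the capital by exactly one unit, all probability mass sits on capitals with the same parity as the number of rounds, which is why odd and even round counts give separate limiting distributions.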
A Simple Method for Estimating Informative Node Age Priors for the Fossil Calibration of Molecular Divergence Time Analyses

PubMed Central

Nowak, Michael D.; Smith, Andrew B.; Simpson, Carl; Zwickl, Derrick J.

2013-01-01

Molecular divergence time analyses often rely on the age of fossil lineages to calibrate node age estimates. Most divergence time analyses are now performed in a Bayesian framework, where fossil calibrations are incorporated as parametric prior probabilities on node ages. It is widely accepted that an ideal parameterization of such node age prior probabilities should be based on a comprehensive analysis of the fossil record of the clade of interest, but there is currently no generally applicable approach for calculating such informative priors. We provide here a simple and easily implemented method that employs fossil data to estimate the likely amount of missing history prior to the oldest fossil occurrence of a clade, which can be used to fit an informative parametric prior probability distribution on a node age. Specifically, our method uses the extant diversity and the stratigraphic distribution of fossil lineages confidently assigned to a clade to fit a branching model of lineage diversification. Conditioning this on a simple model of fossil preservation, we estimate the likely amount of missing history prior to the oldest fossil occurrence of a clade. The likelihood surface of missing history can then be translated into a parametric prior probability distribution on the age of the clade of interest. We show that the method performs well with simulated fossil distribution data, but that the likelihood surface of missing history can at times be too complex for the distribution-fitting algorithm employed by our software tool. An empirical example of the application of our method is performed to estimate echinoid node ages. A simulation-based sensitivity analysis using the echinoid data set shows that node age prior distributions estimated under poor preservation rates are significantly less informative than those estimated under high preservation rates. PMID:23755303
Option volatility and the acceleration Lagrangian

NASA Astrophysics Data System (ADS)

Baaquie, Belal E.; Cao, Yang

2014-01-01

This paper develops a volatility formula for an option on an asset from an acceleration Lagrangian model, and the formula is calibrated with market data. The Black-Scholes model is a simpler case that has a velocity dependent Lagrangian. The acceleration Lagrangian is defined, and the classical solution of the system in Euclidean time is obtained by choosing proper boundary conditions. The conditional probability distribution of final position given the initial position is obtained from the transition amplitude. The volatility is the standard deviation of the conditional probability distribution. Using the conditional probability and the path integral method, the martingale condition is applied, and one of the parameters in the Lagrangian is fixed. The call option price is obtained using the conditional probability and the path integral method.
Geotechnical parameter spatial distribution stochastic analysis based on multi-precision information assimilation

NASA Astrophysics Data System (ADS)

Wang, C.; Rubin, Y.

2014-12-01

The spatial distribution of the compression modulus Es, an important geotechnical parameter, contributes considerably to understanding the underlying geological processes and to adequately assessing the effects of Es on the differential settlement of large continuous structure foundations. These analyses should be derived using an assimilating approach that combines in-situ static cone penetration tests (CPT) with borehole experiments. To achieve this, the Es distribution of a silty clay stratum in region A of the China Expo Center (Shanghai) is studied using the Bayesian maximum entropy method. This method rigorously and efficiently integrates geotechnical investigations of different precision and sources of uncertainty. Single CPT samplings were modeled as a rational probability density curve by maximum entropy theory. The spatial prior multivariate probability density function (PDF) and the likelihood PDF of the CPT positions were built from borehole experiments and the potential value of the prediction point; then, by numerical integration over the CPT probability density curves, the posterior probability density curve of the prediction point was calculated within the Bayesian reverse interpolation framework. The results of Gaussian sequential stochastic simulation and the Bayesian method were compared. The differences between modeling single CPT samplings with a normal distribution and with a simulated probability density curve based on maximum entropy theory were also discussed. It is shown that the study of Es spatial distributions can be improved by properly incorporating CPT sampling variation into the interpolation process, and that more informative estimates are generated by considering CPT uncertainty at the estimation points. The calculation illustrates the significance of stochastic Es characterization in a stratum and identifies limitations associated with inadequate geostatistical interpolation techniques. These characterization results will provide a multi-precision information assimilation method for other geotechnical parameters.
Fractional Gaussian model in global optimization

NASA Astrophysics Data System (ADS)

Dimri, V. P.; Srivastava, R. P.

2009-12-01

The Earth system is inherently nonlinear, and it can be characterized well if we incorporate nonlinearity in the formulation and solution of the problem. A general tool often used for characterization of the Earth system is inversion. Traditionally, inverse problems are solved using least-squares-based inversion by linearizing the formulation. The initial model in such inversion schemes is often assumed to follow a posterior Gaussian probability distribution. It is now well established that most of the physical properties of the Earth follow a power law (fractal distribution). Thus, selecting the initial model based on a power-law probability distribution will provide a more realistic solution. We present a new method which can draw samples of the posterior probability density function very efficiently using fractal-based statistics. The application of the method has been demonstrated by inverting band-limited seismic data with well control. We used a fractal-based probability density function, which uses the mean, variance and Hurst coefficient of the model space, to draw the initial model. This initial model is then used in a global optimization inversion scheme. Inversion results using initial models generated by our method give higher-resolution estimates of the model parameters than the hitherto used gradient-based linear inversion method.
Optimizing probability of detection point estimate demonstration

NASA Astrophysics Data System (ADS)

Koshti, Ajay M.

2017-04-01

The paper discusses optimizing probability of detection (POD) demonstration experiments using the point estimate method. The optimization is performed to provide an acceptable value for the probability of passing the demonstration (PPD) and to achieve an acceptable value for the probability of false (POF) calls while keeping the flaw sizes in the set as small as possible. The POD point estimate method is used by NASA for qualifying special NDE procedures. The point estimate method uses the binomial distribution for the probability density. Normally, a set of 29 flaws of the same size within some tolerance is used in the demonstration. Traditionally, the largest flaw size in the set is considered to be a conservative estimate of the flaw size with minimum 90% probability and 95% confidence. The flaw size is denoted as α90/95PE. The paper investigates the relationship between the range of flaw sizes and α90, i.e. the 90% probability flaw size, needed to provide a desired PPD. The range of flaw sizes is expressed as a proportion of the standard deviation of the probability density distribution. The difference between the median or average of the 29 flaws and α90 is also expressed as a proportion of the standard deviation of the probability density distribution. In general, it is concluded that, if probability of detection increases with flaw size, the average of the 29 flaw sizes would always be larger than or equal to α90 and is an acceptable measure of α90/95PE. If the NDE technique has sufficient sensitivity and signal-to-noise ratio, then the 29-flaw set can be optimized to meet requirements of minimum required PPD, maximum allowable POF, requirements on flaw size tolerance about the mean flaw size and flaw size detectability requirements. The paper provides a procedure for optimizing flaw sizes in the point estimate demonstration flaw set.
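The link between a 29-flaw set and the 90%/95% statement follows from the binomial distribution. The sketch below assumes the demonstration is passed only when every flaw in the set is detected (a pass criterion not stated in the abstract); the function name prob_pass is hypothetical.

```python
from math import comb

def prob_pass(pod, n_flaws=29, max_misses=0):
    """Binomial probability of passing a demonstration.

    pod        -- true probability of detecting a single flaw
    n_flaws    -- number of flaws in the demonstration set
    max_misses -- misses tolerated (0 means all flaws must be found; an assumption)
    """
    return sum(
        comb(n_flaws, m) * (1.0 - pod) ** m * pod ** (n_flaws - m)
        for m in range(max_misses + 1)
    )
```

With a true POD of exactly 0.90, prob_pass(0.9) is below 0.05, whereas prob_pass(0.9, n_flaws=28) is above it. Under this assumed criterion, 29 is therefore the smallest set for which a pass supports POD of at least 90% at 95% confidence.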
DOE Office of Scientific and Technical Information (OSTI.GOV)

Diwaker, E-mail: diwakerphysics@gmail.com; Chakraborty, Aniruddha

The Smoluchowski equation with a time-dependent sink term is solved exactly. In this method, knowing the probability distribution P(0, s) at the origin allows the probability distribution P(x, s) at all positions to be derived. Exact solutions of the Smoluchowski equation are also provided in different cases where the sink term has linear, constant, inverse, and exponential variation in time.
Methods for Combining Payload Parameter Variations with Input Environment

NASA Technical Reports Server (NTRS)

Merchant, D. H.; Straayer, J. W.

1975-01-01

Methods are presented for calculating design limit loads compatible with probabilistic structural design criteria. The approach is based on the concept that the desired limit load, defined as the largest load occurring in a mission, is a random variable having a specific probability distribution which may be determined from extreme-value theory. The design limit load, defined as a particular value of this random limit load, is the value conventionally used in structural design. Methods are presented for determining the limit load probability distributions from both time-domain and frequency-domain dynamic load simulations. Numerical demonstrations of the methods are also presented.
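The idea of treating the largest load in a mission as a random variable can be illustrated with the exact distribution of a maximum: if a mission contains n independent load peaks with common CDF F, the largest peak has CDF F(x)^n. The sketch below assumes independent, normally distributed peaks, which is a simplification chosen for illustration and not the report's load model; design_limit_load is a hypothetical name.

```python
from statistics import NormalDist

def design_limit_load(mean, sd, n_peaks, non_exceedance):
    """Load value that the largest of n_peaks peaks stays below with the
    given probability, assuming i.i.d. normal peaks (an assumption).

    Solves F(x) ** n_peaks = non_exceedance for x.
    """
    per_peak = non_exceedance ** (1.0 / n_peaks)
    return NormalDist(mean, sd).inv_cdf(per_peak)
```

For a fixed non-exceedance probability, the returned value grows with the number of peaks in the mission, which is the basic reason a limit load must be tied to mission length.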
Shallow slip amplification and enhanced tsunami hazard unravelled by dynamic simulations of mega-thrust earthquakes

PubMed Central

Murphy, S.; Scala, A.; Herrero, A.; Lorito, S.; Festa, G.; Trasatti, E.; Tonini, R.; Romano, F.; Molinari, I.; Nielsen, S.

2016-01-01

The 2011 Tohoku earthquake produced an unexpectedly large amount of shallow slip, greatly contributing to the ensuing tsunami. How frequent are such events? How can they be efficiently modelled for tsunami hazard? Stochastic slip models, which can be computed rapidly, are used to explore the natural slip variability; however, they generally do not deal specifically with shallow slip features. We study the systematic depth-dependence of slip along a thrust fault with a number of 2D dynamic simulations using stochastic shear stress distributions and a geometry based on the cross section of the Tohoku fault. We obtain a probability density for the slip distribution, which varies with depth, earthquake size and whether the rupture breaks the surface. We propose a method to modify stochastic slip distributions according to this dynamically-derived probability distribution. This method may be efficiently applied to produce large numbers of heterogeneous slip distributions for probabilistic tsunami hazard analysis. Using numerous M9 earthquake scenarios, we demonstrate that incorporating the dynamically-derived probability distribution does enhance the conditional probability of exceedance of maximum estimated tsunami wave heights along the Japanese coast. This technique for integrating dynamic features in stochastic models can be extended to any subduction zone and faulting style. PMID:27725733
Neighbor-Dependent Ramachandran Probability Distributions of Amino Acids Developed from a Hierarchical Dirichlet Process Model

PubMed Central

Mitra, Rajib; Jordan, Michael I.; Dunbrack, Roland L.

2010-01-01

Distributions of the backbone dihedral angles of proteins have been studied for over 40 years. While many statistical analyses have been presented, only a handful of probability densities are publicly available for use in structure validation and structure prediction methods. The available distributions differ in a number of important ways, which determine their usefulness for various purposes. These include: 1) input data size and criteria for structure inclusion (resolution, R-factor, etc.); 2) filtering of suspect conformations and outliers using B-factors or other features; 3) secondary structure of input data (e.g., whether helix and sheet are included; whether beta turns are included); 4) the method used for determining probability densities ranging from simple histograms to modern nonparametric density estimation; and 5) whether they include nearest neighbor effects on the distribution of conformations in different regions of the Ramachandran map. In this work, Ramachandran probability distributions are presented for residues in protein loops from a high-resolution data set with filtering based on calculated electron densities. Distributions for all 20 amino acids (with cis and trans proline treated separately) have been determined, as well as 420 left-neighbor and 420 right-neighbor dependent distributions. The neighbor-independent and neighbor-dependent probability densities have been accurately estimated using Bayesian nonparametric statistical analysis based on the Dirichlet process. In particular, we used hierarchical Dirichlet process priors, which allow sharing of information between densities for a particular residue type and different neighbor residue types. The resulting distributions are tested in a loop modeling benchmark with the program Rosetta, and are shown to improve protein loop conformation prediction significantly. The distributions are available at http://dunbrack.fccc.edu/hdp. PMID:20442867
Methods for fitting a parametric probability distribution to most probable number data.

PubMed

Williams, Michael S; Ebel, Eric D

2012-07-02

Every year hundreds of thousands, if not millions, of samples are collected and analyzed to assess microbial contamination in food and water. The concentration of pathogenic organisms at the end of the production process is low for most commodities, so a highly sensitive screening test is used to determine whether the organism of interest is present in a sample. In some applications, samples that test positive are subjected to quantitation. The most probable number (MPN) technique is a common method to quantify the level of contamination in a sample because it is able to provide estimates at low concentrations. This technique uses a series of dilution count experiments to derive estimates of the concentration of the microorganism of interest. An application for these data is food-safety risk assessment, where the MPN concentration estimates can be fitted to a parametric distribution to summarize the range of potential exposures to the contaminant. Many different methods (e.g., substitution methods, maximum likelihood and regression on order statistics) have been proposed to fit microbial contamination data to a distribution, but the development of these methods rarely considers how the MPN technique influences the choice of distribution function and fitting method. An often overlooked aspect when applying these methods is whether the data represent actual measurements of the average concentration of microorganism per milliliter or the data are real-valued estimates of the average concentration, as is the case with MPN data. In this study, we propose two methods for fitting MPN data to a probability distribution. The first method uses a maximum likelihood estimator that takes average concentration values as the data inputs. The second is a Bayesian latent variable method that uses the counts of the number of positive tubes at each dilution to estimate the parameters of the contamination distribution. The performance of the two fitting methods is compared for two data sets that represent Salmonella and Campylobacter concentrations on chicken carcasses. The results demonstrate a bias in the maximum likelihood estimator that increases with reductions in average concentration. The Bayesian method provided unbiased estimates of the concentration distribution parameters for all data sets. We provide computer code for the Bayesian fitting method.
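For readers unfamiliar with how a single MPN value arises from tube counts, the sketch below computes the standard maximum likelihood concentration for one dilution series, assuming organisms are Poisson-distributed so that a tube of volume v is positive with probability 1 - exp(-c v). It illustrates the MPN technique itself, not either of the two fitting methods proposed in the paper; the function name mpn_estimate and the bisection bounds are assumptions.

```python
import math

def mpn_estimate(volumes, n_tubes, n_positive):
    """Maximum likelihood concentration from a dilution series.

    volumes    -- sample amount per tube at each dilution
    n_tubes    -- tubes inoculated at each dilution
    n_positive -- positive tubes at each dilution
    """
    total_pos = sum(n_positive)
    if total_pos == 0 or total_pos == sum(n_tubes):
        raise ValueError("no interior maximum when no tube or every tube is positive")
    target = sum(n * v for n, v in zip(n_tubes, volumes))

    def score(c):
        # Likelihood equation: sum_i x_i v_i / (1 - exp(-c v_i)) = sum_i n_i v_i.
        lhs = sum(x * v / -math.expm1(-c * v) for x, v in zip(n_positive, volumes))
        return lhs - target

    lo, hi = 1e-9, 1e9  # assumed search bracket for the concentration
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # bisection on a logarithmic scale
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

The estimate is itself a real-valued statistic derived from tube counts, which is the distinction the abstract draws between MPN data and direct concentration measurements.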
The global impact distribution of Near-Earth objects

NASA Astrophysics Data System (ADS)

Rumpf, Clemens; Lewis, Hugh G.; Atkinson, Peter M.

2016-02-01

Asteroids that could collide with the Earth are listed on the publicly available Near-Earth object (NEO) hazard web sites maintained by the National Aeronautics and Space Administration (NASA) and the European Space Agency (ESA). The impact probability distributions of 69 potentially threatening NEOs from these lists, which produce 261 dynamically distinct impact instances, or Virtual Impactors (VIs), were calculated using the Asteroid Risk Mitigation and Optimization Research (ARMOR) tool in conjunction with OrbFit. ARMOR projected the impact probability of each VI onto the surface of the Earth as a spatial probability distribution. The projection considers orbit solution accuracy and the global impact probability. The method of ARMOR is introduced and the tool is validated against two asteroid-Earth collision cases with objects 2008 TC3 and 2014 AA. In the analysis, the natural distribution of impact corridors is contrasted against the impact probability distribution to evaluate the distributions' conformity with the uniform impact distribution assumption. The distribution of impact corridors is based on the NEO population and orbital mechanics. The analysis shows that the distribution of impact corridors matches the common assumption of uniform impact distribution and the result extends the evidence base for the uniform assumption from qualitative analysis of historic impact events into the future in a quantitative way. This finding is confirmed in a parallel analysis of impact points belonging to a synthetic population of 10,006 VIs. Taking into account the impact probabilities introduced significant variation into the results and the impact probability distribution, consequently, deviates markedly from uniformity. The concept of impact probabilities is a product of the asteroid observation and orbit determination technique and, thus, represents a man-made component that is largely disconnected from natural processes. It is important to consider impact probabilities because such information represents the best estimate of where an impact might occur.

Surveillance system and method having an adaptive sequential probability fault detection test

NASA Technical Reports Server (NTRS)

Herzog, James P. (Inventor); Bickford, Randall L. (Inventor)

2005-01-01

System and method providing surveillance of an asset such as a process and/or apparatus by providing training and surveillance procedures that numerically fit a probability density function to an observed residual error signal distribution that is correlative to normal asset operation and then utilizes the fitted probability density function in a dynamic statistical hypothesis test for providing improved asset surveillance.
Surveillance system and method having an adaptive sequential probability fault detection test

NASA Technical Reports Server (NTRS)

Bickford, Randall L. (Inventor); Herzog, James P. (Inventor)

2006-01-01

System and method providing surveillance of an asset such as a process and/or apparatus by providing training and surveillance procedures that numerically fit a probability density function to an observed residual error signal distribution that is correlative to normal asset operation and then utilizes the fitted probability density function in a dynamic statistical hypothesis test for providing improved asset surveillance.
Surveillance System and Method having an Adaptive Sequential Probability Fault Detection Test

NASA Technical Reports Server (NTRS)

Bickford, Randall L. (Inventor); Herzog, James P. (Inventor)

2008-01-01

System and method providing surveillance of an asset such as a process and/or apparatus by providing training and surveillance procedures that numerically fit a probability density function to an observed residual error signal distribution that is correlative to normal asset operation and then utilizes the fitted probability density function in a dynamic statistical hypothesis test for providing improved asset surveillance.
Estimation of the lower and upper bounds on the probability of failure using subset simulation and random set theory

NASA Astrophysics Data System (ADS)

Alvarez, Diego A.; Uribe, Felipe; Hurtado, Jorge E.

2018-02-01

Random set theory is a general framework which comprises uncertainty in the form of probability boxes, possibility distributions, cumulative distribution functions, Dempster-Shafer structures or intervals; in addition, the dependence between the input variables can be expressed using copulas. In this paper, the lower and upper bounds on the probability of failure are calculated by means of random set theory. In order to accelerate the calculation, a well-known and efficient probability-based reliability method known as subset simulation is employed. This method is especially useful for finding small failure probabilities in both low- and high-dimensional spaces, disjoint failure domains and nonlinear limit state functions. The proposed methodology represents a drastic reduction of the computational labor implied by plain Monte Carlo simulation for problems defined with a mixture of representations for the input variables, while delivering similar results. Numerical examples illustrate the efficiency of the proposed approach.
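A rough sketch of how interval-valued inputs lead to lower and upper bounds on the probability of failure is given below. It uses plain Monte Carlo sampling rather than subset simulation, handles a single input variable, and assumes a monotone limit state function g (failure when g <= 0) so that its extremes over an interval occur at the endpoints. All names are hypothetical.

```python
import random

def failure_probability_bounds(g, sample_interval, n_samples, seed=0):
    """Monte Carlo bounds on P(failure) for an interval-valued random input.

    g               -- monotone limit state function; failure when g(x) <= 0
    sample_interval -- callable taking a random.Random and returning (lo, hi)
    """
    rng = random.Random(seed)
    certain = 0   # intervals lying entirely inside the failure domain
    possible = 0  # intervals that touch the failure domain
    for _ in range(n_samples):
        lo, hi = sample_interval(rng)
        g_min, g_max = sorted((g(lo), g(hi)))  # endpoint check: valid for monotone g only
        if g_max <= 0.0:
            certain += 1
        if g_min <= 0.0:
            possible += 1
    return certain / n_samples, possible / n_samples

def load_interval(rng):
    """Illustrative input: a load known only to within +/- 0.5 of a normal draw."""
    centre = rng.gauss(10.0, 2.0)
    return centre - 0.5, centre + 0.5
```

For instance, failure_probability_bounds(lambda x: 14.0 - x, load_interval, 20000) returns a (lower, upper) pair for a capacity of 14. The gap between the two bounds reflects the interval width, and estimating such small probabilities by plain sampling is the cost that subset simulation is meant to reduce.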
On the inequivalence of the CH and CHSH inequalities due to finite statistics

NASA Astrophysics Data System (ADS)

Renou, M. O.; Rosset, D.; Martin, A.; Gisin, N.

2017-06-01

Different variants of a Bell inequality, such as CHSH and CH, are known to be equivalent when evaluated on nonsignaling outcome probability distributions. However, in experimental setups, the outcome probability distributions are estimated using a finite number of samples. Therefore the nonsignaling conditions are only approximately satisfied and the robustness of the violation depends on the chosen inequality variant. We explain that phenomenon using the decomposition of the space of outcome probability distributions under the action of the symmetry group of the scenario, and propose a method to optimize the statistical robustness of a Bell inequality. In the process, we describe the finite group composed of relabeling of parties, measurement settings and outcomes, and identify correspondences between the irreducible representations of this group and properties of outcome probability distributions such as normalization, signaling or having uniform marginals.
Does Breast Cancer Drive the Building of Survival Probability Models among States? An Assessment of Goodness of Fit for Patient Data from SEER Registries

PubMed

Khan, Hafiz; Saxena, Anshul; Perisetti, Abhilash; Rafiq, Aamrin; Gabbidon, Kemesha; Mende, Sarah; Lyuksyutova, Maria; Quesada, Kandi; Blakely, Summre; Torres, Tiffany; Afesse, Mahlet

2016-12-01

Background: Breast cancer is a worldwide public health concern and is the most prevalent type of cancer in women in the United States. This study concerned the best fit of statistical probability models on the basis of survival times for nine state cancer registries: California, Connecticut, Georgia, Hawaii, Iowa, Michigan, New Mexico, Utah, and Washington. Materials and Methods: A probability random sampling method was applied to select and extract records of 2,000 breast cancer patients from the Surveillance Epidemiology and End Results (SEER) database for each of the nine state cancer registries used in this study. EasyFit software was utilized to identify the best probability models by using goodness of fit tests, and to estimate parameters for various statistical probability distributions that fit survival data. Results: Statistical analysis for the summary of statistics is reported for each of the states for the years 1973 to 2012. Kolmogorov-Smirnov, Anderson-Darling, and Chi-squared goodness of fit test values were used for survival data, the highest values of goodness of fit statistics being considered indicative of the best fit survival model for each state. Conclusions: It was found that California, Connecticut, Georgia, Iowa, New Mexico, and Washington followed the Burr probability distribution, while the Dagum probability distribution gave the best fit for Michigan and Utah, and Hawaii followed the Gamma probability distribution. These findings highlight differences between states through selected sociodemographic variables and also demonstrate probability modeling differences in breast cancer survival times. The results of this study can be used to guide healthcare providers and researchers for further investigations into social and environmental factors in order to reduce the occurrence of and mortality due to breast cancer.
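Of the goodness-of-fit measures named above, the Kolmogorov-Smirnov statistic is the simplest to state: it is the largest vertical gap between the empirical distribution function of the sample and the candidate cumulative distribution function. The sketch below is a generic implementation, not the EasyFit procedure; ks_statistic is a hypothetical name.

```python
def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov distance between a sample and a candidate CDF.

    sample -- iterable of observations (e.g. survival times)
    cdf    -- callable returning the candidate distribution's CDF at x
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d
```

Candidate distributions such as Burr, Dagum or Gamma could be compared by passing each fitted CDF to this function along with the same sample.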
A dynamic programming approach to estimate the capacity value of energy storage

DOE PAGES

Sioshansi, Ramteen; Madaeni, Seyed Hossein; Denholm, Paul

2013-09-17

Here, we present a method to estimate the capacity value of storage. Our method uses a dynamic program to model the effect of power system outages on the operation and state of charge of storage in subsequent periods. We combine the optimized dispatch from the dynamic program with estimated system loss of load probabilities to compute a probability distribution for the state of charge of storage in each period. This probability distribution can be used as a forced outage rate for storage in standard reliability-based capacity value estimation methods. Our proposed method has the advantage over existing approximations that it explicitly captures the effect of system shortage events on the state of charge of storage in subsequent periods. We also use a numerical case study, based on five utility systems in the U.S., to demonstrate our technique and compare it to existing approximation methods.
Tsunami Size Distributions at Far-Field Locations from Aggregated Earthquake Sources

NASA Astrophysics Data System (ADS)

Geist, E. L.; Parsons, T.

2015-12-01

The distribution of tsunami amplitudes at far-field tide gauge stations is explained by aggregating the probability of tsunamis derived from individual subduction zones and scaled by their seismic moment. The observed tsunami amplitude distributions of both continental (e.g., San Francisco) and island (e.g., Hilo) stations distant from subduction zones are examined. Although the observed probability distributions nominally follow a Pareto (power-law) distribution, there are significant deviations. Some stations exhibit varying degrees of tapering of the distribution at high amplitudes and, in the case of the Hilo station, there is a prominent break in slope on log-log probability plots. There are also differences in the slopes of the observed distributions among stations that can be significant. To explain these differences we first estimate seismic moment distributions of observed earthquakes for major subduction zones. Second, regression models are developed that relate the tsunami amplitude at a station to seismic moment at a subduction zone, correcting for epicentral distance. The seismic moment distribution is then transformed to a site-specific tsunami amplitude distribution using the regression model. Finally, a mixture distribution is developed, aggregating the transformed tsunami distributions from all relevant subduction zones. This mixture distribution is compared to the observed distribution to assess the performance of the method described above. This method allows us to estimate the largest tsunami that can be expected in a given time period at a station.
A tool for simulating collision probabilities of animals with marine renewable energy devices.

PubMed

Schmitt, Pál; Culloch, Ross; Lieber, Lilian; Molander, Sverker; Hammar, Linus; Kregting, Louise

2017-01-01

The mathematical problem of establishing a collision probability distribution is often not trivial. The shape and motion of the animal as well as of the device must be evaluated in a four-dimensional space (3D motion over time). Earlier work on wind and tidal turbines was limited to a simplified two-dimensional representation, which cannot be applied to many new structures. We present a numerical algorithm to obtain such probability distributions using transient, three-dimensional numerical simulations. The method is demonstrated using a sub-surface tidal kite as an example. Necessary pre- and post-processing of the data created by the model is explained, and numerical details, potential issues and limitations in the application of the resulting probability distributions are highlighted.
Specifying the Probability Characteristics of Funnel Plot Control Limits: An Investigation of Three Approaches

PubMed Central

Manktelow, Bradley N.; Seaton, Sarah E.

2012-01-01

Background: Emphasis is increasingly being placed on the monitoring and comparison of clinical outcomes between healthcare providers. Funnel plots have become a standard graphical methodology to identify outliers and comprise plotting an outcome summary statistic from each provider against a specified ‘target’ together with upper and lower control limits. With discrete probability distributions it is not possible to specify the exact probability that an observation from an ‘in-control’ provider will fall outside the control limits. However, general probability characteristics can be set and specified using interpolation methods. Guidelines recommend that providers falling outside such control limits should be investigated, potentially with significant consequences, so it is important that the properties of the limits are understood. Methods: Control limits for funnel plots for the Standardised Mortality Ratio (SMR) based on the Poisson distribution were calculated using three proposed interpolation methods and the probability calculated of an ‘in-control’ provider falling outside of the limits. Examples using published data were shown to demonstrate the potential differences in the identification of outliers. Results: The first interpolation method ensured that the probability of an observation of an ‘in control’ provider falling outside either limit was always less than a specified nominal probability (p). The second method resulted in such an observation falling outside either limit with a probability that could be either greater or less than p, depending on the expected number of events. The third method led to a probability that was always greater than, or equal to, p. Conclusion: The use of different interpolation methods can lead to differences in the identification of outliers. This is particularly important when the expected number of events is small. We recommend that users of these methods be aware of the differences, and specify which interpolation method is to be used prior to any analysis. PMID:23029202
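The discreteness problem described in this abstract can be made concrete with a small sketch. It is not one of the three interpolation methods studied; it only shows that, for a Poisson count, the tail probability at the nearest integer limit generally misses the nominal p. The function names and the example values (expected counts of 2, 5 and 20 and p = 0.025) are illustrative assumptions, not taken from the paper.

```python
from math import exp

def poisson_sf(k, mu):
    """Return P(X > k) for X ~ Poisson(mu), by summing the pmf from 0 to k."""
    term = exp(-mu)          # P(X = 0)
    cdf = term
    for i in range(1, k + 1):
        term *= mu / i       # P(X = i) from P(X = i - 1)
        cdf += term
    return 1.0 - cdf

def upper_limit(mu, p):
    """Smallest integer k with P(X > k) <= p, plus the tail probability actually achieved."""
    k = 0
    while poisson_sf(k, mu) > p:
        k += 1
    return k, poisson_sf(k, mu)

if __name__ == "__main__":
    nominal_p = 0.025        # assumed nominal one-sided probability
    for expected in (2.0, 5.0, 20.0):
        k, tail = upper_limit(expected, nominal_p)
        # The SMR control limit is the observed-to-expected ratio at the count limit.
        print(f"expected={expected:5.1f}  count limit={k:3d}  "
              f"SMR limit={k / expected:.3f}  actual tail={tail:.4f}")
```

The gap between the achieved tail probability and the nominal value is what an interpolation rule has to resolve, and it is largest when the expected number of events is small.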
Modeling the probability distribution of peak discharge for infiltrating hillslopes

NASA Astrophysics Data System (ADS)

Baiamonte, Giorgio; Singh, Vijay P.

2017-07-01

Hillslope response plays a fundamental role in the prediction of peak discharge at the basin outlet. The peak discharge for the critical duration of rainfall and its probability distribution are needed for designing urban infrastructure facilities. This study derives the probability distribution, denoted as GABS model, by coupling three models: (1) the Green-Ampt model for computing infiltration, (2) the kinematic wave model for computing discharge hydrograph from the hillslope, and (3) the intensity-duration-frequency (IDF) model for computing design rainfall intensity. The Hortonian mechanism for runoff generation is employed for computing the surface runoff hydrograph. Since the antecedent soil moisture condition (ASMC) significantly affects the rate of infiltration, its effect on the probability distribution of peak discharge is investigated. Application to a watershed in Sicily, Italy, shows that with the increase of probability, the expected effect of ASMC to increase the maximum discharge diminishes. Only for low values of probability, the critical duration of rainfall is influenced by ASMC, whereas its effect on the peak discharge seems to be less for any probability. For a set of parameters, the derived probability distribution of peak discharge seems to be fitted by the gamma distribution well. Finally, an application to a small watershed, with the aim to test the possibility to arrange in advance the rational runoff coefficient tables to be used for the rational method, and a comparison between peak discharges obtained by the GABS model with those measured in an experimental flume for a loamy-sand soil were carried out.
Nuclear risk analysis of the Ulysses mission

NASA Astrophysics Data System (ADS)

Bartram, Bart W.; Vaughan, Frank R.; Englehart, Richard W., Dr.

1991-01-01

The use of a radioisotope thermoelectric generator fueled with plutonium-238 dioxide on the Space Shuttle-launched Ulysses mission implies some level of risk due to potential accidents. This paper describes the method used to quantify risks in the Ulysses mission Final Safety Analysis Report prepared for the U.S. Department of Energy. The analysis described herein starts from the source term probability distributions supplied by the General Electric Company. A Monte Carlo technique is used to develop probability distributions of radiological consequences for a range of accident scenarios throughout the mission. Factors affecting radiological consequences are identified, the probability distribution of the effect of each factor determined, and the functional relationship among all the factors established. The probability distributions of all the factor effects are then combined using a Monte Carlo technique. The results of the analysis are presented in terms of complementary cumulative distribution functions (CCDF) by mission sub-phase, phase, and the overall mission. The CCDFs show the total probability that consequences (calculated health effects) would be equal to or greater than a given value.
Ensemble Kalman filtering in presence of inequality constraints

NASA Astrophysics Data System (ADS)

van Leeuwen, P. J.

2009-04-01

Kalman filtering in the presence of constraints is an active area of research. Because of the Gaussian assumption for the probability density functions, it looks hard to bring extra constraints into the formalism. On the other hand, in geophysical systems we often encounter constraints related to, e.g., the underlying physics or chemistry, which are violated by the Gaussian assumption. For instance, concentrations are always non-negative, model layers have non-negative thickness, and sea-ice concentration is between 0 and 1. Several methods to bring inequality constraints into the Kalman-filter formalism have been proposed. One of them is probability density function (pdf) truncation, in which the Gaussian mass from the non-allowed part of the variables is distributed equally over the pdf where the variables are allowed, as proposed by Shimada et al. 1998. However, a problem with this method is that the probability that, e.g., the sea-ice concentration is exactly zero is itself zero. The new method proposed here does not have this drawback. It assumes that the probability density function is a truncated Gaussian, but the truncated mass is not distributed equally over all allowed values of the variables; instead it is put into a delta distribution at the truncation point. This delta distribution can easily be handled within Bayes' theorem, leading to posterior probability density functions that are also truncated Gaussians with delta distributions at the truncation location. In this way a much better representation of the system is obtained, while still keeping most of the benefits of the Kalman-filter formalism. In the full Kalman filter the formalism is prohibitively expensive in large-scale systems, but efficient implementation is possible in ensemble variants of the Kalman filter. Applications to low-dimensional systems and large-scale systems will be discussed.
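A minimal sketch of the prior described in this abstract, a Gaussian whose disallowed mass is collected in a point mass at the truncation point, is given below for a single variable with one lower bound. It is not the filter update itself. The function names, the bound at zero and the example mean and standard deviation are assumptions made for illustration.

```python
import random
from statistics import NormalDist

def mass_at_bound(mean, std, lower=0.0):
    """Probability placed in the delta at the bound: the Gaussian mass below `lower`."""
    return NormalDist(mean, std).cdf(lower)

def sample_truncated_with_delta(mean, std, lower=0.0, rng=random):
    """Draw from the Gaussian and move any draw below the bound onto the bound.

    Clipping is equivalent to putting all of the truncated mass into a
    delta distribution located at the truncation point.
    """
    return max(lower, rng.gauss(mean, std))

if __name__ == "__main__":
    rng = random.Random(42)
    mean, std = 0.1, 0.2     # illustrative values, e.g. a sea-ice concentration near zero
    draws = [sample_truncated_with_delta(mean, std, rng=rng) for _ in range(100_000)]
    empirical = sum(d == 0.0 for d in draws) / len(draws)
    print(f"analytic mass at zero : {mass_at_bound(mean, std):.4f}")
    print(f"empirical mass at zero: {empirical:.4f}")
```

Unlike redistributing the truncated mass over the allowed range, this representation assigns a finite probability to the variable sitting exactly at the bound.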
Noise deconvolution based on the L1-metric and decomposition of discrete distributions of postsynaptic responses.

PubMed

Astrelin, A V; Sokolov, M V; Behnisch, T; Reymann, K G; Voronin, L L

1997-04-25

A statistical approach to analysis of amplitude fluctuations of postsynaptic responses is described. This includes (1) using a L1-metric in the space of distribution functions for minimisation with application of linear programming methods to decompose amplitude distributions into a convolution of Gaussian and discrete distributions; (2) deconvolution of the resulting discrete distribution with determination of the release probabilities and the quantal amplitude for cases with a small number (< 5) of discrete components. The methods were tested against simulated data over a range of sample sizes and signal-to-noise ratios which mimicked those observed in physiological experiments. In computer simulation experiments, comparisons were made with other methods of 'unconstrained' (generalized) and constrained reconstruction of discrete components from convolutions. The simulation results provided additional criteria for improving the solutions to overcome 'over-fitting phenomena' and to constrain the number of components with small probabilities. Application of the programme to recordings from hippocampal neurones demonstrated its usefulness for the analysis of amplitude distributions of postsynaptic responses.
Development and application of a probability distribution retrieval scheme to the remote sensing of clouds and precipitation

NASA Astrophysics Data System (ADS)

McKague, Darren Shawn

2001-12-01

The statistical properties of clouds and precipitation on a global scale are important to our understanding of climate. Inversion methods exist to retrieve the needed cloud and precipitation properties from satellite data pixel-by-pixel that can then be summarized over large data sets to obtain the desired statistics. These methods can be quite computationally expensive, and typically don't provide errors on the statistics. A new method is developed to directly retrieve probability distributions of parameters from the distribution of measured radiances. The method also provides estimates of the errors on the retrieved distributions. The method can retrieve joint distributions of parameters, which allows for the study of the connection between parameters. A forward radiative transfer model creates a mapping from retrieval parameter space to radiance space. A Monte Carlo procedure uses the mapping to transform probability density from the observed radiance histogram to a two-dimensional retrieval property probability distribution function (PDF). An estimate of the uncertainty in the retrieved PDF is calculated from random realizations of the radiance to retrieval parameter PDF transformation given the uncertainty of the observed radiances, the radiance PDF, the forward radiative transfer, the finite number of prior state vectors, and the non-unique mapping to retrieval parameter space. The retrieval method is also applied to the remote sensing of precipitation from SSM/I microwave data. A method of stochastically generating hydrometeor fields based on the fields from a numerical cloud model is used to create the precipitation parameter radiance space transformation. Vertical and horizontal variability within the hydrometeor fields has a significant impact on algorithm performance. Beamfilling factors are computed from the simulated hydrometeor fields. The beamfilling factors vary considerably depending upon the horizontal structure of the rain. 
The algorithm is applied to SSM/I images from the eastern tropical Pacific and is compared to PDFs of rain rate computed using pixel-by-pixel retrievals from Wilheit and from Liu and Curry. Differences exist between the three methods, but good general agreement is seen between the PDF retrieval algorithm and the algorithm of Liu and Curry. (Abstract shortened by UMI.)
Optimizing Probability of Detection Point Estimate Demonstration

NASA Technical Reports Server (NTRS)

Koshti, Ajay M.

2017-01-01

Probability of detection (POD) analysis is used in assessing reliably detectable flaw size in nondestructive evaluation (NDE). MIL-HDBK-1823 and the associated mh1823 POD software give the most common methods of POD analysis. Real flaws such as cracks and crack-like flaws are desired to be detected using these NDE methods. A reliably detectable crack size is required for safe life analysis of fracture critical parts. The paper provides discussion on optimizing probability of detection (POD) demonstration experiments using the point estimate method. The POD point estimate method is used by NASA for qualifying special NDE procedures. The point estimate method uses the binomial distribution for probability density. Normally, a set of 29 flaws of the same size within some tolerance is used in the demonstration. The optimization is performed to provide an acceptable value for the probability of passing demonstration (PPD) and to achieve an acceptable value for the probability of false (POF) calls while keeping the flaw sizes in the set as small as possible.
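The binomial reasoning behind a 29-flaw demonstration can be sketched as follows. The pass criterion used here (all 29 flaws detected, zero misses) is an assumption based on the commonly cited 29-of-29 rule and is not stated in the abstract; the function name and the list of true POD values are likewise illustrative.

```python
from math import comb

def prob_pass(true_pod, n_flaws=29, max_misses=0):
    """Probability of passing a demonstration (PPD): at most `max_misses`
    missed flaws in `n_flaws` independent inspections with hit probability `true_pod`."""
    return sum(
        comb(n_flaws, k) * (1.0 - true_pod) ** k * true_pod ** (n_flaws - k)
        for k in range(max_misses + 1)
    )

if __name__ == "__main__":
    # A procedure whose true POD is only 0.90 passes 29 of 29 less than 5% of the time,
    # which is why a pass is read as 90% POD demonstrated with 95% confidence.
    for pod in (0.90, 0.95, 0.98, 0.99):
        print(f"true POD = {pod:.2f}   PPD = {prob_pass(pod):.3f}")
```

The same function shows the trade-off mentioned in the abstract: a good procedure with a true POD only slightly above 0.90 still has a modest chance of passing, so the flaw size in the set has to be chosen with the PPD in mind.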
Comparing the ISO-recommended and the cumulative data-reduction algorithms in S-on-1 laser damage test by a reverse approach method

NASA Astrophysics Data System (ADS)

Zorila, Alexandru; Stratan, Aurel; Nemes, George

2018-01-01

We compare the ISO-recommended (the standard) data-reduction algorithm used to determine the surface laser-induced damage threshold of optical materials by the S-on-1 test with two newly suggested algorithms, both named "cumulative" algorithms/methods, a regular one and a limit-case one, intended to perform in some respects better than the standard one. To avoid additional errors due to real experiments, a simulated test is performed, named the reverse approach. This approach simulates the real damage experiments, by generating artificial test-data of damaged and non-damaged sites, based on an assumed, known damage threshold fluence of the target and on a given probability distribution function to induce the damage. In this work, a database of 12 sets of test-data containing both damaged and non-damaged sites was generated by using four different reverse techniques and by assuming three specific damage probability distribution functions. The same value for the threshold fluence was assumed, and a Gaussian fluence distribution on each irradiated site was considered, as usual for the S-on-1 test. Each of the test-data was independently processed by the standard and by the two cumulative data-reduction algorithms, the resulting fitted probability distributions were compared with the initially assumed probability distribution functions, and the quantities used to compare these algorithms were determined. These quantities characterize the accuracy and the precision in determining the damage threshold and the goodness of fit of the damage probability curves. The results indicate that the accuracy in determining the absolute damage threshold is best for the ISO-recommended method, the precision is best for the limit-case of the cumulative method, and the goodness of fit estimator (adjusted R-squared) is almost the same for all three algorithms.
Dose-volume histogram prediction using density estimation.

PubMed

Skarpman Munter, Johanna; Sjölund, Jens

2015-09-07

Knowledge of what dose-volume histograms can be expected for a previously unseen patient could increase consistency and quality in radiotherapy treatment planning. We propose a machine learning method that uses previous treatment plans to predict such dose-volume histograms. The key to the approach is the framing of dose-volume histograms in a probabilistic setting. The training consists of estimating, from the patients in the training set, the joint probability distribution of some predictive features and the dose. The joint distribution immediately provides an estimate of the conditional probability of the dose given the values of the predictive features. The prediction consists of estimating, from the new patient, the distribution of the predictive features and marginalizing the conditional probability from the training over this. Integrating the resulting probability distribution for the dose yields an estimate of the dose-volume histogram. To illustrate how the proposed method relates to previously proposed methods, we use the signed distance to the target boundary as a single predictive feature. As a proof-of-concept, we predicted dose-volume histograms for the brainstems of 22 acoustic schwannoma patients treated with stereotactic radiosurgery, and for the lungs of 9 lung cancer patients treated with stereotactic body radiation therapy. Comparing with two previous attempts at dose-volume histogram prediction we find that, given the same input data, the predictions are similar. In summary, we propose a method for dose-volume histogram prediction that exploits the intrinsic probabilistic properties of dose-volume histograms. We argue that the proposed method makes up for some deficiencies in previously proposed methods, thereby potentially increasing ease of use, flexibility and ability to perform well with small amounts of training data.
Net present value probability distributions from decline curve reserves estimates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Simpson, D.E.; Huffman, C.H.; Thompson, R.S.

1995-12-31

This paper demonstrates how reserves probability distributions can be used to develop net present value (NPV) distributions. NPV probability distributions were developed from the rate and reserves distributions presented in SPE 28333. This real data study used practicing engineers' evaluations of production histories. Two approaches were examined to quantify portfolio risk. The first approach, the NPV Relative Risk Plot, compares the mean NPV with the NPV relative risk ratio for the portfolio. The relative risk ratio is the NPV standard deviation (σ) divided by the mean (μ) NPV. The second approach, a Risk-Return Plot, is a plot of the μ discounted cash flow rate of return (DCFROR) versus the σ for the DCFROR distribution. This plot provides a risk-return relationship for comparing various portfolios. These methods may help evaluate property acquisition and divestiture alternatives and assess the relative risk of a suite of wells or fields for bank loans.
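The relative risk ratio defined in this abstract is the standard deviation of NPV divided by its mean. A small sketch of that arithmetic follows. The cash-flow scenarios, the 10% discount rate and the use of the sample (rather than population) standard deviation are all illustrative assumptions, not data or choices from the paper.

```python
from statistics import mean, stdev

def npv(cash_flows, rate):
    """Net present value of yearly cash flows, with the first entry at time zero."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))

def relative_risk_ratio(npv_samples):
    """Standard deviation of NPV divided by mean NPV (sample standard deviation assumed)."""
    return stdev(npv_samples) / mean(npv_samples)

if __name__ == "__main__":
    # Hypothetical scenarios for one property: an investment followed by declining revenue.
    scenarios = [
        [-100.0, 60.0, 45.0, 30.0],
        [-100.0, 70.0, 50.0, 35.0],
        [-100.0, 50.0, 35.0, 20.0],
    ]
    npvs = [npv(flows, rate=0.10) for flows in scenarios]
    print("NPVs:", [round(v, 2) for v in npvs])
    print("relative risk ratio:", round(relative_risk_ratio(npvs), 3))
```

Plotting the mean NPV against this ratio for several portfolios gives the kind of relative risk plot the abstract describes.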
An efficient multi-objective optimization method for water quality sensor placement within water distribution systems considering contamination probability variations.

PubMed

He, Guilin; Zhang, Tuqiao; Zheng, Feifei; Zhang, Qingzhou

2018-06-20

Water quality security within water distribution systems (WDSs) has been an important issue due to their inherent vulnerability associated with contamination intrusion. This motivates intensive studies to identify optimal water quality sensor placement (WQSP) strategies, aimed at timely and effective detection of (un)intentional intrusion events. However, these available WQSP optimization methods have consistently presumed that each WDS node has an equal contamination probability. While simple to implement, this assumption may not conform to the fact that the nodal contamination probability may vary significantly between regions owing to variations in population density and user properties. Furthermore, low computational efficiency is another important factor that has seriously hampered the practical application of the currently available WQSP optimization approaches. To address these two issues, this paper proposes an efficient multi-objective WQSP optimization method to explicitly account for contamination probability variations. Four different contamination probability functions (CPFs) are proposed to represent the potential variations of nodal contamination probabilities within the WDS. Two real-world WDSs are used to demonstrate the utility of the proposed method. Results show that WQSP strategies can be significantly affected by the choice of the CPF. For example, when the proposed method is applied to the large case study with the CPF accounting for user properties, the event detection probabilities of the resultant solutions are approximately 65%, while these values are around 25% for the traditional approach, and such design solutions are achieved approximately 10,000 times faster than the traditional method. This paper provides an alternative method to identify optimal WQSP solutions for the WDS, and also builds knowledge regarding the impacts of different CPFs on sensor deployments.

Spatial Probability Distribution of Strata's Lithofacies and its Impacts on Land Subsidence in Huairou Emergency Water Resources Region of Beijing

NASA Astrophysics Data System (ADS)

Li, Y.; Gong, H.; Zhu, L.; Guo, L.; Gao, M.; Zhou, C.

2016-12-01

Continuous over-exploitation of groundwater causes dramatic drawdown, and leads to regional land subsidence in the Huairou Emergency Water Resources region, which is located in the up-middle part of the Chaobai river basin of Beijing. Owing to the spatial heterogeneity of strata's lithofacies of the alluvial fan, ground deformation has no significant positive correlation with groundwater drawdown, and one of the challenges ahead is to quantify the spatial distribution of strata's lithofacies. The transition probability geostatistics approach provides potential for characterizing the distribution of heterogeneous lithofacies in the subsurface. By combining the thickness of the clay layer extracted from the simulation with the deformation field acquired from PS-InSAR technology, the influence of strata's lithofacies on land subsidence can be analyzed quantitatively. The strata's lithofacies derived from borehole data were generalized into four categories and their probability distribution in the observed space was mined by using the transition probability geostatistics, of which clay was the predominant compressible material. Geologically plausible realizations of lithofacies distribution were produced, accounting for complex heterogeneity in the alluvial plain. At a particular probability level of more than 40 percent, the volume of clay defined was 55 percent of the total volume of strata's lithofacies. This level, nearly equaling the volume of compressible clay derived from the geostatistics, was thus chosen to represent the boundary between compressible and uncompressible material. The method incorporates statistical geological information, such as distribution proportions, average lengths and juxtaposition tendencies of geological types, mainly derived from borehole data and expert knowledge, into the Markov chain model of transition probability. Some similarities of patterns were indicated between the spatial distribution of the deformation field and the clay layer. 
In the area with roughly similar water table decline, locations in the subsurface having a higher probability for the existence of compressible material occur more than that in the location with a lower probability. Such estimate of spatial probability distribution is useful to analyze the uncertainty of land subsidence.
Probability analysis for consecutive-day maximum rainfall for Tiruchirapalli City (south India, Asia)

NASA Astrophysics Data System (ADS)

Sabarish, R. Mani; Narasimhan, R.; Chandhru, A. R.; Suribabu, C. R.; Sudharsan, J.; Nithiyanantham, S.

2017-05-01

In the design of irrigation and other hydraulic structures, evaluating the magnitude of extreme rainfall for a specific probability of occurrence is of much importance. The capacity of such structures is usually designed to cater to the probability of occurrence of extreme rainfall during its lifetime. In this study, an extreme value analysis of rainfall for Tiruchirapalli City in Tamil Nadu was carried out using 100 years of rainfall data. Statistical methods were used in the analysis. The best-fit probability distribution was evaluated for 1, 2, 3, 4 and 5 days of continuous maximum rainfall. The goodness of fit was evaluated using Chi-square test. The results of the goodness-of-fit tests indicate that log-Pearson type III method is the overall best-fit probability distribution for 1-day maximum rainfall and consecutive 2-, 3-, 4-, 5- and 6-day maximum rainfall series of Tiruchirapalli. To be reliable, the forecasted maximum rainfalls for the selected return periods are evaluated in comparison with the results of the plotting position.
Methods for combining payload parameter variations with input environment. [calculating design limit loads compatible with probabilistic structural design criteria

NASA Technical Reports Server (NTRS)

Merchant, D. H.

1976-01-01

Methods are presented for calculating design limit loads compatible with probabilistic structural design criteria. The approach is based on the concept that the desired limit load, defined as the largest load occurring in a mission, is a random variable having a specific probability distribution which may be determined from extreme-value theory. The design limit load, defined as a particular value of this random limit load, is the value conventionally used in structural design. Methods are presented for determining the limit load probability distributions from both time-domain and frequency-domain dynamic load simulations. Numerical demonstrations of the method are also presented.
Estimating probable flaw distributions in PWR steam generator tubes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gorman, J.A.; Turner, A.P.L.

1997-02-01

This paper describes methods for estimating the number and size distributions of flaws of various types in PWR steam generator tubes. These estimates are needed when calculating the probable primary to secondary leakage through steam generator tubes under postulated accidents such as severe core accidents and steam line breaks. The paper describes methods for two types of predictions: (1) the numbers of tubes with detectable flaws of various types as a function of time, and (2) the distributions in size of these flaws. Results are provided for hypothetical severely affected, moderately affected and lightly affected units. Discussion is provided regarding uncertainties and assumptions in the data and analyses.
Methods and results of peak-flow frequency analyses for streamgages in and bordering Minnesota, through water year 2011

USGS Publications Warehouse

Kessler, Erich W.; Lorenz, David L.; Sanocki, Christopher A.

2013-01-01

Peak-flow frequency analyses were completed for 409 streamgages in and bordering Minnesota having at least 10 systematic peak flows through water year 2011. Selected annual exceedance probabilities were determined by fitting a log-Pearson type III probability distribution to the recorded annual peak flows. A detailed explanation of the methods that were used to determine the annual exceedance probabilities, the historical period, acceptable low outliers, and analysis method for each streamgage are presented. The final results of the analyses are presented.
Theoretical cratering rates on Ida, Mathilde, Eros and Gaspra

NASA Astrophysics Data System (ADS)

Jeffers, S. V.; Asher, D. J.; Bailey, M. E.

2002-11-01

We investigate the main influences on crater size distributions, by deriving results for the four example target objects, (951) Gaspra, (243) Ida, (253) Mathilde and (433) Eros. The dynamical history of each of these asteroids is modelled using the MERCURY (Chambers 1999) numerical integrator. The use of an efficient, Öpik-type, collision code enables the calculation of a velocity histogram and the probability of impact. This when combined with a crater scaling law and an impactor size distribution, through a Monte Carlo method, results in a crater size distribution. The resulting crater probability distributions are in good agreement with observed crater distributions on these asteroids.
Polynomial chaos representation of databases on manifolds

DOE Office of Scientific and Technical Information (OSTI.GOV)

Soize, C., E-mail: christian.soize@univ-paris-est.fr; Ghanem, R., E-mail: ghanem@usc.edu

2017-04-15

Characterizing the polynomial chaos expansion (PCE) of a vector-valued random variable with probability distribution concentrated on a manifold is a relevant problem in data-driven settings. The probability distribution of such random vectors is multimodal in general, leading to potentially very slow convergence of the PCE. In this paper, we build on a recent development for estimating and sampling from probabilities concentrated on a diffusion manifold. The proposed methodology constructs a PCE of the random vector together with an associated generator that samples from the target probability distribution which is estimated from data concentrated in the neighborhood of the manifold. The method is robust and remains efficient for high dimension and large datasets. The resulting polynomial chaos construction on manifolds permits the adaptation of many uncertainty quantification and statistical tools to emerging questions motivated by data-driven queries.
Convergence of Transition Probability Matrix in CLVMarkov Models

NASA Astrophysics Data System (ADS)

Permana, D.; Pasaribu, U. S.; Indratno, S. W.; Suprayogi, S.

2018-04-01

A transition probability matrix is an arrangement of the transition probabilities from one state to another in a Markov chain model (MCM). One interesting topic of study for the MCM is its long-run behavior. This behavior is derived from a property of the n-step transition probability matrix, namely its convergence as n tends to infinity. Mathematically, finding the convergence of the transition probability matrix means finding the limit of the transition matrix raised to the power n as n tends to infinity. The convergent form of the transition probability matrix is of interest because it brings the matrix to its stationary form. This form is useful for predicting the probability of transitions between states in the future. The method usually used to find the convergence of the transition probability matrix is the limiting distribution. In this paper, the convergence of the transition probability matrix is found using a simple concept from linear algebra, namely diagonalizing the matrix. This method has a higher level of complexity because the matrix must be diagonalized, but it has the advantage of yielding a general form for the n-th power of the transition probability matrix. This form is useful for examining the transition matrix before it becomes stationary. Example cases are taken from a CLV model using MCM, called the CLV-Markov model. Several models are examined through their transition probability matrices to find their convergent forms. The result is that the convergence of the transition probability matrix obtained through diagonalization agrees with the convergence obtained by the commonly used limiting distribution method.
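The diagonalization argument can be illustrated on the smallest possible case. The sketch below uses a generic two-state chain with transition probabilities a (state 0 to 1) and b (state 1 to 0), which is an assumption for illustration and not one of the CLV-Markov models from the paper. The eigenvalues of such a matrix are 1 and 1 - a - b, which gives a closed form for the n-th power that can be checked against repeated multiplication.

```python
def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def power_by_multiplication(P, n):
    """P**n by repeated multiplication, used only as a reference."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = matmul2(result, P)
    return result

def power_by_diagonalization(a, b, n):
    """Closed form of P**n for P = [[1-a, a], [b, 1-b]] with 0 < a + b < 2.

    Diagonalizing P gives eigenvalues 1 and (1 - a - b); the second one
    decays geometrically, leaving the stationary rows (b, a) / (a + b).
    """
    s = a + b
    lam = (1.0 - s) ** n
    return [[(b + a * lam) / s, (a - a * lam) / s],
            [(b - b * lam) / s, (a + b * lam) / s]]

if __name__ == "__main__":
    a, b = 0.2, 0.1          # illustrative transition probabilities
    P = [[1 - a, a], [b, 1 - b]]
    for n in (1, 5, 50):
        print(n, power_by_diagonalization(a, b, n), power_by_multiplication(P, n))
```

The closed form makes both regimes visible: for small n the term in (1 - a - b)^n shows the matrix before it becomes stationary, and as n grows both rows approach the same stationary distribution.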
Competing risk models in reliability systems, an exponential distribution model with Bayesian analysis approach

NASA Astrophysics Data System (ADS)

Iskandar, I.

2018-03-01

The exponential distribution is the most widely used distribution in reliability analysis. It is well suited to representing lifetimes in many cases and has a simple statistical form. Its defining characteristic is a constant hazard rate. The exponential distribution is a special case of the Weibull distribution. In this paper we introduce the basic notions that constitute an exponential competing risks model in reliability analysis using a Bayesian analysis approach, and we present the associated analytic methods. The cases are limited to models with independent causes of failure. A non-informative prior distribution is used in our analysis. The model description covers the likelihood function, followed by the posterior function and the point and interval estimates, the hazard function, and the reliability. The net probability of failure if only one specific risk is present, the crude probability of failure due to a specific risk in the presence of other causes, and partial crude probabilities are also included.
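The quantities named in this abstract can be sketched for a toy data set as follows. This is a minimal illustration under stated assumptions, not the paper's derivation: the failure times and causes are hypothetical, there is no censoring, and the non-informative prior is assumed to be proportional to 1/lambda_j for each cause, which gives a Gamma(d_j, rate = T) posterior for each rate (d_j failures from cause j, T total time on test).

```python
import numpy as np

# Hypothetical data: failure time and cause (0 or 1) of each failed unit.
times = np.array([12.0, 30.0, 45.0, 7.0, 60.0, 22.0, 18.0, 51.0])
causes = np.array([0, 1, 0, 0, 1, 0, 1, 0])
n_causes = 2

total_time = times.sum()                              # T, total time on test
failures = np.bincount(causes, minlength=n_causes)    # d_j, failures per cause

# Assumed prior pi(lambda_j) proportional to 1/lambda_j:
# posterior of lambda_j is Gamma(shape=d_j, rate=T), with mean d_j / T.
rate_mean = failures / total_time

# Equal-tailed 95% posterior intervals estimated from posterior draws.
rng = np.random.default_rng(0)
rate_draws = rng.gamma(shape=failures, scale=1.0 / total_time,
                       size=(20_000, n_causes))
rate_ci = np.percentile(rate_draws, [2.5, 97.5], axis=0)

def reliability(t, rates):
    """Probability of surviving all causes up to time t."""
    return np.exp(-rates.sum() * t)

def net_failure_prob(t, rates):
    """Failure probability by t if each cause were the only risk present."""
    return 1.0 - np.exp(-rates * t)

def crude_failure_prob(t, rates):
    """Failure probability by t from each cause, with all causes acting."""
    total = rates.sum()
    return rates / total * (1.0 - np.exp(-total * t))

net_30 = net_failure_prob(30.0, rate_mean)
crude_30 = crude_failure_prob(30.0, rate_mean)
```

The crude probabilities sum to one minus the reliability at the same time, which is a convenient consistency check on any implementation.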
High throughput nonparametric probability density estimation.

PubMed

Farmer, Jenny; Jacobs, Donald

2018-01-01

In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample size invariant universal scoring function. Then a probability density estimate is determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under and over fitting data as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic to visualize the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate the method has general applicability for high throughput statistical inference.
High throughput nonparametric probability density estimation

PubMed Central

Farmer, Jenny

2018-01-01

In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample size invariant universal scoring function. Then a probability density estimate is determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under and over fitting data as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic to visualize the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate the method has general applicability for high throughput statistical inference. PMID:29750803
Sampling considerations for disease surveillance in wildlife populations

USGS Publications Warehouse

Nusser, S.M.; Clark, W.R.; Otis, D.L.; Huang, L.

2008-01-01

Disease surveillance in wildlife populations involves detecting the presence of a disease, characterizing its prevalence and spread, and subsequent monitoring. A probability sample of animals selected from the population and corresponding estimators of disease prevalence and detection provide estimates with quantifiable statistical properties, but this approach is rarely used. Although wildlife scientists often assume probability sampling and random disease distributions to calculate sample sizes, convenience samples (i.e., samples of readily available animals) are typically used, and disease distributions are rarely random. We demonstrate how landscape-based simulation can be used to explore properties of estimators from convenience samples in relation to probability samples. We used simulation methods to model what is known about the habitat preferences of the wildlife population, the disease distribution, and the potential biases of the convenience-sample approach. Using chronic wasting disease in free-ranging deer (Odocoileus virginianus) as a simple illustration, we show that using probability sample designs with appropriate estimators provides unbiased surveillance parameter estimates but that the selection bias and coverage errors associated with convenience samples can lead to biased and misleading results. We also suggest practical alternatives to convenience samples that mix probability and convenience sampling. For example, a sample of land areas can be selected using a probability design that oversamples areas with larger animal populations, followed by harvesting of individual animals within sampled areas using a convenience sampling method.
Quantile Functions, Convergence in Quantile, and Extreme Value Distribution Theory.

DTIC Science & Technology

1980-11-01

Gnanadesikan (1968). Quantile functions are advocated by Parzen (1979) as providing an approach to probability-based data analysis. Quantile functions are... Gnanadesikan, R. (1968). Probability Plotting Methods for the Analysis of Data, Biometrika, 55, 1-17.
Correlation between discrete probability and reaction front propagation rate in heterogeneous mixtures

NASA Astrophysics Data System (ADS)

Naine, Tarun Bharath; Gundawar, Manoj Kumar

2017-09-01

We demonstrate a very powerful correlation between the discrete probability of distances of neighboring cells and the thermal wave propagation rate, for a system of cells spread on a one-dimensional chain. A gamma distribution is employed to model the distances of neighboring cells. In the absence of an analytical solution, and because the differences in ignition times of adjacent reaction cells follow non-Markovian statistics, the solution for the thermal wave propagation rate for a one-dimensional system with randomly distributed cells is invariably obtained by numerical simulations. However, such simulations, which are based on Monte Carlo methods, require several iterations of calculations for different realizations of the distribution of adjacent cells. For several one-dimensional systems, differing in the value of the shape parameter of the gamma distribution, we show that the average reaction front propagation rates obtained from a discrete probability between two limits show excellent agreement with those obtained numerically. With the upper limit at 1.3, the lower limit depends on the non-dimensional ignition temperature. Additionally, this approach also facilitates the prediction of burning limits of heterogeneous thermal mixtures. The proposed method completely eliminates the need for laborious, time-intensive numerical calculations; the thermal wave propagation rates can now be calculated based only on the macroscopic entity of discrete probability.
Method for localizing and isolating an errant process step

DOEpatents

Tobin, Jr., Kenneth W.; Karnowski, Thomas P.; Ferrell, Regina K.

2003-01-01

A method for localizing and isolating an errant process includes the steps of retrieving from a defect image database a selection of images each image having image content similar to image content extracted from a query image depicting a defect, each image in the selection having corresponding defect characterization data. A conditional probability distribution of the defect having occurred in a particular process step is derived from the defect characterization data. A process step as a highest probable source of the defect according to the derived conditional probability distribution is then identified. A method for process step defect identification includes the steps of characterizing anomalies in a product, the anomalies detected by an imaging system. A query image of a product defect is then acquired. A particular characterized anomaly is then correlated with the query image. An errant process step is then associated with the correlated image.
Bayesian soft X-ray tomography using non-stationary Gaussian Processes

NASA Astrophysics Data System (ADS)

Li, Dong; Svensson, J.; Thomsen, H.; Medina, F.; Werner, A.; Wolf, R.

2013-08-01

In this study, a Bayesian non-stationary Gaussian Process (GP) method for the inference of the soft X-ray emissivity distribution along with its associated uncertainties has been developed. For the investigation of equilibrium conditions and fast magnetohydrodynamic behaviors in nuclear fusion plasmas, it is important to infer, especially in the plasma center, spatially resolved soft X-ray profiles from a limited number of noisy line integral measurements. For this ill-posed inversion problem, Bayesian probability theory can provide a posterior probability distribution over all possible solutions under given model assumptions. Specifically, the use of a non-stationary GP to model the emission allows the model to adapt to the varying length scales of the underlying diffusion process. In contrast to other conventional methods, the prior regularization is realized in a probability form, which enhances the capability of uncertainty analysis; scientists concerned with the reliability of their results will benefit from it. Under the assumption of normally distributed noise, the posterior distribution evaluated at a discrete number of points becomes a multivariate normal distribution whose mean and covariance are analytically available, making inversions and calculation of uncertainty fast. Additionally, the hyper-parameters embedded in the model assumption can be optimized through a Bayesian Occam's Razor formalism and thereby automatically adjust the model complexity. This method is shown to produce convincing reconstructions and good agreement with independently calculated results from the Maximum Entropy and Equilibrium-Based Iterative Tomography Algorithm methods.
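The statement that the posterior mean and covariance are analytically available under normally distributed noise can be illustrated with the standard linear-Gaussian formulas. The sketch below is a toy one-dimensional example under stated assumptions: it uses a stationary squared-exponential kernel for brevity (the paper's method uses a non-stationary GP), a zero prior mean, and a hypothetical geometry matrix `A` of overlapping line integrals; none of the names or numbers come from the paper.

```python
import numpy as np

def se_kernel(x, length, sigma_f):
    """Stationary squared-exponential covariance (a simplifying assumption)."""
    diff = x[:, None] - x[None, :]
    return sigma_f**2 * np.exp(-0.5 * (diff / length) ** 2)

def gp_linear_posterior(A, d, K, noise_var):
    """Posterior mean and covariance of f given d = A f + noise, f ~ N(0, K)."""
    S = A @ K @ A.T + noise_var * np.eye(A.shape[0])   # covariance of the data
    gain = np.linalg.solve(S, A @ K).T                 # K A^T S^{-1}
    mean = gain @ d
    cov = K - gain @ A @ K
    return mean, cov

# Toy 1-D "emissivity" profile on 30 grid points (hypothetical).
x = np.linspace(0.0, 1.0, 30)
dx = x[1] - x[0]
f_true = np.exp(-((x - 0.5) / 0.15) ** 2)

# Hypothetical geometry matrix: 8 overlapping chords, each integrating 9 cells.
A = np.zeros((8, x.size))
for i in range(8):
    A[i, 3 * i : 3 * i + 9] = dx

rng = np.random.default_rng(1)
noise_var = 1e-4
d = A @ f_true + rng.normal(0.0, np.sqrt(noise_var), size=8)

K = se_kernel(x, length=0.2, sigma_f=1.0)
f_mean, f_cov = gp_linear_posterior(A, d, K, noise_var)
f_std = np.sqrt(np.clip(np.diag(f_cov), 0.0, None))   # pointwise uncertainty
```

Only an 8 by 8 system is solved here, which is why inversion and uncertainty calculation are fast when the number of measurements is small.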
Bayesian soft X-ray tomography using non-stationary Gaussian Processes.

PubMed

Li, Dong; Svensson, J; Thomsen, H; Medina, F; Werner, A; Wolf, R

2013-08-01

In this study, a Bayesian non-stationary Gaussian Process (GP) method for the inference of the soft X-ray emissivity distribution along with its associated uncertainties has been developed. For the investigation of equilibrium conditions and fast magnetohydrodynamic behaviors in nuclear fusion plasmas, it is important to infer, especially in the plasma center, spatially resolved soft X-ray profiles from a limited number of noisy line integral measurements. For this ill-posed inversion problem, Bayesian probability theory can provide a posterior probability distribution over all possible solutions under given model assumptions. Specifically, the use of a non-stationary GP to model the emission allows the model to adapt to the varying length scales of the underlying diffusion process. In contrast to other conventional methods, the prior regularization is realized in a probability form, which enhances the capability of uncertainty analysis; scientists concerned with the reliability of their results will benefit from it. Under the assumption of normally distributed noise, the posterior distribution evaluated at a discrete number of points becomes a multivariate normal distribution whose mean and covariance are analytically available, making inversions and calculation of uncertainty fast. Additionally, the hyper-parameters embedded in the model assumption can be optimized through a Bayesian Occam's Razor formalism and thereby automatically adjust the model complexity. This method is shown to produce convincing reconstructions and good agreement with independently calculated results from the Maximum Entropy and Equilibrium-Based Iterative Tomography Algorithm methods.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Hang, E-mail: hangchen@mit.edu; Thill, Peter; Cao, Jianshu

In biochemical systems, intrinsic noise may drive the system to switch from one stable state to another. We investigate how kinetic switching between stable states in a bistable network is influenced by dynamic disorder, i.e., fluctuations in the rate coefficients. Using the geometric minimum action method, we first investigate the optimal transition paths and the corresponding minimum actions based on a genetic toggle switch model in which reaction coefficients are drawn from a discrete probability distribution. For the continuous probability distribution of the rate coefficient, we then consider two models of dynamic disorder in which reaction coefficients undergo different stochastic processes with the same stationary distribution. In one, the kinetic parameters follow a discrete Markov process and in the other they follow continuous Langevin dynamics. We find that regulation of the parameters modulating the dynamic disorder, as has been demonstrated to occur through allosteric control in bistable networks in the immune system, can be crucial in shaping the statistics of optimal transition paths, transition probabilities, and the stationary probability distribution of the network.
A moment-convergence method for stochastic analysis of biochemical reaction networks.

PubMed

Zhang, Jiajun; Nie, Qing; Zhou, Tianshou

2016-05-21

Traditional moment-closure methods need to assume that high-order cumulants of a probability distribution approximate to zero. However, this strong assumption is not satisfied for many biochemical reaction networks. Here, we introduce convergent moments (defined in mathematics as the coefficients in the Taylor expansion of the probability-generating function at some point) to overcome this drawback of the moment-closure methods. As such, we develop a new analysis method for stochastic chemical kinetics. This method provides an accurate approximation for the master probability equation (MPE). In particular, the connection between low-order convergent moments and rate constants can be more easily derived in terms of explicit and analytical forms, allowing insights that would be difficult to obtain through direct simulation or manipulation of the MPE. In addition, it provides an accurate and efficient way to compute steady-state or transient probability distribution, avoiding the algorithmic difficulty associated with stiffness of the MPE due to large differences in sizes of rate constants. Applications of the method to several systems reveal nontrivial stochastic mechanisms of gene expression dynamics, e.g., intrinsic fluctuations can induce transient bimodality and amplify transient signals, and slow switching between promoter states can increase fluctuations in spatially heterogeneous signals. The overall approach has broad applications in modeling, analysis, and computation of complex biochemical networks with intrinsic noise.
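The definition of convergent moments quoted in this abstract (Taylor coefficients of the probability-generating function at some point) can be made concrete with a small numerical sketch. It assumes the usual generating function G(z) = sum_n P(n) z^n, so that the k-th coefficient at the point c is b_k = sum_n C(n, k) c^(n-k) P(n); the function name, the Poisson test distribution and the truncation at 60 terms are illustrative choices, and the paper's precise conventions may differ.

```python
from math import comb, exp, factorial

def convergent_moments(pmf, c, order):
    """Taylor coefficients b_0..b_order of G(z) = sum_n pmf[n] z**n at z = c.

    Uses b_k = G^(k)(c) / k! = sum_{n >= k} C(n, k) * c**(n - k) * pmf[n].
    """
    return [
        sum(comb(n, k) * c ** (n - k) * p for n, p in enumerate(pmf) if n >= k)
        for k in range(order + 1)
    ]

# Test distribution: Poisson with mean 2, truncated at 60 terms (assumption).
lam = 2.0
poisson_pmf = [exp(-lam) * lam**n / factorial(n) for n in range(60)]

# At c = 1 the coefficients are the binomial moments, lam**k / k! for a Poisson.
b_at_one = convergent_moments(poisson_pmf, c=1.0, order=3)

# At another expansion point the closed form is lam**k * exp(lam*(c-1)) / k!.
b_at_half = convergent_moments(poisson_pmf, c=0.5, order=3)
```

For the Poisson case the coefficients decay factorially with k, which gives a feel for why a truncated expansion of this kind can converge where setting high-order cumulants to zero may not be justified.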
Dynamical complexity changes during two forms of meditation

NASA Astrophysics Data System (ADS)

Li, Jin; Hu, Jing; Zhang, Yinhong; Zhang, Xiaofeng

2011-06-01

Detection of dynamical complexity changes in natural and man-made systems has deep scientific and practical meaning. We use the base-scale entropy method to analyze dynamical complexity changes for heart rate variability (HRV) series during specific traditional forms of Chinese Chi and Kundalini Yoga meditation techniques in healthy young adults. The results show that dynamical complexity decreases in meditation states for the two forms of meditation. Meanwhile, we detected changes in the probability distribution of m-words during meditation and explained these changes using the probability distribution of the sine function. The base-scale entropy method may be used on a wider range of physiologic signals.

Using the Pearson Distribution for Synthesis of the Suboptimal Algorithms for Filtering Multi-Dimensional Markov Processes

NASA Astrophysics Data System (ADS)

Mit'kin, A. S.; Pogorelov, V. A.; Chub, E. G.

2015-08-01

We consider the method of constructing the suboptimal filter on the basis of approximating the a posteriori probability density of the multidimensional Markov process by the Pearson distributions. The proposed method can efficiently be used for approximating asymmetric, excessive, and finite densities.
Tree Biomass Estimation of Chinese fir (Cunninghamia lanceolata) Based on Bayesian Method

PubMed Central

Zhang, Jianguo

2013-01-01

Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) is the most important conifer species for timber production with huge distribution area in southern China. Accurate estimation of biomass is required for accounting and monitoring Chinese forest carbon stocking. In the study, allometric equation was used to analyze tree biomass of Chinese fir. The common methods for estimating allometric model have taken the classical approach based on the frequency interpretation of probability. However, many different biotic and abiotic factors introduce variability in Chinese fir biomass model, suggesting that parameters of biomass model are better represented by probability distributions rather than fixed values as classical method. To deal with the problem, Bayesian method was used for estimating Chinese fir biomass model. In the Bayesian framework, two priors were introduced: non-informative priors and informative priors. For informative priors, 32 biomass equations of Chinese fir were collected from published literature in the paper. The parameter distributions from published literature were regarded as prior distributions in Bayesian model for estimating Chinese fir biomass. Therefore, the Bayesian method with informative priors was better than non-informative priors and classical method, which provides a reasonable method for estimating Chinese fir biomass. PMID:24278198
Tree biomass estimation of Chinese fir (Cunninghamia lanceolata) based on Bayesian method.

PubMed

Zhang, Xiongqing; Duan, Aiguo; Zhang, Jianguo

2013-01-01

Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) is the most important conifer species for timber production with huge distribution area in southern China. Accurate estimation of biomass is required for accounting and monitoring Chinese forest carbon stocking. In the study, allometric equation W = a(D^2 H)^b was used to analyze tree biomass of Chinese fir. The common methods for estimating allometric model have taken the classical approach based on the frequency interpretation of probability. However, many different biotic and abiotic factors introduce variability in Chinese fir biomass model, suggesting that parameters of biomass model are better represented by probability distributions rather than fixed values as classical method. To deal with the problem, Bayesian method was used for estimating Chinese fir biomass model. In the Bayesian framework, two priors were introduced: non-informative priors and informative priors. For informative priors, 32 biomass equations of Chinese fir were collected from published literature in the paper. The parameter distributions from published literature were regarded as prior distributions in Bayesian model for estimating Chinese fir biomass. Therefore, the Bayesian method with informative priors was better than non-informative priors and classical method, which provides a reasonable method for estimating Chinese fir biomass.
Ensemble-Biased Metadynamics: A Molecular Simulation Method to Sample Experimental Distributions

PubMed Central

Marinelli, Fabrizio; Faraldo-Gómez, José D.

2015-01-01

We introduce an enhanced-sampling method for molecular dynamics (MD) simulations referred to as ensemble-biased metadynamics (EBMetaD). The method biases a conventional MD simulation to sample a molecular ensemble that is consistent with one or more probability distributions known a priori, e.g., experimental intramolecular distance distributions obtained by double electron-electron resonance or other spectroscopic techniques. To this end, EBMetaD adds an adaptive biasing potential throughout the simulation that discourages sampling of configurations inconsistent with the target probability distributions. The bias introduced is the minimum necessary to fulfill the target distributions, i.e., EBMetaD satisfies the maximum-entropy principle. Unlike other methods, EBMetaD does not require multiple simulation replicas or the introduction of Lagrange multipliers, and is therefore computationally efficient and straightforward in practice. We demonstrate the performance and accuracy of the method for a model system as well as for spin-labeled T4 lysozyme in explicit water, and show how EBMetaD reproduces three double electron-electron resonance distance distributions concurrently within a few tens of nanoseconds of simulation time. EBMetaD is integrated in the open-source PLUMED plug-in (www.plumed-code.org), and can be therefore readily used with multiple MD engines. PMID:26083917
Characterising RNA secondary structure space using information entropy

PubMed Central

2013-01-01

Comparative methods for RNA secondary structure prediction use evolutionary information from RNA alignments to increase prediction accuracy. The model is often described in terms of stochastic context-free grammars (SCFGs), which generate a probability distribution over secondary structures. It is, however, unclear how this probability distribution changes as a function of the input alignment. As prediction programs typically only return a single secondary structure, better characterisation of the underlying probability space of RNA secondary structures is of great interest. In this work, we show how to efficiently compute the information entropy of the probability distribution over RNA secondary structures produced for RNA alignments by a phylo-SCFG, and implement it for the PPfold model. We also discuss interpretations and applications of this quantity, including how it can clarify reasons for low prediction reliability scores. PPfold and its source code are available from http://birc.au.dk/software/ppfold/. PMID:23368905
Bayesian statistics and Monte Carlo methods

NASA Astrophysics Data System (ADS)

Koch, K. R.

2018-03-01

The Bayesian approach allows an intuitive way to derive the methods of statistics. Probability is defined as a measure of the plausibility of statements or propositions. Three rules are sufficient to obtain the laws of probability. If the statements refer to the numerical values of variables, the so-called random variables, univariate and multivariate distributions follow. They lead to the point estimation by which unknown quantities, i.e. unknown parameters, are computed from measurements. The unknown parameters are random variables; they are fixed quantities in traditional statistics, which is not founded on Bayes' theorem. Bayesian statistics therefore recommends itself for Monte Carlo methods, which generate random variates from given distributions. Monte Carlo methods, of course, can also be applied in traditional statistics. The unknown parameters are introduced as functions of the measurements, and the Monte Carlo methods give the covariance matrix and the expectation of these functions. A confidence region is derived where the unknown parameters are situated with a given probability. Following a method of traditional statistics, hypotheses are tested by determining whether a value for an unknown parameter lies inside or outside the confidence region. The error propagation of a random vector by the Monte Carlo methods is presented as an application. If the random vector results from a nonlinearly transformed vector, its covariance matrix and its expectation follow from the Monte Carlo estimate. This saves a considerable amount of derivatives to be computed, and errors of the linearization are avoided. The Monte Carlo method is therefore efficient. If the functions of the measurements are given by a sum of two or more random vectors with different multivariate distributions, the resulting distribution is generally not known. The Monte Carlo methods are then needed to obtain the covariance matrix and the expectation of the sum.
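The error propagation application mentioned in this abstract can be sketched in a few lines. The example below is hypothetical and not taken from the paper: a polar measurement (r, theta) with an assumed normal distribution is transformed nonlinearly to Cartesian coordinates, and the expectation and covariance matrix of the result are estimated from random samples. A first-order (Jacobian) propagation is computed alongside only for comparison.

```python
import numpy as np

def polar_to_cartesian(p):
    """Nonlinear transform (r, theta) -> (x, y); works on one point or many."""
    r, theta = p[..., 0], p[..., 1]
    return np.stack([r * np.cos(theta), r * np.sin(theta)], axis=-1)

# Hypothetical measurement: r = 10 +/- 0.1, theta = 45 deg +/- 0.02 rad.
x_mean = np.array([10.0, np.pi / 4])
x_cov = np.diag([0.1**2, 0.02**2])

# Monte Carlo estimate: sample the input, transform, take sample moments.
rng = np.random.default_rng(42)
samples = rng.multivariate_normal(x_mean, x_cov, size=200_000)
y = polar_to_cartesian(samples)
y_mean = y.mean(axis=0)              # expectation of the transformed vector
y_cov = np.cov(y, rowvar=False)      # its covariance matrix

# Linearized propagation J Sigma J^T, which needs the derivatives explicitly.
r0, t0 = x_mean
J = np.array([[np.cos(t0), -r0 * np.sin(t0)],
              [np.sin(t0),  r0 * np.cos(t0)]])
y_cov_lin = J @ x_cov @ J.T
```

With these small input uncertainties the two covariance matrices nearly coincide; the Monte Carlo estimate requires no derivatives and stays valid when the nonlinearity is too strong for the linearization.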
[Establishment of the mathematic model of total quantum statistical moment standard similarity for application to medical theoretical research].

PubMed

He, Fu-yuan; Deng, Kai-wen; Huang, Sheng; Liu, Wen-long; Shi, Ji-lian

2013-09-01

The paper aims to elucidate and establish a new mathematical model, the total quantum statistical moment standard similarity (TQSMSS), on the basis of the original total quantum statistical moment model, and to illustrate the application of the model to medical theoretical research. The model was established by combining the statistical moment principle with the properties of the normal distribution probability density function, then validated and illustrated by the pharmacokinetics of three ingredients in Buyanghuanwu decoction and of three data analytical methods for them, and by analysis of the chromatographic fingerprints for various extracts with different solubility parameter solvents dissolving the Buyanghuanwu-decoction extract. The established model consists of the following main parameters: (1) total quantum statistical moment similarity as ST, an overlapped area by two normal distribution probability density curves in conversion of the two TQSM parameters; (2) total variability as DT, a confidence limit of standard normal accumulation probability which is equal to the absolute difference value between the two normal accumulation probabilities within integration of their curve nodical; (3) total variable probability as 1-Ss, standard normal distribution probability within interval of D(T); (4) total variable probability (1-beta)alpha and (5) stable confident probability beta(1-alpha): the correct probability to make positive and negative conclusions under confident coefficient alpha.
With the model, we analyzed the TQSMS similarities of the pharmacokinetics of three ingredients in Buyanghuanwu decoction and of three data analytical methods for them; these were in the range 0.3852-0.9875, indicating different pharmacokinetic behaviors. The TQSMS similarities (ST) of the chromatographic fingerprints for various extracts with different solubility parameter solvents dissolving the Buyanghuanwu-decoction extract were in the range 0.6842-0.9992, showing that the various solvent extracts had different constituents. The TQSMSS can characterize sample similarity, by which we can quantify the correct probability, with the power of the test, of making positive and negative conclusions under confident coefficient alpha, whether or not the samples come from the same population, and by which we can realize an analysis at both macroscopic and microscopic levels, as an important similarity analysis method for medical theoretical research.
Goodness of fit of probability distributions for sightings as species approach extinction.

PubMed

Vogel, Richard M; Hosking, Jonathan R M; Elphick, Chris S; Roberts, David L; Reed, J Michael

2009-04-01

Estimating the probability that a species is extinct and the timing of extinctions is useful in biological fields ranging from paleoecology to conservation biology. Various statistical methods have been introduced to infer the time of extinction and extinction probability from a series of individual sightings. There is little evidence, however, as to which of these models provide adequate fit to actual sighting records. We use L-moment diagrams and probability plot correlation coefficient (PPCC) hypothesis tests to evaluate the goodness of fit of various probabilistic models to sighting data collected for a set of North American and Hawaiian bird populations that have either gone extinct, or are suspected of having gone extinct, during the past 150 years. For our data, the uniform, truncated exponential, and generalized Pareto models performed moderately well, but the Weibull model performed poorly. Of the acceptable models, the uniform distribution performed best based on PPCC goodness of fit comparisons and sequential Bonferroni-type tests. Further analyses using field significance tests suggest that although the uniform distribution is the best of those considered, additional work remains to evaluate the truncated exponential model more fully. The methods we present here provide a framework for evaluating subsequent models.
Properties of the probability distribution associated with the largest event in an earthquake cluster and their implications to foreshocks.

PubMed

Zhuang, Jiancang; Ogata, Yosihiko

2006-04-01

The space-time epidemic-type aftershock sequence model is a stochastic branching process in which earthquake activity is classified into background and clustering components and each earthquake triggers other earthquakes independently according to certain rules. This paper gives the probability distributions associated with the largest event in a cluster and their properties for all three cases when the process is subcritical, critical, and supercritical. One of the direct uses of these probability distributions is to evaluate the probability of an earthquake to be a foreshock, and magnitude distributions of foreshocks and nonforeshock earthquakes. To verify these theoretical results, the Japan Meteorological Agency earthquake catalog is analyzed. The proportion of events that have 1 or more larger descendants in total events is found to be as high as about 15%. When the differences between background events and triggered event in the behavior of triggering children are considered, a background event has a probability about 8% to be a foreshock. This probability decreases when the magnitude of the background event increases. These results, obtained from a complicated clustering model, where the characteristics of background events and triggered events are different, are consistent with the results obtained in [Ogata, Geophys. J. Int. 127, 17 (1996)] by using the conventional single-linked cluster declustering method.
Reconstructing the equilibrium Boltzmann distribution from well-tempered metadynamics.

PubMed

Bonomi, M; Barducci, A; Parrinello, M

2009-08-01

Metadynamics is a widely used and successful method for reconstructing the free-energy surface of complex systems as a function of a small number of suitably chosen collective variables. This is achieved by biasing the dynamics of the system. The bias acting on the collective variables distorts the probability distribution of the other variables. Here we present a simple reweighting algorithm for recovering the unbiased probability distribution of any variable from a well-tempered metadynamics simulation. We show the efficiency of the reweighting procedure by reconstructing the distribution of the four backbone dihedral angles of alanine dipeptide from two- and even one-dimensional metadynamics simulations. © 2009 Wiley Periodicals, Inc.
A method of decision analysis quantifying the effects of age and comorbidities on the probability of deriving significant benefit from medical treatments

PubMed Central

Bean, Nigel G.; Ruberu, Ravi P.

2017-01-01

Background The external validity, or generalizability, of trials and guidelines has been considered poor in the context of multiple morbidity. How multiple morbidity might affect the magnitude of benefit of a given treatment, and thereby external validity, has had little study. Objective To provide a method of decision analysis to quantify the effects of age and comorbidity on the probability of deriving a given magnitude of treatment benefit. Design We developed a method to calculate probabilistically the effect of all of a patient’s comorbidities on their underlying utility, or well-being, at a future time point. From this, we derived a distribution of possible magnitudes of treatment benefit at that future time point. We then expressed this distribution as the probability of deriving at least a given magnitude of treatment benefit. To demonstrate the applicability of this method of decision analysis, we applied it to the treatment of hypercholesterolaemia in a geriatric population of 50 individuals. We highlighted the results of four of these individuals. Results This method of analysis provided individualized quantifications of the effect of age and comorbidity on the probability of treatment benefit. The average probability of deriving a benefit, of at least 50% of the magnitude of benefit available to an individual without comorbidity, was only 0.8%. Conclusion The effects of age and comorbidity on the probability of deriving significant treatment benefits can be quantified for any individual. Even without consideration of other factors affecting external validity, these effects may be sufficient to guide decision-making. PMID:29090189
Nonlinear Spatial Inversion Without Monte Carlo Sampling

NASA Astrophysics Data System (ADS)

Curtis, A.; Nawaz, A.

2017-12-01

High-dimensional, nonlinear inverse or inference problems usually have non-unique solutions. The distribution of solutions is described by probability distributions, and these are usually found using Monte Carlo (MC) sampling methods. These take pseudo-random samples of models in parameter space, calculate the probability of each sample given available data and other information, and thus map out high or low probability values of model parameters. However, such methods would converge to the solution only as the number of samples tends to infinity; in practice, MC is found to be slow to converge, convergence is not guaranteed to be achieved in finite time, and detection of convergence requires the use of subjective criteria. We propose a method for Bayesian inversion of categorical variables such as geological facies or rock types in spatial problems, which requires no sampling at all. The method uses a 2-D Hidden Markov Model over a grid of cells, where observations represent localized data constraining the model in each cell. The data in our example application are seismic properties such as P- and S-wave impedances or rock density; our model parameters are the hidden states and represent the geological rock types in each cell. The observations at each location are assumed to depend on the facies at that location only, an assumption referred to as 'localized likelihoods'. However, the facies at a location cannot be determined solely by the observation at that location as it also depends on prior information concerning its correlation with the spatial distribution of facies elsewhere. Such prior information is included in the inversion in the form of a training image which represents a conceptual depiction of the distribution of local geologies that might be expected, but other forms of prior information can be used in the method as desired.
The method provides direct (pseudo-analytic) estimates of posterior marginal probability distributions over each variable, so these do not need to be estimated from samples as is required in MC methods. On a 2-D test example the method is shown to outperform previous methods significantly, and at a fraction of the computational cost. In many foreseeable applications there are therefore no serious impediments to extending the method to 3-D spatial models.
Comparative analysis through probability distributions of a data set

NASA Astrophysics Data System (ADS)

Cristea, Gabriel; Constantinescu, Dan Mihai

2018-02-01

In practice, probability distributions are applied in such diverse fields as risk analysis, reliability engineering, chemical engineering, hydrology, image processing, physics, market research, business and economic research, customer support, medicine, sociology, and demography. This article highlights important aspects of fitting probability distributions to data and applying the analysis results to make informed decisions. A number of statistical methods are available to help select the best-fitting model. Some of the graphs display both input data and fitted distributions at the same time, as probability density and cumulative distribution. Goodness-of-fit tests can be used to determine whether a certain distribution is a good fit. The main idea is to measure the "distance" between the data and the tested distribution, and compare that distance to some threshold values. Calculating the goodness-of-fit statistics also makes it possible to rank the fitted distributions according to how well they fit the data. This feature is very helpful for comparing the fitted models. The paper presents a comparison of the most commonly used goodness-of-fit tests: Kolmogorov-Smirnov, Anderson-Darling, and Chi-Squared. A large set of data is analyzed and conclusions are drawn by visualizing the data, comparing multiple fitted distributions and selecting the best model. These graphs should be viewed as an addition to the goodness-of-fit tests.
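The "distance" idea in the abstract above can be illustrated with the Kolmogorov-Smirnov statistic, the largest gap between the empirical CDF of the data and a candidate CDF. The following is a minimal sketch, not the authors' procedure: the function names `ks_statistic` and `rank_by_ks` are hypothetical, the candidate distributions are given fixed parameters rather than fitted ones, and no critical values (thresholds) are computed.

```python
import math

def ks_statistic(sample, cdf):
    """Largest vertical gap between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def normal_cdf(x, mu=0.0, sigma=1.0):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def exponential_cdf(x, rate=1.0):
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def rank_by_ks(sample, candidates):
    """Order candidate distributions from smallest to largest KS distance."""
    return sorted((ks_statistic(sample, cdf), name) for name, cdf in candidates.items())

if __name__ == "__main__":
    data = [-1.0, -0.3, 0.1, 0.4, 1.2]  # illustrative values only
    for distance, name in rank_by_ks(data, {"normal": normal_cdf,
                                            "exponential": exponential_cdf}):
        print(f"{name}: D = {distance:.3f}")
```

A smaller statistic means a closer fit, so the first entry of the ranking is the preferred candidate under this criterion alone; a formal test would also compare the statistic with a threshold for the chosen significance level.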
A microcomputer program for energy assessment and aggregation using the triangular probability distribution

USGS Publications Warehouse

Crovelli, R.A.; Balay, R.H.

1991-01-01

A general risk-analysis method was developed for petroleum-resource assessment and other applications. The triangular probability distribution is used as a model with an analytic aggregation methodology based on probability theory rather than Monte-Carlo simulation. Among the advantages of the analytic method are its computational speed and flexibility, and the saving of time and cost on a microcomputer. The input into the model consists of a set of components (e.g. geologic provinces) and, for each component, three potential resource estimates: minimum, most likely (mode), and maximum. Assuming a triangular probability distribution, the mean, standard deviation, and seven fractiles (F100, F95, F75, F50, F25, F5, and F0) are computed for each component, where for example, the probability of more than F95 is equal to 0.95. The components are aggregated by combining the means, standard deviations, and respective fractiles under three possible situations: (1) perfect positive correlation, (2) complete independence, and (3) any degree of dependence between these two polar situations. A package of computer programs named the TRIAGG system was written in the Turbo Pascal 4.0 language for performing the analytic probabilistic methodology. The system consists of a program for processing triangular probability distribution assessments and aggregations, and a separate aggregation routine for aggregating aggregations. The user's documentation and program diskette of the TRIAGG system are available from USGS Open File Services. TRIAGG requires an IBM-PC/XT/AT compatible microcomputer with 256 kbyte of main memory, MS-DOS 3.1 or later, either two diskette drives or a fixed disk, and a 132 column printer. A graphics adapter and color display are optional. © 1991.
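The per-component quantities named in the abstract above follow from the standard closed-form results for a triangular distribution. The sketch below is an illustration in Python and is not the TRIAGG code: the function names are hypothetical, fractiles use the exceedance convention stated in the abstract (the probability of exceeding F95 is 0.95), and aggregation is shown only for means and standard deviations under the two polar cases.

```python
import math

def triangular_summary(minimum, mode, maximum):
    """Mean, standard deviation, and exceedance fractiles of a triangular distribution."""
    a, c, b = float(minimum), float(mode), float(maximum)
    if not (a <= c <= b and a < b):
        raise ValueError("need minimum <= mode <= maximum and minimum < maximum")
    mean = (a + b + c) / 3.0
    variance = (a * a + b * b + c * c - a * b - a * c - b * c) / 18.0

    def quantile(p):
        # Inverse CDF: rising branch up to the mode, falling branch after it.
        split = (c - a) / (b - a)
        if p <= split:
            return a + math.sqrt(p * (b - a) * (c - a))
        return b - math.sqrt((1.0 - p) * (b - a) * (b - c))

    # F_k is the value exceeded with probability k percent.
    fractiles = {f"F{k}": quantile(1.0 - k / 100.0)
                 for k in (100, 95, 75, 50, 25, 5, 0)}
    return mean, math.sqrt(variance), fractiles

def aggregate_mean_std(components, perfectly_correlated):
    """Combine (mean, std) pairs under perfect positive correlation or independence."""
    total_mean = sum(m for m, _ in components)
    if perfectly_correlated:
        total_std = sum(s for _, s in components)  # standard deviations add
    else:
        total_std = math.sqrt(sum(s * s for _, s in components))  # variances add
    return total_mean, total_std

if __name__ == "__main__":
    mean, std, fractiles = triangular_summary(0.0, 1.0, 2.0)  # illustrative inputs
    print(mean, std, fractiles)
```

Intermediate degrees of dependence, the third situation in the abstract, are not covered by this sketch.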
Quantitative assessment of building fire risk to life safety.

PubMed

Guanquan, Chu; Jinhua, Sun

2008-06-01

This article presents a quantitative risk assessment framework for evaluating fire risk to life safety. Fire risk is divided into two parts: probability and corresponding consequence of every fire scenario. The time-dependent event tree technique is used to analyze probable fire scenarios based on the effect of fire protection systems on fire spread and smoke movement. To obtain the variation of occurrence probability with time, Markov chain is combined with a time-dependent event tree for stochastic analysis on the occurrence probability of fire scenarios. To obtain consequences of every fire scenario, some uncertainties are considered in the risk analysis process. When calculating the onset time to untenable conditions, a range of fires are designed based on different fire growth rates, after which uncertainty of onset time to untenable conditions can be characterized by probability distribution. When calculating occupant evacuation time, occupant premovement time is considered as a probability distribution. Consequences of a fire scenario can be evaluated according to probability distribution of evacuation time and onset time of untenable conditions. Then, fire risk to life safety can be evaluated based on occurrence probability and consequences of every fire scenario. To express the risk assessment method in detail, a commercial building is presented as a case study. A discussion compares the assessment result of the case study with fire statistics.
Percolation

NASA Astrophysics Data System (ADS)

Dávila, Alán; Escudero, Christian; López, Jorge

2004-10-01

Several methods have been developed in order to study phase transitions in nuclear fragmentation. The one used in this research is percolation. This method allows us to fit the resulting data to heavy ion collision experiments. In systems such as atomic nuclei or molecules, energy is put into the system. The system's particles move away from each other until their links are broken. Some particles will still be linked. The fragments' distribution is found to be a power law. We are then witnessing a critical phenomenon. In our model the particles are represented as occupied spaces in a cubical array. Each particle has a bond to each one of its 6 neighbors. Each bond can be active if the two particles are linked or inactive if they are not. When two or more particles are linked, a fragment is formed. The probability for a specific link to be broken cannot be calculated, so the probability for a bond to be active is used as a parameter when fitting the data. For a given probability p several arrays are generated. The fragments are counted. The fragments' distribution is then fitted to a power law. The probability that generates the best fit is taken as the critical probability that indicates a phase transition. The best fit is found by seeking the fragments' distribution that gives the minimal chi squared when compared to a power law. As additional evidence of criticality the entropy and normalized variance of the mass are also calculated for each probability.
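The counting step described in the abstract above can be sketched as bond percolation on a cubic array: every site is occupied, each nearest-neighbor bond is active with probability p, and linked sites are merged into fragments. This is a minimal illustration, not the authors' code. The name `fragment_sizes`, the union-find bookkeeping, and the free (non-periodic) boundaries are assumptions, and the power-law fit and chi-squared search over p are left out.

```python
import random
from collections import Counter

def fragment_sizes(n, p, rng):
    """Bond percolation on an n x n x n array; returns {fragment size: count}."""
    parent = list(range(n ** 3))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def index(x, y, z):
        return (x * n + y) * n + z

    for x in range(n):
        for y in range(n):
            for z in range(n):
                # Visit each bond once by looking only in the +x, +y, +z directions.
                for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
                    nx, ny, nz = x + dx, y + dy, z + dz
                    if nx < n and ny < n and nz < n and rng.random() < p:
                        ra, rb = find(index(x, y, z)), find(index(nx, ny, nz))
                        if ra != rb:
                            parent[ra] = rb  # the active bond links two fragments

    sites_per_root = Counter(find(i) for i in range(n ** 3))
    return Counter(sites_per_root.values())

if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed for repeatability
    print(sorted(fragment_sizes(6, 0.25, rng).items()))
```

Repeating the call for several arrays at each p and accumulating the counters gives the fragment distribution that would then be compared with a power law.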
The Laplace method for probability measures in Banach spaces

NASA Astrophysics Data System (ADS)

Piterbarg, V. I.; Fatalov, V. R.

1995-12-01

Contents §1. Introduction Chapter I. Asymptotic analysis of continual integrals in Banach space, depending on a large parameter §2. The large deviation principle and logarithmic asymptotics of continual integrals §3. Exact asymptotics of Gaussian integrals in Banach spaces: the Laplace method 3.1. The Laplace method for Gaussian integrals taken over the whole Hilbert space: isolated minimum points ([167], I) 3.2. The Laplace method for Gaussian integrals in Hilbert space: the manifold of minimum points ([167], II) 3.3. The Laplace method for Gaussian integrals in Banach space ([90], [174], [176]) 3.4. Exact asymptotics of large deviations of Gaussian norms §4. The Laplace method for distributions of sums of independent random elements with values in Banach space 4.1. The case of a non-degenerate minimum point ([137], I) 4.2. A degenerate isolated minimum point and the manifold of minimum points ([137], II) §5. Further examples 5.1. The Laplace method for the local time functional of a Markov symmetric process ([217]) 5.2. The Laplace method for diffusion processes, a finite number of non-degenerate minimum points ([116]) 5.3. Asymptotics of large deviations for Brownian motion in the Hölder norm 5.4. Non-asymptotic expansion of a strong stable law in Hilbert space ([41]) Chapter II. The double sum method - a version of the Laplace method in the space of continuous functions §6. Pickands' method of double sums 6.1. General situations 6.2. Asymptotics of the distribution of the maximum of a Gaussian stationary process 6.3. Asymptotics of the probability of a large excursion of a Gaussian non-stationary process §7. Probabilities of large deviations of trajectories of Gaussian fields 7.1. Homogeneous fields and fields with constant dispersion 7.2. Finitely many maximum points of dispersion 7.3. Manifold of maximum points of dispersion 7.4. Asymptotics of distributions of maxima of Wiener fields §8. 
Exact asymptotics of large deviations of the norm of Gaussian vectors and processes with values in the spaces L_k^p and l^2. Gaussian fields with the set of parameters in Hilbert space 8.1. Exact asymptotics of the distribution of the l_k^p-norm of a Gaussian finite-dimensional vector with dependent coordinates, p > 1 8.2. Exact asymptotics of probabilities of high excursions of trajectories of processes of type \chi^2 8.3. Asymptotics of the probabilities of large deviations of Gaussian processes with a set of parameters in Hilbert space [74] 8.4. Asymptotics of distributions of maxima of the norms of l^2-valued Gaussian processes 8.5. Exact asymptotics of large deviations for the l^2-valued Ornstein-Uhlenbeck process Bibliography
Sampling--how big a sample?

PubMed

Aitken, C G

1999-07-01

It is thought that, in a consignment of discrete units, a certain proportion of the units contain illegal material. A sample of the consignment is to be inspected. Various methods for the determination of the sample size are compared. The consignment will be considered as a random sample from some super-population of units, a certain proportion of which contain drugs. For large consignments, a probability distribution, known as the beta distribution, for the proportion of the consignment which contains illegal material is obtained. This distribution is based on prior beliefs about the proportion. Under certain specific conditions the beta distribution gives the same numerical results as an approach based on the binomial distribution. The binomial distribution provides a probability for the number of units in a sample which contain illegal material, conditional on knowing the proportion of the consignment which contains illegal material. This is in contrast to the beta distribution which provides probabilities for the proportion of a consignment which contains illegal material, conditional on knowing the number of units in the sample which contain illegal material. The interpretation when the beta distribution is used is much more intuitively satisfactory. It is also much more flexible in its ability to cater for prior beliefs which may vary given the different circumstances of different crimes. For small consignments, a distribution, known as the beta-binomial distribution, for the number of units in the consignment which are found to contain illegal material, is obtained, based on prior beliefs about the number of units in the consignment which are thought to contain illegal material. As with the beta and binomial distributions for large samples, it is shown that, in certain specific conditions, the beta-binomial and hypergeometric distributions give the same numerical results. 
However, the beta-binomial distribution, as with the beta distribution, has a more intuitively satisfactory interpretation and greater flexibility. The beta and the beta-binomial distributions provide methods for the determination of the minimum sample size to be taken from a consignment in order to satisfy a certain criterion. The criterion requires the specification of a proportion and a probability.
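For the large-consignment case described above, the sample-size criterion has a closed form under two simplifying assumptions that are made here for illustration and are not taken from the abstract: the prior on the proportion is uniform, Beta(1, 1), and every inspected unit turns out to contain illegal material. The posterior after m such units is then Beta(m + 1, 1), so the probability that the proportion exceeds theta0 is 1 - theta0^(m + 1). The function name `min_sample_size` is hypothetical.

```python
def min_sample_size(theta0, prob):
    """Smallest m with P(proportion > theta0 | m of m units positive) >= prob.

    Assumes a uniform Beta(1, 1) prior and a large consignment, so the
    posterior is Beta(m + 1, 1) with P(theta > theta0) = 1 - theta0**(m + 1).
    """
    if not (0.0 < theta0 < 1.0 and 0.0 < prob < 1.0):
        raise ValueError("theta0 and prob must lie strictly between 0 and 1")
    m = 0
    while 1.0 - theta0 ** (m + 1) < prob:
        m += 1
    return m

if __name__ == "__main__":
    # Criterion: at least 95% probability that more than half the units are positive.
    print(min_sample_size(0.5, 0.95))
```

Other priors, or samples in which some units are negative, change the posterior parameters and generally require evaluating the beta distribution function numerically.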
Superstatistics analysis of the ion current distribution function: Met3PbCl influence study.

PubMed

Miśkiewicz, Janusz; Trela, Zenon; Przestalski, Stanisław; Karcz, Waldemar

2010-09-01

A novel analysis of ion current time series is proposed. It is shown that higher (second, third and fourth) statistical moments of the ion current probability distribution function (PDF) can yield new information about ion channel properties. The method is illustrated on a two-state model where the PDF of the compound states are given by normal distributions. The proposed method was applied to the analysis of the SV cation channels of vacuolar membrane of Beta vulgaris and the influence of trimethyllead chloride (Met(3)PbCl) on the ion current probability distribution. Ion currents were measured by patch-clamp technique. It was shown that Met(3)PbCl influences the variance of the open-state ion current but does not alter the PDF of the closed-state ion current. Incorporation of higher statistical moments into the standard investigation of ion channel properties is proposed.
Multilevel sequential Monte Carlo samplers

DOE PAGES

Beskos, Alexandros; Jasra, Ajay; Law, Kody; ...

2016-08-24

Here, we study the approximation of expectations w.r.t. probability distributions associated to the solution of partial differential equations (PDEs); this scenario appears routinely in Bayesian inverse problems. In practice, one often has to solve the associated PDE numerically, using, for instance finite element methods and leading to a discretisation bias, with the step-size level $h_L$. In addition, the expectation cannot be computed analytically and one often resorts to Monte Carlo methods. In the context of this problem, it is known that the introduction of the multilevel Monte Carlo (MLMC) method can reduce the amount of computational effort to estimate expectations, for a given level of error. This is achieved via a telescoping identity associated to a Monte Carlo approximation of a sequence of probability distributions with discretisation levels $\infty > h_0 > h_1 > \cdots > h_L$. In many practical problems of interest, one cannot achieve an i.i.d. sampling of the associated sequence of probability distributions. A sequential Monte Carlo (SMC) version of the MLMC method is introduced to deal with this problem. In conclusion, it is shown that under appropriate assumptions, the attractive property of a reduction of the amount of computational effort to estimate expectations, for a given level of error, can be maintained within the SMC context.
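The telescoping identity mentioned in the abstract above can be written out as follows. The notation is an assumption introduced for illustration: $\eta_l$ denotes the probability distribution at discretisation level $h_l$ and $g$ a test function whose expectation is sought.

```latex
\mathbb{E}_{\eta_L}[g]
  = \mathbb{E}_{\eta_0}[g]
  + \sum_{l=1}^{L} \Bigl( \mathbb{E}_{\eta_l}[g] - \mathbb{E}_{\eta_{l-1}}[g] \Bigr),
\qquad \infty > h_0 > h_1 > \cdots > h_L .
```

Each term on the right is estimated separately. The coarse levels are cheap to sample, and the differences between adjacent fine levels are small, so they need fewer samples for a given error; this is the source of the cost reduction the abstract refers to.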

ATTITUDE FILTERING ON SO(3)

NASA Technical Reports Server (NTRS)

Markley, F. Landis

2005-01-01

A new method is presented for the simultaneous estimation of the attitude of a spacecraft and an N-vector of bias parameters. This method uses a probability distribution function defined on the Cartesian product of SO(3), the group of rotation matrices, and the Euclidean space $\mathbb{R}^N$. The Fokker-Planck equation propagates the probability distribution function between measurements, and Bayes's formula incorporates measurement update information. This approach avoids all the issues of singular attitude representations or singular covariance matrices encountered in extended Kalman filters. In addition, the filter has a consistent initialization for a completely unknown initial attitude, owing to the fact that SO(3) is a compact space.
Lake bed classification using acoustic data

USGS Publications Warehouse

Yin, Karen K.; Li, Xing; Bonde, John; Richards, Carl; Cholwek, Gary

1998-01-01

As part of our effort to identify the lake bed surficial substrates using remote sensing data, this work designs pattern classifiers by multivariate statistical methods. The probability distribution of the preprocessed acoustic signal is analyzed first. A confidence region approach is then adopted to improve the design of the existing classifier. A technique for further isolation is proposed which minimizes the expected loss from misclassification. The devices constructed are applicable for real-time lake bed categorization. A minimax approach is suggested to treat more general cases where the a priori probability distribution of the substrate types is unknown. Comparison of the suggested methods with the traditional likelihood ratio tests is discussed.
A multimodal detection model of dolphins to estimate abundance validated by field experiments.

PubMed

Akamatsu, Tomonari; Ura, Tamaki; Sugimatsu, Harumi; Bahl, Rajendar; Behera, Sandeep; Panda, Sudarsan; Khan, Muntaz; Kar, S K; Kar, C S; Kimura, Satoko; Sasaki-Yamamoto, Yukiko

2013-09-01

Abundance estimation of marine mammals requires matching of detection of an animal or a group of animals by two independent means. A multimodal detection model using visual and acoustic cues (surfacing and phonation) that enables abundance estimation of dolphins is proposed. The method does not require a specific time window to match the cues of both means for applying the mark-recapture method. The proposed model was evaluated using data obtained in field observations of Ganges River dolphins and Irrawaddy dolphins, as examples of dispersed and condensed distributions of animals, respectively. The acoustic detection probability was approximately 80%, 20% higher than that of visual detection for both species, regardless of the distribution of the animals in the present study sites. The abundance estimates of Ganges River dolphins and Irrawaddy dolphins agreed fairly well with the numbers reported in previous monitoring studies. The single animal detection probability was smaller than that of larger cluster size, as predicted by the model and confirmed by field data. However, dense groups of Irrawaddy dolphins showed differences in cluster sizes observed by visual and acoustic methods. Lower detection probability of single clusters of this species seemed to be caused by the clumped distribution of this species.
Value assignment and uncertainty evaluation for single-element reference solutions

NASA Astrophysics Data System (ADS)

Possolo, Antonio; Bodnar, Olha; Butler, Therese A.; Molloy, John L.; Winchester, Michael R.

2018-06-01

A Bayesian statistical procedure is proposed for value assignment and uncertainty evaluation for the mass fraction of the elemental analytes in single-element solutions distributed as NIST standard reference materials. The principal novelty that we describe is the use of information about relative differences observed historically between the measured values obtained via gravimetry and via high-performance inductively coupled plasma optical emission spectrometry, to quantify the uncertainty component attributable to between-method differences. This information is encapsulated in a prior probability distribution for the between-method uncertainty component, and it is then used, together with the information provided by current measurement data, to produce a probability distribution for the value of the measurand from which an estimate and evaluation of uncertainty are extracted using established statistical procedures.
A moment-convergence method for stochastic analysis of biochemical reaction networks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Jiajun; Nie, Qing; Zhou, Tianshou, E-mail: mcszhtsh@mail.sysu.edu.cn

Traditional moment-closure methods need to assume that high-order cumulants of a probability distribution are approximately zero. However, this strong assumption is not satisfied for many biochemical reaction networks. Here, we introduce convergent moments (defined in mathematics as the coefficients in the Taylor expansion of the probability-generating function at some point) to overcome this drawback of the moment-closure methods. As such, we develop a new analysis method for stochastic chemical kinetics. This method provides an accurate approximation for the master probability equation (MPE). In particular, the connection between low-order convergent moments and rate constants can be more easily derived in terms of explicit and analytical forms, allowing insights that would be difficult to obtain through direct simulation or manipulation of the MPE. In addition, it provides an accurate and efficient way to compute steady-state or transient probability distribution, avoiding the algorithmic difficulty associated with stiffness of the MPE due to large differences in sizes of rate constants. Applications of the method to several systems reveal nontrivial stochastic mechanisms of gene expression dynamics, e.g., intrinsic fluctuations can induce transient bimodality and amplify transient signals, and slow switching between promoter states can increase fluctuations in spatially heterogeneous signals. The overall approach has broad applications in modeling, analysis, and computation of complex biochemical networks with intrinsic noise.
Modelling road accident blackspots data with the discrete generalized Pareto distribution.

PubMed

Prieto, Faustino; Gómez-Déniz, Emilio; Sarabia, José María

2014-10-01

This study shows how road traffic networks events, in particular road accidents on blackspots, can be modelled with simple probabilistic distributions. We considered the number of crashes and the number of fatalities on Spanish blackspots in the period 2003-2007, from Spanish General Directorate of Traffic (DGT). We modelled those datasets, respectively, with the discrete generalized Pareto distribution (a discrete parametric model with three parameters) and with the discrete Lomax distribution (a discrete parametric model with two parameters, and particular case of the previous model). For that, we analyzed the basic properties of both parametric models: cumulative distribution, survival, probability mass, quantile and hazard functions, genesis and rth-order moments; applied two estimation methods of their parameters: the μ and (μ+1) frequency method and the maximum likelihood method; used two goodness-of-fit tests: Chi-square test and discrete Kolmogorov-Smirnov test based on bootstrap resampling; and compared them with the classical negative binomial distribution in terms of absolute probabilities and in models including covariates. We found that those probabilistic models can be useful to describe the road accident blackspots datasets analyzed. Copyright © 2014 Elsevier Ltd. All rights reserved.
Probability Distribution Estimated From the Minimum, Maximum, and Most Likely Values: Applied to Turbine Inlet Temperature Uncertainty

NASA Technical Reports Server (NTRS)

Holland, Frederic A., Jr.

2004-01-01

Modern engineering design practices are tending more toward the treatment of design parameters as random variables as opposed to fixed, or deterministic, values. The probabilistic design approach attempts to account for the uncertainty in design parameters by representing them as a distribution of values rather than as a single value. The motivations for this effort include preventing excessive overdesign as well as assessing and assuring reliability, both of which are important for aerospace applications. However, the determination of the probability distribution is a fundamental problem in reliability analysis. A random variable is often defined by the parameters of the theoretical distribution function that gives the best fit to experimental data. In many cases the distribution must be assumed from very limited information or data. Often the types of information that are available or reasonably estimated are the minimum, maximum, and most likely values of the design parameter. For these situations the beta distribution model is very convenient because the parameters that define the distribution can be easily determined from these three pieces of information. Widely used in the field of operations research, the beta model is very flexible and is also useful for estimating the mean and standard deviation of a random variable given only the aforementioned three values. However, an assumption is required to determine the four parameters of the beta distribution from only these three pieces of information (some of the more common distributions, like the normal, lognormal, gamma, and Weibull distributions, have two or three parameters). The conventional method assumes that the standard deviation is a certain fraction of the range. The beta parameters are then determined by solving a set of equations simultaneously. 
A new method developed in-house at the NASA Glenn Research Center assumes a value for one of the beta shape parameters based on an analogy with the normal distribution (ref.1). This new approach allows for a very simple and direct algebraic solution without restricting the standard deviation. The beta parameters obtained by the new method are comparable to the conventional method (and identical when the distribution is symmetrical). However, the proposed method generally produces a less peaked distribution with a slightly larger standard deviation (up to 7 percent) than the conventional method in cases where the distribution is asymmetric or skewed. The beta distribution model has now been implemented into the Fast Probability Integration (FPI) module used in the NESSUS computer code for probabilistic analyses of structures (ref. 2).
On the use of the energy probability distribution zeros in the study of phase transitions

NASA Astrophysics Data System (ADS)

Mól, L. A. S.; Rodrigues, R. G. M.; Stancioli, R. A.; Rocha, J. C. S.; Costa, B. V.

2018-04-01

This contribution is devoted to covering some technical aspects related to the use of the recently proposed energy probability distribution zeros in the study of phase transitions. This method is based on the partial knowledge of the partition function zeros and has been shown to be extremely efficient in precisely locating phase transition temperatures. It is based on an iterative method in such a way that the transition temperature can be approached at will. The iterative method will be detailed and some convergence issues that have been observed in its application to the 2D Ising model and to an artificial spin ice model will be shown, together with ways to circumvent them.
Statistical Inference in Graphical Models

DTIC Science & Technology

2008-06-17

fuse probability theory and graph theory in such a way as to permit efficient representation and computation with probability distributions. They...message passing. 59 viii 1. INTRODUCTION In approaching real-world problems, we often need to deal with uncertainty. Probability and statistics provide a...dynamic programming methods. However, for many sensors of interest, the signal-to-noise ratio does not allow such a treatment. Another source of
On-Orbit Collision Hazard Analysis in Low Earth Orbit Using the Poisson Probability Distribution (Version 1.0)

DOT National Transportation Integrated Search

1992-08-26

This document provides the basic information needed to estimate a general probability of collision in Low Earth Orbit (LEO). Although the method described in this primer is a first order approximation, its results are reasonable. Furthermore, t...
Extinction time of a stochastic predator-prey model by the generalized cell mapping method

NASA Astrophysics Data System (ADS)

Han, Qun; Xu, Wei; Hu, Bing; Huang, Dongmei; Sun, Jian-Qiao

2018-03-01

The stochastic response and extinction time of a predator-prey model with Gaussian white noise excitations are studied by the generalized cell mapping (GCM) method based on the short-time Gaussian approximation (STGA). The methods for stochastic response probability density functions (PDFs) and extinction time statistics are developed. The Taylor expansion is used to deal with non-polynomial nonlinear terms of the model for deriving the moment equations with Gaussian closure, which are needed for the STGA in order to compute the one-step transition probabilities. The work is validated with direct Monte Carlo simulations. We have presented the transient responses showing the evolution from a Gaussian initial distribution to a non-Gaussian steady-state one. The effects of the model parameter and noise intensities on the steady-state PDFs are discussed. It is also found that the effects of noise intensities on the extinction time statistics are opposite to the effects on the limit probability distributions of the survival species.
Bayesian data analysis tools for atomic physics

NASA Astrophysics Data System (ADS)

Trassinelli, Martino

2017-10-01

We present an introduction to some concepts of Bayesian data analysis in the context of atomic physics. Starting from the basic rules of probability, we present Bayes' theorem and its applications. In particular, we discuss how to calculate simple and joint probability distributions and the Bayesian evidence, a model-dependent quantity that allows one to assign probabilities to different hypotheses from the analysis of the same data set. To give some practical examples, these methods are applied to two concrete cases. In the first example, the presence or absence of a satellite line in an atomic spectrum is investigated. In the second example, we determine the most probable model among a set of possible profiles from the analysis of a statistically poor spectrum. We also show how to calculate the probability distribution of the main spectral component without having to determine the spectrum model uniquely. For these two studies, we use the program Nested_fit to calculate the different probability distributions and other related quantities. Nested_fit is a Fortran90/Python code developed over recent years for the analysis of atomic spectra. As indicated by the name, it is based on the nested algorithm, which is presented in detail together with the program itself.
Calculation of the number of Monte Carlo histories for a planetary protection probability of impact estimation

NASA Astrophysics Data System (ADS)

Barengoltz, Jack

2016-07-01

Monte Carlo (MC) is a common method to estimate probability, effectively by a simulation. For planetary protection, it may be used to estimate the probability of impact P_I by a launch vehicle (upper stage) of a protected planet. The object of the analysis is to provide a value for P_I with a given level of confidence (LOC) that the true value does not exceed the maximum allowed value of P_I. In order to determine the number of MC histories required, one must also guess the maximum number of hits that will occur in the analysis. This extra parameter is needed because a LOC is desired. If more hits occur, the MC analysis would indicate that the true value may exceed the specification value with a higher probability than the LOC. (In the worst case, even the mean value of the estimated P_I might exceed the specification value.) After the analysis is conducted, the actual number of hits is, of course, the mean. The number of hits arises from a small probability per history and a large number of histories; these are the classic requirements for a Poisson distribution. For a known Poisson distribution (the mean is the only parameter), the probability for some interval in the number of hits is calculable. Before the analysis, this is not possible. Fortunately, there are methods that can bound the unknown mean for a Poisson distribution. F. Garwood (1936, "Fiducial limits for the Poisson distribution," Biometrika 28, 437-442) published an appropriate method that uses the Chi-squared function, actually its inverse (the integral chi-squared function would yield probability α as a function of the mean μ and an actual value n), despite the notation used: This formula for the upper and lower limits of the mean μ with the two-tailed probability 1-α depends on the LOC α and an estimated value of the number of "successes" n. In a MC analysis for planetary protection, only the upper limit is of interest, i.e., the single-tailed distribution. (A smaller actual P_I is no problem.) One advantage of this method is that this function is available in EXCEL. Note that care must be taken with the definition of the CHIINV function (the inverse of the integral chi-squared distribution). The equivalent inequality in EXCEL is μ < CHIINV[1-α, 2(n+1)]. In practice, one calculates this upper limit for a specified LOC, α, and a guess of how many hits n will be found after the MC analysis. Then the estimate of the number of histories required is this upper limit divided by the specification for the allowed P_I (rounded up). However, if the number of hits actually exceeds the guess, the P_I requirement will be met only with a smaller LOC. A disadvantage is that the intervals about the mean are "in general too wide, yielding coverage probabilities much greater than 1-α" (G. Casella and C. Robert, 1988, Purdue University Technical Report #88-7 or Cornell University Technical Report BU-903-M). For planetary protection, this technical issue means that the upper limit of the interval and the probability associated with the interval (i.e., the LOC) are conservative.
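The sizing procedure described in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: instead of calling a chi-squared inverse, it solves the Poisson cumulative distribution directly by bisection, which gives the same one-sided upper limit. The function names and the example numbers (zero expected hits, a 99 percent LOC, an allowed P_I of 1e-4) are assumptions chosen for illustration.

```python
import math


def poisson_cdf(n, mu):
    """P(X <= n) for X ~ Poisson(mu), summed term by term."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def poisson_upper_limit(n_hits, loc, tol=1e-10):
    """One-sided upper bound on the Poisson mean at confidence `loc`.

    Solves P(X <= n_hits | mu) = 1 - loc by bisection. In the usual
    chi-squared form this equals half the `loc` quantile of a
    chi-squared distribution with 2 * (n_hits + 1) degrees of freedom.
    """
    target = 1.0 - loc
    lo, hi = 0.0, 1.0
    while poisson_cdf(n_hits, hi) > target:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_hits, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def histories_required(n_hits, loc, p_allowed):
    """Upper limit on the mean divided by the allowed probability, rounded up."""
    return math.ceil(poisson_upper_limit(n_hits, loc) / p_allowed)


if __name__ == "__main__":
    # Hypothetical requirement: P_I <= 1e-4 at 99% LOC, guessing zero hits.
    print(histories_required(n_hits=0, loc=0.99, p_allowed=1e-4))
```

With zero guessed hits the bound reduces to -ln(1 - LOC), which is a convenient hand check. Guessing more hits raises the bound and therefore the number of histories.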
Maximum entropy approach to statistical inference for an ocean acoustic waveguide.

PubMed

Knobles, D P; Sagers, J D; Koch, R A

2012-02-01

A conditional probability distribution suitable for estimating the statistical properties of ocean seabed parameter values inferred from acoustic measurements is derived from a maximum entropy principle. The specification of the expectation value for an error function constrains the maximization of an entropy functional. This constraint determines the sensitivity factor (β) to the error function of the resulting probability distribution, which is a canonical form that provides a conservative estimate of the uncertainty of the parameter values. From the conditional distribution, marginal distributions for individual parameters can be determined from integration over the other parameters. The approach is an alternative to obtaining the posterior probability distribution without an intermediary determination of the likelihood function followed by an application of Bayes' rule. In this paper the expectation value that specifies the constraint is determined from the values of the error function for the model solutions obtained from a sparse number of data samples. The method is applied to ocean acoustic measurements taken on the New Jersey continental shelf. The marginal probability distribution for the values of the sound speed ratio at the surface of the seabed and the source levels of a towed source are examined for different geoacoustic model representations. © 2012 Acoustical Society of America
Radial particle distributions in PARMILA simulation beams

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boicourt, G.P.

1984-03-01

The estimation of beam spill in particle accelerators is becoming of greater importance as higher current designs are being funded. To the present, no numerical method for predicting beam-spill has been available. In this paper, we present an approach to the loss-estimation problem that uses probability distributions fitted to particle-simulation beams. The properties of the PARMILA code's radial particle distribution are discussed, and a broad class of probability distributions is examined to check their ability to fit it. The possibility that the PARMILA distribution is a mixture is discussed, and a fitting distribution consisting of a mixture of two generalized gamma distributions is found. An efficient algorithm to accomplish the fit is presented. Examples of the relative prediction of beam spill are given. 26 references, 18 figures, 1 table.
Technology Development Risk Assessment for Space Transportation Systems

NASA Technical Reports Server (NTRS)

Mathias, Donovan L.; Godsell, Aga M.; Go, Susie

2006-01-01

A new approach for assessing development risk associated with technology development projects is presented. The method represents technology evolution in terms of sector-specific discrete development stages. A Monte Carlo simulation is used to generate development probability distributions based on statistical models of the discrete transitions. Development risk is derived from the resulting probability distributions and specific program requirements. Two sample cases are discussed to illustrate the approach, a single rocket engine development and a three-technology space transportation portfolio.
A probabilistic approach to photovoltaic generator performance prediction

NASA Astrophysics Data System (ADS)

Khallat, M. A.; Rahman, S.

1986-09-01

A method for predicting the performance of a photovoltaic (PV) generator based on long term climatological data and expected cell performance is described. The equations for cell model formulation are provided. Use of the statistical model for characterizing the insolation level is discussed. The insolation data is fitted to appropriate probability distribution functions (Weibull, beta, normal). The probability distribution functions are utilized to evaluate the capacity factors of PV panels or arrays. An example is presented revealing the applicability of the procedure.
Probability techniques for reliability analysis of composite materials

NASA Technical Reports Server (NTRS)

Wetherhold, Robert C.; Ucci, Anthony M.

1994-01-01

Traditional design approaches for composite materials have employed deterministic criteria for failure analysis. New approaches are required to predict the reliability of composite structures since strengths and stresses may be random variables. This report will examine and compare methods used to evaluate the reliability of composite laminae. The two types of methods that will be evaluated are fast probability integration (FPI) methods and Monte Carlo methods. In these methods, reliability is formulated as the probability that an explicit function of random variables is less than a given constant. Using failure criteria developed for composite materials, a function of design variables can be generated which defines a 'failure surface' in probability space. A number of methods are available to evaluate the integration over the probability space bounded by this surface; this integration delivers the required reliability. The methods which will be evaluated are: the first order, second moment FPI methods; second order, second moment FPI methods; the simple Monte Carlo; and an advanced Monte Carlo technique which utilizes importance sampling. The methods are compared for accuracy, efficiency, and the conservatism of the reliability estimate. The methodology involved in determining the sensitivity of the reliability estimate to the design variables (strength distributions) and importance factors is also presented.
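To make the comparison of method families concrete, the sketch below applies three of them to a deliberately simple case that is not taken from the report: a linear margin g = R - S with independent normal strength R and stress S, where failure means g < 0. For this linear Gaussian case the first order, second moment result is exact, so it serves as the reference for the two sampling estimates. All function names and numerical values are illustrative assumptions.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fosm_failure_probability(mu_r, sd_r, mu_s, sd_s):
    """First order, second moment estimate for the margin g = R - S."""
    beta = (mu_r - mu_s) / math.sqrt(sd_r**2 + sd_s**2)  # reliability index
    return normal_cdf(-beta)


def simple_monte_carlo(mu_r, sd_r, mu_s, sd_s, n, rng):
    """Fraction of sampled (R, S) pairs that fall in the failure region g < 0."""
    failures = 0
    for _ in range(n):
        if rng.gauss(mu_r, sd_r) - rng.gauss(mu_s, sd_s) < 0.0:
            failures += 1
    return failures / n


def importance_sampling(mu_r, sd_r, mu_s, sd_s, n, rng):
    """Sample the margin from a density centered on the failure surface g = 0."""
    mu_g = mu_r - mu_s
    sd_g = math.sqrt(sd_r**2 + sd_s**2)
    total = 0.0
    for _ in range(n):
        g = rng.gauss(0.0, sd_g)  # proposal density: N(0, sd_g)
        if g < 0.0:
            # weight = target density N(mu_g, sd_g) / proposal density N(0, sd_g)
            total += math.exp((g * g - (g - mu_g) ** 2) / (2.0 * sd_g**2))
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    args = (500.0, 40.0, 350.0, 30.0)  # hypothetical strength and stress statistics
    print("FOSM:", fosm_failure_probability(*args))
    print("simple MC:", simple_monte_carlo(*args, 200_000, rng))
    print("importance sampling:", importance_sampling(*args, 20_000, rng))
```

Because roughly half of the importance-sampling draws land in the failure region, that estimator typically needs far fewer samples than simple Monte Carlo for the same relative error when the failure probability is small.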
Atom counting in HAADF STEM using a statistical model-based approach: methodology, possibilities, and inherent limitations.

PubMed

De Backer, A; Martinez, G T; Rosenauer, A; Van Aert, S

2013-11-01

In the present paper, a statistical model-based method to count the number of atoms of monotype crystalline nanostructures from high resolution high-angle annular dark-field (HAADF) scanning transmission electron microscopy (STEM) images is discussed in detail together with a thorough study on the possibilities and inherent limitations. In order to count the number of atoms, it is assumed that the total scattered intensity scales with the number of atoms per atom column. These intensities are quantitatively determined using model-based statistical parameter estimation theory. The distribution describing the probability that intensity values are generated by atomic columns containing a specific number of atoms is inferred on the basis of the experimental scattered intensities. Finally, the number of atoms per atom column is quantified using this estimated probability distribution. The number of atom columns available in the observed STEM image, the number of components in the estimated probability distribution, the width of the components of the probability distribution, and the typical shape of a criterion to assess the number of components in the probability distribution directly affect the accuracy and precision with which the number of atoms in a particular atom column can be estimated. It is shown that single atom sensitivity is feasible taking the latter aspects into consideration. © 2013 Elsevier B.V. All rights reserved.
Moving on From Representativeness: Testing the Utility of the Global Drug Survey

PubMed Central

Barratt, Monica J; Ferris, Jason A; Zahnow, Renee; Palamar, Joseph J; Maier, Larissa J; Winstock, Adam R

2017-01-01

A decline in response rates in traditional household surveys, combined with increased internet coverage and decreased research budgets, has resulted in increased attractiveness of web survey research designs based on purposive and voluntary opt-in sampling strategies. In the study of hidden or stigmatised behaviours, such as cannabis use, web survey methods are increasingly common. However, opt-in web surveys are often heavily criticised due to their lack of sampling frame and unknown representativeness. In this article, we outline the current state of the debate about the relevance of pursuing representativeness, the state of probability sampling methods, and the utility of non-probability, web survey methods especially for accessing hidden or minority populations. Our article has two aims: (1) to present a comprehensive description of the methodology we use at Global Drug Survey (GDS), an annual cross-sectional web survey and (2) to compare the age and sex distributions of cannabis users who voluntarily completed (a) a household survey or (b) a large web-based purposive survey (GDS), across three countries: Australia, the United States, and Switzerland. We find that within each set of country comparisons, the demographic distributions among recent cannabis users are broadly similar, demonstrating that the age and sex distributions of those who volunteer to be surveyed are not vastly different between these non-probability and probability methods. We conclude that opt-in web surveys of hard-to-reach populations are an efficient way of gaining in-depth understanding of stigmatised behaviours and are appropriate, as long as they are not used to estimate drug use prevalence of the general population. PMID:28924351
New S control chart using skewness correction method for monitoring process dispersion of skewed distributions

NASA Astrophysics Data System (ADS)

Atta, Abdu; Yahaya, Sharipah; Zain, Zakiyah; Ahmed, Zalikha

2017-11-01

The control chart is established as one of the most powerful tools in Statistical Process Control (SPC) and is widely used in industry. Conventional control charts rely on a normality assumption, which does not always hold for industrial data. This paper proposes a new S control chart for monitoring process dispersion of skewed distributions using a skewness correction method, named the SC-S control chart. Its performance in terms of false alarm rate is compared with various existing control charts for monitoring process dispersion: the scaled weighted variance S chart (SWV-S), skewness correction R chart (SC-R), weighted variance R chart (WV-R), weighted variance S chart (WV-S), and standard S chart (STD-S). A comparison with the exact S control chart with regard to the probability of out-of-control detection is also made. The Weibull and gamma distributions adopted in this study are assessed along with the normal distribution. A simulation study shows that the proposed SC-S control chart performs well in terms of in-control probabilities (Type I error) at almost all skewness levels and sample sizes, n. In terms of the probability of detecting a shift, the proposed SC-S chart is closer to the exact S control chart than the existing charts for skewed distributions, except for the SC-R control chart. In general, the proposed SC-S control chart performs better than all the existing control charts for monitoring process dispersion in terms of Type I error and the probability of detecting a shift.
Bivariate sub-Gaussian model for stock index returns

NASA Astrophysics Data System (ADS)

Jabłońska-Sabuka, Matylda; Teuerle, Marek; Wyłomańska, Agnieszka

2017-11-01

Financial time series are commonly modeled with methods that assume normally distributed data. However, the real distribution can be nontrivial and may lack an explicitly formulated probability density function. In this work we introduce novel parameter estimation and high-powered distribution testing methods which do not rely on closed form densities, but use the characteristic functions for comparison. The approach applied to a pair of stock index returns demonstrates that such a bivariate vector can be a sample coming from a bivariate sub-Gaussian distribution. The methods presented here can be applied to any nontrivially distributed financial data, among others.
Sampling probability distributions of lesions in mammograms

NASA Astrophysics Data System (ADS)

Looney, P.; Warren, L. M.; Dance, D. R.; Young, K. C.

2015-03-01

One approach to image perception studies in mammography using virtual clinical trials involves the insertion of simulated lesions into normal mammograms. To facilitate this, a method has been developed that allows for sampling of lesion positions across the cranio-caudal and medio-lateral radiographic projections in accordance with measured distributions of real lesion locations. 6825 mammograms from our mammography image database were segmented to find the breast outline. The outlines were averaged and smoothed to produce an average outline for each laterality and radiographic projection. Lesions in 3304 mammograms with malignant findings were mapped on to a standardised breast image corresponding to the average breast outline using piecewise affine transforms. A four-dimensional probability distribution function was found from the lesion locations in the cranio-caudal and medio-lateral radiographic projections for calcification and noncalcification lesions. Lesion locations sampled from this probability distribution function were mapped on to individual mammograms using a piecewise affine transform which transforms the average outline to the outline of the breast in the mammogram. The four-dimensional probability distribution function was validated by comparing it to the two-dimensional distributions found by considering each radiographic projection and laterality independently. The correlation of the locations of the lesions sampled from the four-dimensional probability distribution function across radiographic projections was shown to match the correlation of the original mapped lesion locations. The current system has been implemented as a web service on a server using the Python Django framework. The server performs the sampling, performs the mapping and returns the results in JavaScript Object Notation (JSON) format.
Probabilistic sensitivity analysis for decision trees with multiple branches: use of the Dirichlet distribution in a Bayesian framework.

PubMed

Briggs, Andrew H; Ades, A E; Price, Martin J

2003-01-01

In structuring decision models of medical interventions, it is commonly recommended that only 2 branches be used for each chance node to avoid logical inconsistencies that can arise during sensitivity analyses if the branching probabilities do not sum to 1. However, information may be naturally available in an unconditional form, and structuring a tree in conditional form may complicate rather than simplify the sensitivity analysis of the unconditional probabilities. Current guidance emphasizes using probabilistic sensitivity analysis, and a method is required to provide probabilistic probabilities over multiple branches that appropriately represents uncertainty while satisfying the requirement that mutually exclusive event probabilities should sum to 1. The authors argue that the Dirichlet distribution, the multivariate equivalent of the beta distribution, is appropriate for this purpose and illustrate its use for generating a fully probabilistic transition matrix for a Markov model. Furthermore, they demonstrate that by adopting a Bayesian approach, the problem of observing zero counts for transitions of interest can be overcome.
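A short sketch of the idea follows, under stated assumptions: each row of a Markov transition matrix is drawn from a Dirichlet posterior formed by adding a prior pseudo-count to the observed transition counts. The counts, the prior value of 1, and the function names are hypothetical and are not taken from the article. The Dirichlet draw uses the standard construction of normalizing independent gamma variates.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) via normalized gammas."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_transition_matrix(counts, prior=1.0, rng=None):
    """Sample each row from its posterior, Dirichlet(prior + observed counts)."""
    rng = rng or random.Random()
    return [sample_dirichlet([prior + c for c in row], rng) for row in counts]


if __name__ == "__main__":
    # Hypothetical observed transitions between three states; note the zero counts.
    counts = [
        [80, 15, 5],
        [0, 60, 40],
        [10, 0, 90],
    ]
    for row in sample_transition_matrix(counts, prior=1.0, rng=random.Random(42)):
        print([round(p, 3) for p in row], "sum =", round(sum(row), 6))
```

Every sampled row sums to 1 by construction, and the prior pseudo-count keeps transitions with zero observed counts at a small positive probability instead of fixing them at zero. One such matrix would be drawn per iteration of a probabilistic sensitivity analysis.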
The probability distribution model of air pollution index and its dominants in Kuala Lumpur

NASA Astrophysics Data System (ADS)

AL-Dhurafi, Nasr Ahmed; Razali, Ahmad Mahir; Masseran, Nurulkamal; Zamzuri, Zamira Hasanah

2016-11-01

This paper focuses on statistical modeling of the distributions of the air pollution index (API) and its sub-index data observed at Kuala Lumpur in Malaysia. Five pollutants or sub-indexes are measured, including carbon monoxide (CO), sulphur dioxide (SO2), nitrogen dioxide (NO2), and particulate matter (PM10). Four probability distributions are considered, namely log-normal, exponential, Gamma and Weibull, in search of the best-fitting distribution for the Malaysian air pollutant data. In order to determine the best distribution for describing the air pollutant data, five goodness-of-fit criteria are applied. This will help in minimizing the uncertainty in pollution resource estimates and improving the assessment phase of planning. Conflicts among the criteria in selecting the best distribution were resolved using the weight of ranks method. We found that the Gamma distribution is the best distribution for the majority of air pollutant data in Kuala Lumpur.
Probability distribution of extreme share returns in Malaysia

NASA Astrophysics Data System (ADS)

Zin, Wan Zawiah Wan; Safari, Muhammad Aslam Mohd; Jaaman, Saiful Hafizah; Yie, Wendy Ling Shin

2014-09-01

The objective of this study is to identify a suitable probability distribution for modelling extreme share returns in Malaysia. To achieve this, weekly and monthly maximum daily share returns are derived from share price data obtained from Bursa Malaysia over the period 2000 to 2012. The study starts with summary statistics of the data, which provide a clue to the likely candidates for the best-fitting distribution. Next, the suitability of six extreme value distributions, namely the Gumbel, Generalized Extreme Value (GEV), Generalized Logistic (GLO), Generalized Pareto (GPA), Lognormal (GNO) and Pearson (PE3) distributions, is evaluated. The method of L-moments is used for parameter estimation. Based on several goodness-of-fit tests and the L-moment diagram test, the Generalized Pareto distribution and the Pearson distribution are found to be the best-fitting distributions for the weekly and monthly maximum share returns, respectively, in the Malaysian stock market during the studied period.
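As an illustration of L-moment parameter estimation, the sketch below fits the simplest of the six candidates, the Gumbel distribution, using the first two sample L-moments computed from unbiased probability-weighted moments. It relies on the standard Gumbel relations λ1 = ξ + γα and λ2 = α ln 2, where γ is Euler's constant. The synthetic data and function names are assumptions; no Bursa Malaysia data is used.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def sample_l_moments(data):
    """First two sample L-moments from unbiased probability-weighted moments."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    # b1 = (1/n) * sum over ranks i = 1..n of ((i - 1) / (n - 1)) * x_(i)
    b1 = sum(i * xi for i, xi in enumerate(x)) / (n * (n - 1))
    return b0, 2.0 * b1 - b0  # (l1, l2)


def fit_gumbel_lmom(data):
    """Gumbel location and scale from l1 = loc + gamma * scale, l2 = scale * ln 2."""
    l1, l2 = sample_l_moments(data)
    scale = l2 / math.log(2.0)
    location = l1 - EULER_GAMMA * scale
    return location, scale


if __name__ == "__main__":
    # Synthetic "maximum returns" drawn from a Gumbel law by inverse transform.
    rng = random.Random(1)
    true_loc, true_scale = 0.02, 0.01
    data = [true_loc - true_scale * math.log(-math.log(rng.random()))
            for _ in range(20_000)]
    print(fit_gumbel_lmom(data))
```

The three-parameter candidates (GEV, GLO, GPA, GNO, PE3) follow the same pattern but also require the third L-moment, which is omitted here for brevity.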
Generation of pseudo-random numbers

NASA Technical Reports Server (NTRS)

Howell, L. W.; Rheinfurth, M. H.

1982-01-01

Practical methods for generating acceptable random numbers from a variety of probability distributions which are frequently encountered in engineering applications are described. The speed, accuracy, and guarantee of statistical randomness of the various methods are discussed.
A discrimination method for the detection of pneumonia using chest radiograph.

PubMed

Noor, Norliza Mohd; Rijal, Omar Mohd; Yunus, Ashari; Abu-Bakar, S A R

2010-03-01

This paper presents a statistical method for the detection of lobar pneumonia using digitized chest X-ray films. Each region of interest was represented by a vector of wavelet texture measures, which is then multiplied by the orthogonal matrix Q(2). The first two elements of the transformed vectors were shown to have a bivariate normal distribution. Misclassification probabilities were estimated using probability ellipsoids and discriminant functions. The results of this study support detecting pneumonia by constructing probability ellipsoids or discriminant functions using the maximum energy and maximum column sum energy texture measures, where misclassification probabilities were less than 0.15. © 2009 Elsevier Ltd. All rights reserved.
Probability Forecasting Using Monte Carlo Simulation

NASA Astrophysics Data System (ADS)

Duncan, M.; Frisbee, J.; Wysack, J.

2014-09-01

Space Situational Awareness (SSA) is defined as the knowledge and characterization of all aspects of space. SSA is now a fundamental and critical component of space operations. Increased dependence on our space assets has in turn led to a greater need for accurate, near real-time knowledge of all space activities. With the growth of the orbital debris population, satellite operators are performing collision avoidance maneuvers more frequently. Frequent maneuver execution expends fuel and reduces the operational lifetime of the spacecraft. Thus new, more sophisticated collision threat characterization methods must be implemented. The collision probability metric is used operationally to quantify the collision risk. The collision probability is typically calculated days into the future, so that high risk and potential high risk conjunction events are identified early enough to develop an appropriate course of action. As the time horizon to the conjunction event is reduced, the collision probability changes. A significant change in the collision probability will change the satellite mission stakeholder's course of action. So constructing a method for estimating how the collision probability will evolve improves operations by providing satellite operators with a new piece of information, namely an estimate or 'forecast' of how the risk will change as time to the event is reduced. Collision probability forecasting is a predictive process where the future risk of a conjunction event is estimated. The method utilizes a Monte Carlo simulation that produces a likelihood distribution for a given collision threshold. Using known state and state uncertainty information, the simulation generates a set of possible trajectories for a given space object pair. Each new trajectory produces a unique event geometry at the time of close approach. Given state uncertainty information for both objects, a collision probability value can be computed for every trial. This yields a collision probability distribution given known, predicted uncertainty. This paper presents the details of the collision probability forecasting method. We examine various conjunction event scenarios and numerically demonstrate the utility of this approach in typical event scenarios. We explore the utility of a probability-based track scenario simulation that models expected tracking data frequency as the tasking levels are increased. The resulting orbital uncertainty is subsequently used in the forecasting algorithm.
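The forecasting method itself is not specified in enough detail here to reproduce, but the per-trial quantity it builds on, a collision probability for one event geometry, can be sketched with a simple Monte Carlo estimate. The sketch assumes a two-dimensional encounter plane, independent Gaussian position uncertainty along each axis, and a circular combined hard-body region; these simplifications, the function name, and the numbers are all illustrative assumptions rather than the authors' model.

```python
import math
import random


def collision_probability_mc(miss, sigma, hard_body_radius, n, rng):
    """Fraction of sampled relative positions that fall inside the hard-body circle.

    miss  -- nominal relative position (x, y) in the encounter plane
    sigma -- combined 1-sigma position uncertainty along each axis
    """
    hits = 0
    r2 = hard_body_radius**2
    for _ in range(n):
        x = rng.gauss(miss[0], sigma[0])
        y = rng.gauss(miss[1], sigma[1])
        if x * x + y * y < r2:
            hits += 1
    return hits / n


if __name__ == "__main__":
    rng = random.Random(3)
    # Hypothetical geometry in meters: 200 m nominal miss, 100 m uncertainty, 20 m radius.
    print(collision_probability_mc((200.0, 0.0), (100.0, 100.0), 20.0, 200_000, rng))
```

For a zero nominal miss distance and equal uncertainty σ on both axes, the estimate can be checked against the closed form 1 - exp(-R²/(2σ²)). Repeating the calculation over many sampled trajectories, each with its own miss vector and uncertainty, would give a distribution of collision probabilities of the kind the abstract describes.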
Audio feature extraction using probability distribution function

NASA Astrophysics Data System (ADS)

Suhaib, A.; Wan, Khairunizam; Aziz, Azri A.; Hazry, D.; Razlan, Zuradzman M.; Shahriman A., B.

2015-05-01

Voice recognition has been one of the popular applications in the robotics field. It has also recently been used in biometric and multimedia information retrieval systems. This technology has grown out of successive research on audio feature extraction analysis. The Probability Distribution Function (PDF) is a statistical method which is usually used as one of the processes in complex feature extraction methods such as GMM and PCA. In this paper, a new method for audio feature extraction is proposed which uses only the PDF as the feature extraction method itself for speech analysis. Certain pre-processing techniques are performed prior to the proposed feature extraction method. Subsequently, the PDF values for each frame of sampled voice signals obtained from a number of individuals are plotted. From the experimental results, it can be seen visually from the plotted data that each individual's voice has comparable PDF values and shapes.
Copula Models for Sociology: Measures of Dependence and Probabilities for Joint Distributions

ERIC Educational Resources Information Center

Vuolo, Mike

2017-01-01

Often in sociology, researchers are confronted with nonnormal variables whose joint distribution they wish to explore. Yet, assumptions of common measures of dependence can fail or estimating such dependence is computationally intensive. This article presents the copula method for modeling the joint distribution of two random variables, including…
Partial knowledge, entropy, and estimation

PubMed Central

MacQueen, James; Marschak, Jacob

1975-01-01

In a growing body of literature, available partial knowledge is used to estimate the prior probability distribution p≡(p1,...,pn) by maximizing entropy H(p)≡-Σpi log pi, subject to constraints on p which express that partial knowledge. The method has been applied to distributions of income, of traffic, of stock-price changes, and of types of brand-article purchases. We shall respond to two justifications given for the method: (α) It is “conservative,” and therefore good, to maximize “uncertainty,” as (uniquely) represented by the entropy parameter. (β) One should apply the mathematics of statistical thermodynamics, which implies that the most probable distribution has highest entropy. Reason (α) is rejected. Reason (β) is valid when “complete ignorance” is defined in a particular way and both the constraint and the estimator's loss function are of certain kinds. PMID:16578733
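The estimation procedure under discussion can be illustrated with the simplest kind of partial knowledge, a known mean. Maximizing H(p) subject to Σpi = 1 and Σpi xi = m gives pi proportional to exp(λ xi), and the sketch below finds λ by bisection. The die-face values, the target mean of 4.5, and the function names are illustrative assumptions, not examples from the paper.

```python
import math


def maxent_distribution(values, target_mean, tol=1e-12):
    """Maximum-entropy p over `values` subject to sum(p) = 1 and E[x] = target_mean.

    The solution has the form p_i proportional to exp(lam * x_i). The mean
    increases monotonically with lam, so lam is found by bisection.
    target_mean must lie strictly between min(values) and max(values).
    """
    def solve(lam):
        shift = max(lam * v for v in values)  # subtract the max exponent to avoid overflow
        w = [math.exp(lam * v - shift) for v in values]
        z = sum(w)
        p = [wi / z for wi in w]
        return sum(v * pi for v, pi in zip(values, p)), p

    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if solve(mid)[0] < target_mean:
            lo = mid
        else:
            hi = mid
    return solve(0.5 * (lo + hi))[1]


def entropy(p):
    """H(p) = -sum p_i log p_i, with 0 log 0 taken as 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


if __name__ == "__main__":
    faces = [1, 2, 3, 4, 5, 6]
    p = maxent_distribution(faces, target_mean=4.5)
    print([round(pi, 4) for pi in p], "H =", round(entropy(p), 4))
```

When the target mean equals the unconstrained mean of 3.5, λ is zero and the uniform distribution is recovered, which is the largest entropy attainable on six outcomes.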
Continuous-Time Classical and Quantum Random Walk on Direct Product of Cayley Graphs

NASA Astrophysics Data System (ADS)

Salimi, S.; Jafarizadeh, M. A.

2009-06-01

In this paper we define the direct product of graphs and give a recipe for obtaining the probability of observing the particle on the vertices in continuous-time classical and quantum random walks. In the recipe, the probability of observing the particle on a direct product of graphs is obtained by multiplying the probabilities on the corresponding sub-graphs; this method is useful for determining the probability of a walk on complicated graphs. Using this method, we calculate the probability of continuous-time classical and quantum random walks on many finite direct product Cayley graphs (complete cycle, complete Kn, charter and n-cube). We also find that in the classical case the stationary uniform distribution is reached as t → ∞, whereas in the quantum case this is not always so.
Review of probabilistic analysis of dynamic response of systems with random parameters

NASA Technical Reports Server (NTRS)

Kozin, F.; Klosner, J. M.

1989-01-01

The various methods that have been studied in the past to allow probabilistic analysis of dynamic response for systems with random parameters are reviewed. Dynamic response may have been obtained deterministically if the variations about the nominal values were small; however, for space structures which require precise pointing, the variations about the nominal values of the structural details and of the environmental conditions are too large to be considered as negligible. These uncertainties are accounted for in terms of probability distributions about their nominal values. The quantities of concern for describing the response of the structure include displacements, velocities, and the distributions of natural frequencies. The exact statistical characterization of the response would yield joint probability distributions for the response variables. Since the random quantities will appear as coefficients, determining the exact distributions will be difficult at best. Thus, certain approximations will have to be made. A number of techniques that are available are discussed, even in the nonlinear case. The methods described are: (1) Liouville's equation; (2) perturbation methods; (3) mean square approximate systems; and (4) nonlinear systems with approximation by linear systems.
Exploiting vibrational resonance in weak-signal detection

NASA Astrophysics Data System (ADS)

Ren, Yuhao; Pan, Yan; Duan, Fabing; Chapeau-Blondeau, François; Abbott, Derek

2017-08-01

In this paper, we investigate the first exploitation of the vibrational resonance (VR) effect to detect weak signals in the presence of strong background noise. By injecting a series of sinusoidal interference signals of the same amplitude but with different frequencies into a generalized correlation detector, we show that the detection probability can be maximized at an appropriate interference amplitude. Based on a dual-Dirac probability density model, we compare the VR method with the stochastic resonance approach via adding dichotomous noise. The compared results indicate that the VR method can achieve a higher detection probability for a wider variety of noise distributions.
Exploiting vibrational resonance in weak-signal detection.

PubMed

Ren, Yuhao; Pan, Yan; Duan, Fabing; Chapeau-Blondeau, François; Abbott, Derek

2017-08-01

In this paper, we investigate the first exploitation of the vibrational resonance (VR) effect to detect weak signals in the presence of strong background noise. By injecting a series of sinusoidal interference signals of the same amplitude but with different frequencies into a generalized correlation detector, we show that the detection probability can be maximized at an appropriate interference amplitude. Based on a dual-Dirac probability density model, we compare the VR method with the stochastic resonance approach via adding dichotomous noise. The compared results indicate that the VR method can achieve a higher detection probability for a wider variety of noise distributions.
A Method for Evaluating Tuning Functions of Single Neurons based on Mutual Information Maximization

NASA Astrophysics Data System (ADS)

Brostek, Lukas; Eggert, Thomas; Ono, Seiji; Mustari, Michael J.; Büttner, Ulrich; Glasauer, Stefan

2011-03-01

We introduce a novel approach for evaluation of neuronal tuning functions, which can be expressed by the conditional probability of observing a spike given any combination of independent variables. This probability can be estimated out of experimentally available data. By maximizing the mutual information between the probability distribution of the spike occurrence and that of the variables, the dependence of the spike on the input variables is maximized as well. We used this method to analyze the dependence of neuronal activity in cortical area MSTd on signals related to movement of the eye and retinal image movement.
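The quantity being maximized in this abstract can be illustrated with a minimal sketch that computes the mutual information between a binary spike indicator and one discretized input variable from a table of joint counts. The counts, the binning, and the function name are hypothetical; the paper's estimator and its maximization step are not reproduced here.

```python
import math

def mutual_information(joint_counts):
    """Mutual information in bits from a table of joint counts.

    joint_counts[i][j] is the number of observations falling in
    input-variable bin i with spike state j (0 = no spike, 1 = spike).
    """
    total = sum(sum(row) for row in joint_counts)
    p_row = [sum(row) / total for row in joint_counts]
    n_cols = len(joint_counts[0])
    p_col = [sum(row[j] for row in joint_counts) / total for j in range(n_cols)]
    mi = 0.0
    for i, row in enumerate(joint_counts):
        for j, count in enumerate(row):
            if count > 0:
                p = count / total
                mi += p * math.log2(p / (p_row[i] * p_col[j]))
    return mi

# Hypothetical data: rows are bins of an input variable (for example eye
# velocity), columns are (no spike, spike).
tuned = [[90, 10], [50, 50], [10, 90]]
untuned = [[50, 50], [50, 50], [50, 50]]
mi_tuned = mutual_information(tuned)
mi_untuned = mutual_information(untuned)
```

A neuron whose spike probability does not depend on the variable gives zero mutual information, while a strongly tuned one gives a value approaching the entropy of the spike indicator.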
Modeling utilization distributions in space and time

USGS Publications Warehouse

Keating, K.A.; Cherry, S.

2009-01-01

W. Van Winkle defined the utilization distribution (UD) as a probability density that gives an animal's relative frequency of occurrence in a two-dimensional (x, y) plane. We extend Van Winkle's work by redefining the UD as the relative frequency distribution of an animal's occurrence in all four dimensions of space and time. We then describe a product kernel model estimation method, devising a novel kernel from the wrapped Cauchy distribution to handle circularly distributed temporal covariates, such as day of year. Using Monte Carlo simulations of animal movements in space and time, we assess estimator performance. Although not unbiased, the product kernel method yields models highly correlated (Pearson's r = 0.975) with true probabilities of occurrence and successfully captures temporal variations in density of occurrence. In an empirical example, we estimate the expected UD in three dimensions (x, y, and t) for animals belonging to each of two distinct bighorn sheep (Ovis canadensis) social groups in Glacier National Park, Montana, USA. Results show the method can yield ecologically informative models that successfully depict temporal variations in density of occurrence for a seasonally migratory species. Some implications of this new approach to UD modeling are discussed. © 2009 by the Ecological Society of America.
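A rough sketch of a product kernel estimate in (x, y, t) follows. The wrapped Cauchy density is the standard one, (1 − ρ²) / (2π(1 + ρ² − 2ρ cos(θ − μ))). The Gaussian spatial kernels, the bandwidth h, the concentration ρ, and the sample relocations are assumptions made for illustration and are not taken from the paper.

```python
import math

def wrapped_cauchy(theta, mu, rho):
    """Wrapped Cauchy density on the circle, with 0 <= rho < 1."""
    return (1 - rho**2) / (
        2 * math.pi * (1 + rho**2 - 2 * rho * math.cos(theta - mu))
    )

def gaussian(u, h):
    """One-dimensional Gaussian kernel with bandwidth h (assumed here)."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2 * math.pi))

def day_to_angle(day, period=365.0):
    """Map day of year onto the circle so day 365 is adjacent to day 1."""
    return 2 * math.pi * day / period

def product_kernel_density(x, y, day, points, h=1.0, rho=0.8):
    """Density estimate at (x, y, day) from observations (x_i, y_i, day_i)."""
    theta = day_to_angle(day)
    total = 0.0
    for xi, yi, di in points:
        total += (
            gaussian(x - xi, h)
            * gaussian(y - yi, h)
            * wrapped_cauchy(theta, day_to_angle(di), rho)
        )
    return total / len(points)

# Hypothetical relocations: (x, y, day of year)
obs = [(0.0, 0.0, 360), (0.5, -0.2, 3), (0.1, 0.3, 180)]
near_new_year = product_kernel_density(0.0, 0.0, 1, obs)
spring = product_kernel_density(0.0, 0.0, 90, obs)
```

Because the temporal kernel is circular, observations from late December and early January both contribute strongly to an estimate for day 1, which a linear kernel on day of year would miss.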
A Probability-Based Statistical Method to Extract Water Body of TM Images with Missing Information

NASA Astrophysics Data System (ADS)

Lian, Shizhong; Chen, Jiangping; Luo, Minghai

2016-06-01

Water information cannot be accurately extracted from TM images when true information is lost because of blocking clouds and missing data stripes. Because water is continuously distributed under natural conditions, this paper proposes a new method of water body extraction based on probability statistics to improve the accuracy of water information extraction from TM images with missing information. Different disturbances from clouds and missing data stripes are simulated. Water information is extracted from the simulated images using global histogram matching, local histogram matching, and the probability-based statistical method. Experiments show that this method achieves smaller Areal Error and higher Boundary Recall than the conventional methods.

Automatic correction of intensity nonuniformity from sparseness of gradient distribution in medical images.

PubMed

Zheng, Yuanjie; Grossman, Murray; Awate, Suyash P; Gee, James C

2009-01-01

We propose to use the sparseness property of the gradient probability distribution to estimate the intensity nonuniformity in medical images, resulting in two novel automatic methods: a non-parametric method and a parametric method. Our methods are easy to implement because they both solve an iteratively re-weighted least squares problem. They are remarkably accurate as shown by our experiments on images of different imaged objects and from different imaging modalities.
Automatic Correction of Intensity Nonuniformity from Sparseness of Gradient Distribution in Medical Images

PubMed Central

Zheng, Yuanjie; Grossman, Murray; Awate, Suyash P.; Gee, James C.

2013-01-01

We propose to use the sparseness property of the gradient probability distribution to estimate the intensity nonuniformity in medical images, resulting in two novel automatic methods: a non-parametric method and a parametric method. Our methods are easy to implement because they both solve an iteratively re-weighted least squares problem. They are remarkably accurate as shown by our experiments on images of different imaged objects and from different imaging modalities. PMID:20426191
TU-AB-BRB-02: Stochastic Programming Methods for Handling Uncertainty and Motion in IMRT Planning

DOE Office of Scientific and Technical Information (OSTI.GOV)

Unkelbach, J.

The accepted clinical method to accommodate targeting uncertainties inherent in fractionated external beam radiation therapy is to utilize GTV-to-CTV and CTV-to-PTV margins during the planning process to design a PTV-conformal static dose distribution on the planning image set. Ideally, margins are selected to ensure a high (e.g. >95%) target coverage probability (CP) in spite of inherent inter- and intra-fractional positional variations, tissue motions, and initial contouring uncertainties. Robust optimization techniques, also known as probabilistic treatment planning techniques, explicitly incorporate the dosimetric consequences of targeting uncertainties by including CP evaluation into the planning optimization process along with coverage-based planning objectives. The treatment planner no longer needs to use PTV and/or PRV margins; instead robust optimization utilizes probability distributions of the underlying uncertainties in conjunction with CP-evaluation for the underlying CTVs and OARs to design an optimal treated volume. This symposium will describe CP-evaluation methods as well as various robust planning techniques including use of probability-weighted dose distributions, probability-weighted objective functions, and coverage optimized planning. Methods to compute and display the effect of uncertainties on dose distributions will be presented. The use of robust planning to accommodate inter-fractional setup uncertainties, organ deformation, and contouring uncertainties will be examined as will its use to accommodate intra-fractional organ motion. Clinical examples will be used to inter-compare robust and margin-based planning, highlighting advantages of robust-plans in terms of target and normal tissue coverage. Robust-planning limitations as uncertainties approach zero and as the number of treatment fractions becomes small will be presented, as well as the factors limiting clinical implementation of robust planning.
Learning Objectives: To understand robust-planning as a clinical alternative to using margin-based planning. To understand conceptual differences between uncertainty and predictable motion. To understand fundamental limitations of the PTV concept that probabilistic planning can overcome. To understand the major contributing factors to target and normal tissue coverage probability. To understand the similarities and differences of various robust planning techniques. To understand the benefits and limitations of robust planning techniques.
TU-AB-BRB-00: New Methods to Ensure Target Coverage

DOE Office of Scientific and Technical Information (OSTI.GOV)

NONE

2015-06-15

The accepted clinical method to accommodate targeting uncertainties inherent in fractionated external beam radiation therapy is to utilize GTV-to-CTV and CTV-to-PTV margins during the planning process to design a PTV-conformal static dose distribution on the planning image set. Ideally, margins are selected to ensure a high (e.g. >95%) target coverage probability (CP) in spite of inherent inter- and intra-fractional positional variations, tissue motions, and initial contouring uncertainties. Robust optimization techniques, also known as probabilistic treatment planning techniques, explicitly incorporate the dosimetric consequences of targeting uncertainties by including CP evaluation into the planning optimization process along with coverage-based planning objectives. The treatment planner no longer needs to use PTV and/or PRV margins; instead robust optimization utilizes probability distributions of the underlying uncertainties in conjunction with CP-evaluation for the underlying CTVs and OARs to design an optimal treated volume. This symposium will describe CP-evaluation methods as well as various robust planning techniques including use of probability-weighted dose distributions, probability-weighted objective functions, and coverage optimized planning. Methods to compute and display the effect of uncertainties on dose distributions will be presented. The use of robust planning to accommodate inter-fractional setup uncertainties, organ deformation, and contouring uncertainties will be examined as will its use to accommodate intra-fractional organ motion. Clinical examples will be used to inter-compare robust and margin-based planning, highlighting advantages of robust-plans in terms of target and normal tissue coverage. Robust-planning limitations as uncertainties approach zero and as the number of treatment fractions becomes small will be presented, as well as the factors limiting clinical implementation of robust planning.
Learning Objectives: To understand robust-planning as a clinical alternative to using margin-based planning. To understand conceptual differences between uncertainty and predictable motion. To understand fundamental limitations of the PTV concept that probabilistic planning can overcome. To understand the major contributing factors to target and normal tissue coverage probability. To understand the similarities and differences of various robust planning techniques. To understand the benefits and limitations of robust planning techniques.
Fingerprint Recognition with Identical Twin Fingerprints

PubMed Central

Yang, Xin; Tian, Jie

2012-01-01

Fingerprint recognition with identical twins is a challenging task due to the closest genetics-based relationship existing in the identical twins. Several pioneers have analyzed the similarity between twins' fingerprints. In this work we continue to investigate the topic of the similarity of identical twin fingerprints. Our study was tested based on a large identical twin fingerprint database that contains 83 twin pairs, 4 fingers per individual and six impressions per finger: 3984 (83*2*4*6) images. Compared to the previous work, our contributions are summarized as follows: (1) Two state-of-the-art fingerprint identification methods: P071 and VeriFinger 6.1 were used, rather than one fingerprint identification method in previous studies. (2) Six impressions per finger were captured, rather than just one impression, which makes the genuine distribution of matching scores more realistic. (3) A larger sample (83 pairs) was collected. (4) A novel statistical analysis, which aims at showing the probability distribution of the fingerprint types for the corresponding fingers of identical twins which have same fingerprint type, has been conducted. (5) A novel analysis, which aims at showing which finger from identical twins has higher probability of having same fingerprint type, has been conducted. Our results showed that: (a) A state-of-the-art automatic fingerprint verification system can distinguish identical twins without drastic degradation in performance. (b) The chance that the fingerprints have the same type from identical twins is 0.7440, comparing to 0.3215 from non-identical twins. (c) For the corresponding fingers of identical twins which have same fingerprint type, the probability distribution of five major fingerprint types is similar to the probability distribution for all the fingers' fingerprint type. (d) For each of four fingers of identical twins, the probability of having same fingerprint type is similar. PMID:22558204
Fingerprint recognition with identical twin fingerprints.

PubMed

Tao, Xunqiang; Chen, Xinjian; Yang, Xin; Tian, Jie

2012-01-01

Fingerprint recognition with identical twins is a challenging task due to the closest genetics-based relationship existing in the identical twins. Several pioneers have analyzed the similarity between twins' fingerprints. In this work we continue to investigate the topic of the similarity of identical twin fingerprints. Our study was tested based on a large identical twin fingerprint database that contains 83 twin pairs, 4 fingers per individual and six impressions per finger: 3984 (83*2*4*6) images. Compared to the previous work, our contributions are summarized as follows: (1) Two state-of-the-art fingerprint identification methods: P071 and VeriFinger 6.1 were used, rather than one fingerprint identification method in previous studies. (2) Six impressions per finger were captured, rather than just one impression, which makes the genuine distribution of matching scores more realistic. (3) A larger sample (83 pairs) was collected. (4) A novel statistical analysis, which aims at showing the probability distribution of the fingerprint types for the corresponding fingers of identical twins which have same fingerprint type, has been conducted. (5) A novel analysis, which aims at showing which finger from identical twins has higher probability of having same fingerprint type, has been conducted. Our results showed that: (a) A state-of-the-art automatic fingerprint verification system can distinguish identical twins without drastic degradation in performance. (b) The chance that the fingerprints have the same type from identical twins is 0.7440, comparing to 0.3215 from non-identical twins. (c) For the corresponding fingers of identical twins which have same fingerprint type, the probability distribution of five major fingerprint types is similar to the probability distribution for all the fingers' fingerprint type. (d) For each of four fingers of identical twins, the probability of having same fingerprint type is similar.
A chi-square goodness-of-fit test for non-identically distributed random variables: with application to empirical Bayes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Conover, W.J.; Cox, D.D.; Martz, H.F.

1997-12-01

When using parametric empirical Bayes estimation methods for estimating the binomial or Poisson parameter, the validity of the assumed beta or gamma conjugate prior distribution is an important diagnostic consideration. Chi-square goodness-of-fit tests of the beta or gamma prior hypothesis are developed for use when the binomial sample sizes or Poisson exposure times vary. Nine examples illustrate the application of the methods, using real data from such diverse applications as the loss of feedwater flow rates in nuclear power plants, the probability of failure to run on demand and the failure rates of the high pressure coolant injection systems at US commercial boiling water reactors, the probability of failure to run on demand of emergency diesel generators in US commercial nuclear power plants, the rate of failure of aircraft air conditioners, baseball batting averages, the probability of testing positive for toxoplasmosis, and the probability of tumors in rats. The tests are easily applied in practice by means of corresponding Mathematica® computer programs which are provided.
Bayesian alternative to the ISO-GUM's use of the Welch Satterthwaite formula

NASA Astrophysics Data System (ADS)

Kacker, Raghu N.

2006-02-01

In certain disciplines, uncertainty is traditionally expressed as an interval about an estimate for the value of the measurand. Development of such uncertainty intervals with a stated coverage probability based on the International Organization for Standardization (ISO) Guide to the Expression of Uncertainty in Measurement (GUM) requires a description of the probability distribution for the value of the measurand. The ISO-GUM propagates the estimates and their associated standard uncertainties for various input quantities through a linear approximation of the measurement equation to determine an estimate and its associated standard uncertainty for the value of the measurand. This procedure does not yield a probability distribution for the value of the measurand. The ISO-GUM suggests that under certain conditions motivated by the central limit theorem the distribution for the value of the measurand may be approximated by a scaled-and-shifted t-distribution with effective degrees of freedom obtained from the Welch-Satterthwaite (W-S) formula. The approximate t-distribution may then be used to develop an uncertainty interval with a stated coverage probability for the value of the measurand. We propose an approximate normal distribution based on a Bayesian uncertainty as an alternative to the t-distribution based on the W-S formula. A benefit of the approximate normal distribution based on a Bayesian uncertainty is that it greatly simplifies the expression of uncertainty by eliminating altogether the need for calculating effective degrees of freedom from the W-S formula. In the special case where the measurand is the difference between two means, each evaluated from statistical analyses of independent normally distributed measurements with unknown and possibly unequal variances, the probability distribution for the value of the measurand is known to be a Behrens-Fisher distribution. 
We compare the performance of the approximate normal distribution based on a Bayesian uncertainty and the approximate t-distribution based on the W-S formula with respect to the Behrens-Fisher distribution. The approximate normal distribution is simpler and better in this case. A thorough investigation of the relative performance of the two approximate distributions would require comparison for a range of measurement equations by numerical methods.
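For reference, the Welch-Satterthwaite formula that the abstract proposes to avoid gives the effective degrees of freedom as ν_eff = u_c⁴ / Σ(u_i⁴ / ν_i), with u_c² = Σ u_i². The sketch below evaluates it for the difference-of-two-means case mentioned above; the sample standard deviations and sample sizes are made-up values.

```python
def welch_satterthwaite(contributions, dofs):
    """Effective degrees of freedom from the Welch-Satterthwaite formula.

    contributions: uncertainty contributions u_i = |c_i| * u(x_i)
    dofs: degrees of freedom nu_i associated with each contribution
    """
    u_c_sq = sum(u**2 for u in contributions)          # combined variance
    denom = sum(u**4 / nu for u, nu in zip(contributions, dofs))
    return u_c_sq**2 / denom

# Difference of two means from independent samples (hypothetical numbers):
# each mean contributes u_i = s_i / sqrt(n_i) with n_i - 1 degrees of freedom.
s1, n1 = 2.0, 10
s2, n2 = 5.0, 15
u1 = s1 / n1**0.5
u2 = s2 / n2**0.5
nu_eff = welch_satterthwaite([u1, u2], [n1 - 1, n2 - 1])
```

The result always lies between the smaller of the individual degrees of freedom and their sum, and it is generally not an integer.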
Computing exact bundle compliance control charts via probability generating functions.

PubMed

Chen, Binchao; Matis, Timothy; Benneyan, James

2016-06-01

Compliance to evidence-based practices, individually and in 'bundles', remains an important focus of healthcare quality improvement for many clinical conditions. The exact probability distribution of composite bundle compliance measures used to develop corresponding control charts and other statistical tests is based on a fairly large convolution whose direct calculation can be computationally prohibitive. Various series expansions and other approximation approaches have been proposed, each with computational and accuracy tradeoffs, especially in the tails. This same probability distribution also arises in other important healthcare applications, such as for risk-adjusted outcomes and bed demand prediction, with the same computational difficulties. As an alternative, we use probability generating functions to rapidly obtain exact results and illustrate the improved accuracy and detection over other methods. Numerical testing across a wide range of applications demonstrates the computational efficiency and accuracy of this approach.
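The generating-function idea can be sketched for the simplest case: the number of compliant elements among independent elements with different compliance probabilities. Each element has generating function (1 − p_i) + p_i·z, and the coefficients of the product polynomial are the exact probabilities. This is an assumed simplification for illustration; the paper's composite measures may be more general, and the probabilities below are invented.

```python
def pgf_product(probabilities):
    """Exact pmf of a sum of independent Bernoulli(p_i) variables.

    Multiplies the generating functions (1 - p_i) + p_i * z one factor at
    a time; the coefficient of z**k in the product is P(sum == k).
    """
    coeffs = [1.0]
    for p in probabilities:
        nxt = [0.0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += c * (1 - p)      # element not compliant
            nxt[k + 1] += c * p        # element compliant
        coeffs = nxt
    return coeffs

# Hypothetical per-element compliance probabilities for one patient.
compliance = [0.95, 0.90, 0.85, 0.99]
pmf = pgf_product(compliance)
p_all = pmf[-1]  # probability that every bundle element is met
```

The cost grows with the square of the number of elements, and the tail probabilities are exact up to floating-point rounding rather than approximated.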
Load sharing in distributed real-time systems with state-change broadcasts

NASA Technical Reports Server (NTRS)

Shin, Kang G.; Chang, Yi-Chieh

1989-01-01

A decentralized dynamic load-sharing (LS) method based on state-change broadcasts is proposed for a distributed real-time system. Whenever the state of a node changes from underloaded to fully loaded and vice versa, the node broadcasts this change to a set of nodes, called a buddy set, in the system. The performance of the method is evaluated with both analytic modeling and simulation. It is modeled first by an embedded Markov chain for which numerical solutions are derived. The model solutions are then used to calculate the distribution of queue lengths at the nodes and the probability of meeting task deadlines. The analytical results show that buddy sets of 10 nodes outperform those of less than 10 nodes, and the incremental benefit gained from increasing the buddy set size beyond 15 nodes is insignificant. These and other analytical results are verified by simulation. The proposed LS method is shown to meet task deadlines with a very high probability.
Probability distribution of haplotype frequencies under the two-locus Wright-Fisher model by diffusion approximation.

PubMed

Boitard, Simon; Loisel, Patrice

2007-05-01

The probability distribution of haplotype frequencies in a population, and the way it is influenced by genetical forces such as recombination, selection, random drift ...is a question of fundamental interest in population genetics. For large populations, the distribution of haplotype frequencies for two linked loci under the classical Wright-Fisher model is almost impossible to compute because of numerical reasons. However the Wright-Fisher process can in such cases be approximated by a diffusion process and the transition density can then be deduced from the Kolmogorov equations. As no exact solution has been found for these equations, we developed a numerical method based on finite differences to solve them. It applies to transient states and models including selection or mutations. We show by several tests that this method is accurate for computing the conditional joint density of haplotype frequencies given that no haplotype has been lost. We also prove that it is far less time consuming than other methods such as Monte Carlo simulations.
A Comparative Study of Probability Collectives Based Multi-agent Systems and Genetic Algorithms

NASA Technical Reports Server (NTRS)

Huang, Chien-Feng; Wolpert, David H.; Bieniawski, Stefan; Strauss, Charles E. M.

2005-01-01

We compare Genetic Algorithms (GA's) with Probability Collectives (PC), a new framework for distributed optimization and control. In contrast to GA's, PC-based methods do not update populations of solutions. Instead they update an explicitly parameterized probability distribution p over the space of solutions. That updating of p arises as the optimization of a functional of p. The functional is chosen so that any p that optimizes it should be p peaked about good solutions. The PC approach works in both continuous and discrete problems. It does not suffer from the resolution limitation of the finite bit length encoding of parameters into GA alleles. It also has deep connections with both game theory and statistical physics. We review the PC approach using its motivation as the information theoretic formulation of bounded rationality for multi-agent systems. It is then compared with GA's on a diverse set of problems. To handle high dimensional surfaces, in the PC method investigated here p is restricted to a product distribution. Each distribution in that product is controlled by a separate agent. The test functions were selected for their difficulty using either traditional gradient descent or genetic algorithms. On those functions the PC-based approach significantly outperforms traditional GA's in both rate of descent, trapping in false minima, and long term optimization.
Influence of item distribution pattern and abundance on efficiency of benthic core sampling

USGS Publications Warehouse

Behney, Adam C.; O'Shaughnessy, Ryan; Eichholz, Michael W.; Stafford, Joshua D.

2014-01-01

Core sampling is a commonly used method to estimate benthic item density, but little information exists about factors influencing the accuracy and time-efficiency of this method. We simulated core sampling in a Geographic Information System framework by generating points (benthic items) and polygons (core samplers) to assess how sample size (number of core samples), core sampler size (cm²), distribution of benthic items, and item density affected the bias and precision of estimates of density, the detection probability of items, and the time-costs. When items were distributed randomly versus clumped, bias decreased and precision increased with increasing sample size and increased slightly with increasing core sampler size. Bias and precision were only affected by benthic item density at very low values (500–1,000 items/m²). Detection probability (the probability of capturing ≥ 1 item in a core sample if it is available for sampling) was substantially greater when items were distributed randomly as opposed to clumped. Taking more small diameter core samples was always more time-efficient than taking fewer large diameter samples. We are unable to present a single, optimal sample size, but provide information for researchers and managers to derive optimal sample sizes dependent on their research goals and environmental conditions.
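The contrast in detection probability between random and clumped items can be illustrated analytically under two textbook assumptions that are not taken from the paper: a Poisson count per core for randomly distributed items, and a negative binomial count with dispersion parameter k for clumped items. The density, core area, and k below are hypothetical.

```python
import math

def expected_items(density_per_m2, core_area_cm2):
    """Expected number of items in one core (1 m^2 = 10,000 cm^2)."""
    return density_per_m2 * core_area_cm2 / 1e4

def detect_random(density_per_m2, core_area_cm2):
    """P(at least one item) when the count per core is Poisson."""
    m = expected_items(density_per_m2, core_area_cm2)
    return 1 - math.exp(-m)

def detect_clumped(density_per_m2, core_area_cm2, k):
    """P(at least one item) when the count per core is negative binomial.

    Smaller k means stronger clumping; k -> infinity recovers the Poisson case.
    """
    m = expected_items(density_per_m2, core_area_cm2)
    return 1 - (1 + m / k) ** (-k)

p_random = detect_random(500, 20)            # mean of 1 item per core
p_clumped = detect_clumped(500, 20, k=0.5)   # same mean, clumped
```

At the same mean density the clumped model always yields a lower chance of capturing at least one item, which matches the direction of the simulation result reported above.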
On the objective identification of flood seasons

NASA Astrophysics Data System (ADS)

Cunderlik, Juraj M.; Ouarda, Taha B. M. J.; Bobée, Bernard

2004-01-01

The determination of seasons of high and low probability of flood occurrence is a task with many practical applications in contemporary hydrology and water resources management. Flood seasons are generally identified subjectively by visually assessing the temporal distribution of flood occurrences and, then at a regional scale, verified by comparing the temporal distribution with distributions obtained at hydrologically similar neighboring sites. This approach is subjective, time consuming, and potentially unreliable. The main objective of this study is therefore to introduce a new, objective, and systematic method for the identification of flood seasons. The proposed method tests the significance of flood seasons by comparing the observed variability of flood occurrences with the theoretical flood variability in a nonseasonal model. The method also addresses the uncertainty resulting from sampling variability by quantifying the probability associated with the identified flood seasons. The performance of the method was tested on an extensive number of samples with different record lengths generated from several theoretical models of flood seasonality. The proposed approach was then applied on real data from a large set of sites with different flood regimes across Great Britain. The results show that the method can efficiently identify flood seasons from both theoretical and observed distributions of flood occurrence. The results were used for the determination of the main flood seasonality types in Great Britain.
Three statistical models for estimating length of stay.

PubMed Central

Selvin, S

1977-01-01

The probability density functions implied by three methods of collecting data on the length of stay in an institution are derived. The expected values associated with these density functions are used to calculate unbiased estimates of the expected length of stay. Two of the methods require an assumption about the form of the underlying distribution of length of stay; the third method does not. The three methods are illustrated with hypothetical data exhibiting the Poisson distribution, and the third (distribution-independent) method is used to estimate the length of stay in a skilled nursing facility and in an intermediate care facility for patients enrolled in California's MediCal program. PMID:914532
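One well-known way in which the data-collection method changes the implied density is length-biased sampling: a census of residents on a single day over-represents long stays in proportion to their length. The sketch below illustrates that effect and a harmonic-mean correction. It is offered as an assumption about the kind of distinction involved and is not necessarily one of the three methods derived in the paper. The stay-length distribution is hypothetical.

```python
def mean(pmf):
    """Expected value of a discrete distribution given as {value: prob}."""
    return sum(x * p for x, p in pmf.items())

def length_biased(pmf):
    """Distribution of stay length seen in a one-day census.

    A stay of length x is x times as likely to be caught by the census,
    so g(x) = x * f(x) / mu.
    """
    mu = mean(pmf)
    return {x: x * p / mu for x, p in pmf.items()}

def harmonic_mean(pmf):
    """1 / E[1/X]; under length-biased sampling this recovers mu."""
    return 1.0 / sum(p / x for x, p in pmf.items())

# Hypothetical distribution of completed stay lengths (days) per admission.
stay = {1: 0.5, 5: 0.3, 30: 0.2}
census = length_biased(stay)

true_mean = mean(stay)             # mean stay per admission
naive = mean(census)               # mean stay among residents on census day
corrected = harmonic_mean(census)  # undoes the length bias
```

Here the naive census average is roughly three times the true mean stay, while the harmonic mean of the census distribution returns the true mean.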
Three statistical models for estimating length of stay.

PubMed

Selvin, S

1977-01-01

The probability density functions implied by three methods of collecting data on the length of stay in an institution are derived. The expected values associated with these density functions are used to calculate unbiased estimates of the expected length of stay. Two of the methods require an assumption about the form of the underlying distribution of length of stay; the third method does not. The three methods are illustrated with hypothetical data exhibiting the Poisson distribution, and the third (distribution-independent) method is used to estimate the length of stay in a skilled nursing facility and in an intermediate care facility for patients enrolled in California's MediCal program.
Modeling potential distribution of Oligoryzomys longicaudatus, the Andes virus (Genus: Hantavirus) reservoir, in Argentina.

PubMed

Andreo, Verónica; Glass, Gregory; Shields, Timothy; Provensal, Cecilia; Polop, Jaime

2011-09-01

We constructed a model to predict the potential distribution of Oligoryzomys longicaudatus, the reservoir of Andes virus (Genus: Hantavirus), in Argentina. We developed an extensive database of occurrence records from published studies and our own surveys and compared two methods to model the probability of O. longicaudatus presence: logistic regression and the MaxEnt algorithm. The environmental variables used were tree, grass and bare soil cover from MODIS imagery, and altitude and 19 bioclimatic variables from the WorldClim database. The models' performance was evaluated and compared using both threshold-dependent and threshold-independent measures. The best models included tree and grass cover, mean diurnal temperature range, and precipitation of the warmest and coldest seasons. The potential distribution maps for O. longicaudatus predicted the highest occurrence probabilities along the Andes range, from 32°S and narrowing southwards. They also predicted high probabilities for the south-central area of Argentina, reaching the Atlantic coast. The Hantavirus Pulmonary Syndrome cases coincided with mean occurrence probabilities of 95 and 77% for logistic and MaxEnt models, respectively. HPS transmission zones in Argentine Patagonia matched the areas with the highest probability of presence. Therefore, colilargos presence probability may provide an approximate risk of transmission and act as an early tool to guide control and prevention plans.
Interpolating Non-Parametric Distributions of Hourly Rainfall Intensities Using Random Mixing

NASA Astrophysics Data System (ADS)

Mosthaf, Tobias; Bárdossy, András; Hörning, Sebastian

2015-04-01

The correct spatial interpolation of hourly rainfall intensity distributions is of great importance for stochastic rainfall models. Poorly interpolated distributions may lead to over- or underestimation of rainfall and consequently to wrong estimates in downstream applications, such as hydrological or hydraulic models. By analyzing the spatial relation of empirical rainfall distribution functions, a persistent order of the quantile values over a wide range of non-exceedance probabilities is observed. As the order remains similar, the interpolation weights of quantile values for one certain non-exceedance probability can be applied to the other probabilities. This assumption enables the use of kernel smoothed distribution functions for interpolation purposes. Comparing the order of hourly quantile values over different gauges with the order of their daily quantile values for equal probabilities results in high correlations. The hourly quantile values also show high correlations with elevation. The incorporation of these two covariates into the interpolation is therefore tested. As only positive interpolation weights for the quantile values assure a monotonically increasing distribution function, the use of geostatistical methods like kriging is problematic. Employing kriging with external drift to incorporate secondary information is not applicable. Nonetheless, it would be fruitful to make use of covariates. To overcome this shortcoming, a new random mixing approach of spatial random fields is applied. Within the mixing process hourly quantile values are considered as equality constraints and correlations with elevation values are included as relationship constraints. To profit from the dependence of daily quantile values, distribution functions of daily gauges are used to set up less-than-or-equal and greater-than-or-equal constraints at their locations. In this way the denser daily gauge network can be included in the interpolation of the hourly distribution functions.
The applicability of this new interpolation procedure will be shown for around 250 hourly rainfall gauges in the German federal state of Baden-Württemberg. The performance of the random mixing technique within the interpolation is compared to applicable kriging methods. Additionally, the interpolation of kernel smoothed distribution functions is compared with the interpolation of fitted parametric distributions.
Crystallization of hard spheres revisited. I. Extracting kinetics and free energy landscape from forward flux sampling.

PubMed

Richard, David; Speck, Thomas

2018-03-28

We investigate the kinetics and the free energy landscape of the crystallization of hard spheres from a supersaturated metastable liquid through direct simulations and forward flux sampling. In this first paper, we describe and test two different ways to reconstruct the free energy barriers from the sampled steady state probability distribution of cluster sizes without sampling the equilibrium distribution. The first method is based on mean first passage times, and the second method is based on splitting probabilities. We verify both methods for a single particle moving in a double-well potential. For the nucleation of hard spheres, these methods allow us to probe a wide range of supersaturations and to reconstruct the kinetics and the free energy landscape from the same simulation. Results are consistent with the scaling predicted by classical nucleation theory although a quantitative fit requires a rather large effective interfacial tension.

A Performance Comparison on the Probability Plot Correlation Coefficient Test using Several Plotting Positions for GEV Distribution.

NASA Astrophysics Data System (ADS)

Ahn, Hyunjun; Jung, Younghun; Om, Ju-Seong; Heo, Jun-Haeng

2014-05-01

Selecting the probability distribution is very important in statistical hydrology. A goodness-of-fit test is a statistical method for selecting an appropriate probability model for given data. The probability plot correlation coefficient (PPCC) test, one of the goodness-of-fit tests, was originally developed for the normal distribution. Since then, this test has been widely applied to other probability models. The PPCC test is known as one of the best goodness-of-fit tests because it shows higher rejection power than the others. In this study, we focus on the PPCC tests for the GEV distribution, which is widely used around the world. For the GEV model, several plotting position formulas have been suggested. However, the PPCC statistics are derived only for the plotting position formulas (Goel and De, In-na and Nguyen, and Kim et al.) in which the skewness coefficient (or shape parameter) is included. Regression equations are then derived as a function of the shape parameter and sample size for a given significance level. In addition, the rejection powers of these formulas are compared using Monte Carlo simulation. Keywords: goodness-of-fit test, probability plot correlation coefficient test, plotting position, Monte Carlo simulation. Acknowledgements: This research was supported by a grant 'Establishing Active Disaster Management System of Flood Control Structures by using 3D BIM Technique' [NEMA-12-NH-57] from the Natural Hazard Mitigation Research Group, National Emergency Management Agency of Korea.
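The PPCC statistic itself is simple to compute: it is the correlation between the ordered sample and the model quantiles evaluated at plotting positions. The sketch below is an assumption-laden illustration rather than the study's setup. It swaps in the Gringorten plotting position (a = 0.44) instead of the Goel and De, In-na and Nguyen, or Kim et al. formulas named in the abstract, uses the shape convention in which positive values give a heavy upper tail, and the sample `annual_maxima` is hypothetical.

```python
import math

def gev_quantile(p, shape, loc=0.0, scale=1.0):
    """GEV quantile function; shape == 0 is the Gumbel limit."""
    y = -math.log(p)
    if abs(shape) < 1e-12:
        return loc - scale * math.log(y)
    return loc + scale * (y ** (-shape) - 1.0) / shape

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ppcc_gev(sample, shape, a=0.44):
    """Correlation between ordered data and GEV quantiles at plotting positions.

    Location and scale do not matter because correlation is invariant to them.
    """
    n = len(sample)
    ordered = sorted(sample)
    positions = [(i - a) / (n + 1 - 2 * a) for i in range(1, n + 1)]
    fitted = [gev_quantile(p, shape) for p in positions]
    return pearson_r(ordered, fitted)

# Hypothetical annual maximum series (arbitrary units).
annual_maxima = [212, 340, 198, 455, 287, 260, 610, 305, 231, 380, 275, 498]
for shape in (-0.2, 0.0, 0.2):
    print(shape, round(ppcc_gev(annual_maxima, shape), 4))
```

A value close to 1 indicates that the sample is nearly linear in the model quantiles. Turning the statistic into a test requires critical values, which the study derives by regression and which are not reproduced here.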
Method and device for landing aircraft dependent on runway occupancy time

NASA Technical Reports Server (NTRS)

Ghalebsaz Jeddi, Babak (Inventor)

2012-01-01

A technique for landing aircraft using an aircraft landing accident avoidance device is disclosed. The technique includes determining at least two probability distribution functions; determining a safe lower limit on a separation between a lead aircraft and a trail aircraft on a glide slope to the runway; determining a maximum sustainable safe attempt-to-land rate on the runway based on the safe lower limit and the probability distribution functions; directing the trail aircraft to enter the glide slope with a target separation from the lead aircraft corresponding to the maximum sustainable safe attempt-to-land rate; while the trail aircraft is in the glide slope, determining an actual separation between the lead aircraft and the trail aircraft; and directing the trail aircraft to execute a go-around maneuver if the actual separation approaches the safe lower limit. Probability distribution functions include runway occupancy time, and landing time interval and/or inter-arrival distance.
Correcting Classifiers for Sample Selection Bias in Two-Phase Case-Control Studies

PubMed Central

Theis, Fabian J.

2017-01-01

Epidemiological studies often utilize stratified data in which rare outcomes or exposures are artificially enriched. This design can increase precision in association tests but distorts predictions when applying classifiers on nonstratified data. Several methods correct for this so-called sample selection bias, but their performance remains unclear especially for machine learning classifiers. With an emphasis on two-phase case-control studies, we aim to assess which corrections to perform in which setting and to obtain methods suitable for machine learning techniques, especially the random forest. We propose two new resampling-based methods to resemble the original data and covariance structure: stochastic inverse-probability oversampling and parametric inverse-probability bagging. We compare all techniques for the random forest and other classifiers, both theoretically and on simulated and real data. Empirical results show that the random forest profits from only the parametric inverse-probability bagging proposed by us. For other classifiers, correction is mostly advantageous, and methods perform uniformly. We discuss consequences of inappropriate distribution assumptions and reason for different behaviors between the random forest and other classifiers. In conclusion, we provide guidance for choosing correction methods when training classifiers on biased samples. For random forests, our method outperforms state-of-the-art procedures if distribution assumptions are roughly fulfilled. We provide our implementation in the R package sambia. PMID:29312464
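The general idea behind inverse-probability resampling can be sketched as follows: each record in the enriched sample is redrawn with a weight equal to the inverse of its selection probability, so the resampled data resemble the unstratified population. This is a minimal illustration and not the implementation in the sambia package; the function name, the record layout, and the selection probabilities are hypothetical.

```python
import random
from collections import Counter

def inverse_probability_resample(records, selection_prob, size, seed=0):
    """Resample with replacement, weighting each record by 1 / P(selected)."""
    rng = random.Random(seed)
    weights = [1.0 / selection_prob[r["stratum"]] for r in records]
    return rng.choices(records, weights=weights, k=size)

# Hypothetical two-phase sample: every case kept, 1 in 20 controls kept.
selection_prob = {"case": 1.0, "control": 0.05}
sample = [{"stratum": "case"}] * 100 + [{"stratum": "control"}] * 100
resampled = inverse_probability_resample(sample, selection_prob, size=10_000)
share = Counter(r["stratum"] for r in resampled)["case"] / len(resampled)
print(f"case share in enriched sample: 0.500, after resampling: {share:.3f}")
```

In this toy setup the expected case share after resampling is 100 / 2100, which is the share implied by the assumed selection probabilities. A classifier trained on the resampled data would then see roughly the population prevalence.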
Probabilistic analysis of preload in the abutment screw of a dental implant complex.

PubMed

Guda, Teja; Ross, Thomas A; Lang, Lisa A; Millwater, Harry R

2008-09-01

Screw loosening is a problem for a percentage of implants. A probabilistic analysis to determine the cumulative probability distribution of the preload, the probability of obtaining an optimal preload, and the probabilistic sensitivities identifying important variables is lacking. The purpose of this study was to examine the inherent variability of material properties, surface interactions, and applied torque in an implant system to determine the probability of obtaining desired preload values and to identify the significant variables that affect the preload. Using software programs, an abutment screw was subjected to a tightening torque and the preload was determined from finite element (FE) analysis. The FE model was integrated with probabilistic analysis software. Two probabilistic analysis methods (advanced mean value and Monte Carlo sampling) were applied to determine the cumulative distribution function (CDF) of preload. The coefficient of friction, elastic moduli, Poisson's ratios, and applied torque were modeled as random variables and defined by probability distributions. Separate probability distributions were determined for the coefficient of friction in well-lubricated and dry environments. The probabilistic analyses were performed and the cumulative distribution of preload was determined for each environment. A distinct difference was seen between the preload probability distributions generated in a dry environment (normal distribution, mean (SD): 347 (61.9) N) compared to a well-lubricated environment (normal distribution, mean (SD): 616 (92.2) N). The probability of obtaining a preload value within the target range was approximately 54% for the well-lubricated environment and only 0.02% for the dry environment. 
The preload is predominately affected by the applied torque and coefficient of friction between the screw threads and implant bore at lower and middle values of the preload CDF, and by the applied torque and the elastic modulus of the abutment screw at high values of the preload CDF. Lubrication at the threaded surfaces between the abutment screw and implant bore affects the preload developed in the implant complex. For the well-lubricated surfaces, only approximately 50% of implants will have preload values within the generally accepted range. This probability can be improved by applying a higher torque than normally recommended or a more closely controlled torque than typically achieved. It is also suggested that materials with higher elastic moduli be used in the manufacture of the abutment screw to achieve a higher preload.
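Once a normal distribution has been fitted to the preload, the probability of landing inside a target window follows directly from the normal CDF. The sketch below uses the means and standard deviations reported in the abstract, but the target window `TARGET` is a hypothetical placeholder because the abstract does not state the range used, so the printed values are not expected to reproduce the reported 54% and 0.02%.

```python
from statistics import NormalDist

# Fitted preload distributions reported in the abstract (mean, SD in newtons).
preload = {
    "dry": NormalDist(mu=347.0, sigma=61.9),
    "well_lubricated": NormalDist(mu=616.0, sigma=92.2),
}

def prob_in_range(dist, low, high):
    """P(low <= preload <= high) for a fitted normal distribution."""
    return dist.cdf(high) - dist.cdf(low)

# Hypothetical target window in newtons (assumption, not from the study).
TARGET = (500.0, 700.0)
for name, dist in preload.items():
    print(name, round(prob_in_range(dist, *TARGET), 4))
```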
Probabilistic structural analysis methods and applications

NASA Technical Reports Server (NTRS)

Cruse, T. A.; Wu, Y.-T.; Dias, B.; Rajagopal, K. R.

1988-01-01

An advanced algorithm for simulating the probabilistic distribution of structural responses due to statistical uncertainties in loads, geometry, material properties, and boundary conditions is reported. The method effectively combines an advanced algorithm for calculating probability levels for multivariate problems (fast probability integration) together with a general-purpose finite-element code for stress, vibration, and buckling analysis. Application is made to a space propulsion system turbine blade for which the geometry and material properties are treated as random variables.
Estimating true human and animal host source contribution in quantitative microbial source tracking using the Monte Carlo method.

PubMed

Wang, Dan; Silkie, Sarah S; Nelson, Kara L; Wuertz, Stefan

2010-09-01

Cultivation- and library-independent, quantitative PCR-based methods have become the method of choice in microbial source tracking. However, these qPCR assays are not 100% specific and sensitive for the target sequence in their respective hosts' genome. The factors that can lead to false positive and false negative information in qPCR results are well defined. It is highly desirable to have a way of removing such false information to estimate the true concentration of host-specific genetic markers and help guide the interpretation of environmental monitoring studies. Here we propose a statistical model based on the Law of Total Probability to predict the true concentration of these markers. The distributions of the probabilities of obtaining false information are estimated from representative fecal samples of known origin. Measurement error is derived from the sample precision error of replicated qPCR reactions. Then, the Monte Carlo method is applied to sample from these distributions of probabilities and measurement error. The set of equations given by the Law of Total Probability allows one to calculate the distribution of true concentrations, from which their expected value, confidence interval and other statistical characteristics can be easily evaluated. The output distributions of predicted true concentrations can then be used as input to watershed-wide total maximum daily load determinations, quantitative microbial risk assessment and other environmental models. This model was validated by both statistical simulations and real world samples. It was able to correct the intrinsic false information associated with qPCR assays and output the distribution of true concentrations of Bacteroidales for each animal host group. Model performance was strongly affected by the precision error. It could perform reliably and precisely when the standard deviation of the precision error was small (≤ 0.1). 
Further improvement on the precision of sample processing and qPCR reaction would greatly improve the performance of the model. This methodology, built upon Bacteroidales assays, is readily transferable to any other microbial source indicator where a universal assay for fecal sources of that indicator exists. Copyright © 2010 Elsevier Ltd. All rights reserved.
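The combination of the Law of Total Probability with Monte Carlo sampling can be illustrated with a simplified presence/absence analogue. Here the observed positive fraction is written as sens * p + (1 - spec) * (1 - p) and inverted for the true fraction p, a Rogan-Gladen-type correction swapped in for the authors' concentration model. The Beta parameters, the observed fraction, and the function names are hypothetical.

```python
import random
import statistics

def corrected_fraction(observed, sensitivity, specificity):
    """Invert observed = sens*p + (1 - spec)*(1 - p) for the true fraction p."""
    p = (observed - (1.0 - specificity)) / (sensitivity + specificity - 1.0)
    return min(1.0, max(0.0, p))  # clip to the valid probability range

def monte_carlo_correction(observed, sens_ab, spec_ab, n_draws=5000, seed=0):
    """Propagate Beta-distributed sensitivity and specificity through the inversion."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        sens = rng.betavariate(*sens_ab)
        spec = rng.betavariate(*spec_ab)
        if sens + spec > 1.0:  # skip draws where the assay carries no information
            draws.append(corrected_fraction(observed, sens, spec))
    draws.sort()
    low = draws[int(0.025 * len(draws))]
    high = draws[int(0.975 * len(draws))]
    return statistics.mean(draws), (low, high)

# Hypothetical inputs: 40% observed positives; Beta priors from validation samples.
mean_p, interval = monte_carlo_correction(0.40, sens_ab=(45, 5), spec_ab=(48, 2))
print(round(mean_p, 3), tuple(round(v, 3) for v in interval))
```

The output is a distribution of corrected values rather than a single number, which mirrors the abstract's point that expected values and confidence intervals can be read off the sampled distribution.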
The Mean Distance to the nth Neighbour in a Uniform Distribution of Random Points: An Application of Probability Theory

ERIC Educational Resources Information Center

Bhattacharyya, Pratip; Chakrabarti, Bikas K.

2008-01-01

We study different ways of determining the mean distance (r_n) between a reference point and its nth neighbour among random points distributed with uniform density in a D-dimensional Euclidean space. First, we present a heuristic method; though this method provides only a crude mathematical result, it shows a simple way of estimating…
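For orientation, the textbook result for an unbounded Poisson point process of density ρ gives the mean distance to the nth neighbour as Γ(n + 1/D) / (Γ(n) (ρ V_D)^(1/D)), where V_D is the volume of the unit ball in D dimensions. The sketch below evaluates that expression. It is offered as a standard reference formula under the Poisson assumption, not necessarily the form derived in the article, whose abstract is truncated here; the function names are hypothetical.

```python
import math

def unit_ball_volume(dim):
    """Volume of the unit ball in `dim` dimensions."""
    return math.pi ** (dim / 2) / math.gamma(dim / 2 + 1)

def mean_nth_neighbour_distance(n, dim, density):
    """Mean distance to the nth neighbour in a Poisson process of given density."""
    scale = (density * unit_ball_volume(dim)) ** (1 / dim)
    return math.gamma(n + 1 / dim) / (math.gamma(n) * scale)

for n in (1, 2, 3):
    print(n, round(mean_nth_neighbour_distance(n, dim=2, density=1.0), 4))
```

In one and two dimensions with unit density, the nearest-neighbour value is 0.5, which matches the familiar 1 / (2 sqrt(ρ)) result in the plane.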
Application of the Bootstrap Statistical Method in Deriving Vibroacoustic Specifications

NASA Technical Reports Server (NTRS)

Hughes, William O.; Paez, Thomas L.

2006-01-01

This paper discusses the Bootstrap Method for deriving vibroacoustic test specifications. Vibroacoustic test specifications are necessary to properly accept or qualify a spacecraft and its components for the expected acoustic, random vibration and shock environments seen on an expendable launch vehicle. Traditionally, NASA and the U.S. Air Force have employed methods of Normal Tolerance Limits to derive these test levels based upon the amount of data available, and the probability and confidence levels desired. The Normal Tolerance Limit method contains inherent assumptions about the distribution of the data. The Bootstrap is a distribution-free statistical subsampling method which uses the measured data themselves to establish estimates of statistical measures of random sources. This is achieved through the computation of large numbers of Bootstrap replicates of a data measure of interest and the use of these replicates to derive test levels consistent with the probability and confidence desired. The comparison of the results of these two methods is illustrated via an example utilizing actual spacecraft vibroacoustic data.
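The replicate-based procedure described above can be sketched in a few lines: resample the measurements with replacement, compute the percentile of interest in each replicate, and read the desired confidence level off the sorted replicates. This is a generic bootstrap illustration, not the paper's exact recipe; the data values, the default probability and confidence levels, and the function names are hypothetical.

```python
import random

def percentile(sorted_vals, q):
    """Linear-interpolation percentile of pre-sorted values, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def bootstrap_level(data, prob=0.95, conf=0.50, n_rep=5000, seed=0):
    """Level covering `prob` of the data, taken at confidence `conf`."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_rep):
        resample = sorted(rng.choices(data, k=len(data)))
        reps.append(percentile(resample, prob))
    reps.sort()
    return percentile(reps, conf)

# Hypothetical measured levels in dB from previous flights.
levels_db = [128.1, 130.4, 127.6, 131.2, 129.5, 132.0, 128.8, 130.9, 129.9, 131.5]
print(bootstrap_level(levels_db, prob=0.95, conf=0.50))
print(bootstrap_level(levels_db, prob=0.95, conf=0.90))
```

One known limitation of this plain form is that no replicate can exceed the largest measured value, so with very few measurements the derived level is bounded by the sample maximum.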
On the quantification and efficient propagation of imprecise probabilities resulting from small datasets

NASA Astrophysics Data System (ADS)

Zhang, Jiaxin; Shields, Michael D.

2018-01-01

This paper addresses the problem of uncertainty quantification and propagation when data for characterizing probability distributions are scarce. We propose a methodology wherein the full uncertainty associated with probability model form and parameter estimation is retained and efficiently propagated. This is achieved by applying the information-theoretic multimodel inference method to identify plausible candidate probability densities and associated probabilities that each method is the best model in the Kullback-Leibler sense. The joint parameter densities for each plausible model are then estimated using Bayes' rule. We then propagate this full set of probability models by estimating an optimal importance sampling density that is representative of all plausible models, propagating this density, and reweighting the samples according to each of the candidate probability models. This is in contrast with conventional methods that try to identify a single probability model that encapsulates the full uncertainty caused by lack of data and consequently underestimate uncertainty. The result is a complete probabilistic description of both aleatory and epistemic uncertainty achieved with several orders of magnitude reduction in computational cost. It is shown how the model can be updated to adaptively accommodate added data and added candidate probability models. The method is applied for uncertainty analysis of plate buckling strength where it is demonstrated how dataset size affects the confidence (or lack thereof) we can place in statistical estimates of response when data are lacking.
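In information-theoretic multimodel inference, the probability that each candidate is the best model in the Kullback-Leibler sense is commonly expressed through Akaike weights. The sketch below shows that calculation under the assumption that plain AIC values are available. The paper may use a small-sample variant, and the candidate names and AIC values here are hypothetical.

```python
import math

def akaike_weights(aic_values):
    """Convert AIC values into model probabilities (Akaike weights)."""
    best = min(aic_values.values())
    # Relative likelihood of each model given its AIC difference from the best.
    rel = {m: math.exp(-0.5 * (a - best)) for m, a in aic_values.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}

# Hypothetical AIC values for candidate distributions fitted to a small dataset.
candidates = {"normal": 104.2, "lognormal": 103.1, "weibull": 105.9, "gamma": 103.4}
weights = akaike_weights(candidates)
for model, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{model:10s} {w:.3f}")
```

With small datasets the weights tend to be spread across several candidates, which is the situation in which retaining all plausible models, rather than a single best fit, matters most.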
Predictive probability methods for interim monitoring in clinical trials with longitudinal outcomes.

PubMed

Zhou, Ming; Tang, Qi; Lang, Lixin; Xing, Jun; Tatsuoka, Kay

2018-04-17

In clinical research and development, interim monitoring is critical for better decision-making and minimizing the risk of exposing patients to possible ineffective therapies. For interim futility or efficacy monitoring, predictive probability methods are widely adopted in practice. Those methods have been well studied for univariate variables. However, for longitudinal studies, predictive probability methods using univariate information from only completers may not be most efficient, and data from on-going subjects can be utilized to improve efficiency. On the other hand, leveraging information from on-going subjects could allow an interim analysis to be potentially conducted once a sufficient number of subjects reach an earlier time point. For longitudinal outcomes, we derive closed-form formulas for predictive probabilities, including Bayesian predictive probability, predictive power, and conditional power and also give closed-form solutions for predictive probability of success in a future trial and the predictive probability of success of the best dose. When predictive probabilities are used for interim monitoring, we study their distributions and discuss their analytical cutoff values or stopping boundaries that have desired operating characteristics. We show that predictive probabilities utilizing all longitudinal information are more efficient for interim monitoring than that using information from completers only. To illustrate their practical application for longitudinal data, we analyze 2 real data examples from clinical trials. Copyright © 2018 John Wiley & Sons, Ltd.
Stochastic optimal operation of reservoirs based on copula functions

NASA Astrophysics Data System (ADS)

Lei, Xiao-hui; Tan, Qiao-feng; Wang, Xu; Wang, Hao; Wen, Xin; Wang, Chao; Zhang, Jing-wen

2018-02-01

Stochastic dynamic programming (SDP) has been widely used to derive operating policies for reservoirs considering streamflow uncertainties. In SDP, there is a need to calculate the transition probability matrix more accurately and efficiently in order to improve the economic benefit of reservoir operation. In this study, we proposed a stochastic optimization model for hydropower generation reservoirs, in which 1) the transition probability matrix was calculated based on copula functions; and 2) the value function of the last period was calculated by stepwise iteration. Firstly, the marginal distribution of stochastic inflow in each period was built and the joint distributions of adjacent periods were obtained using the three members of the Archimedean copulas, based on which the conditional probability formula was derived. Then, the value in the last period was calculated by a simple recursive equation with the proposed stepwise iteration method and the value function was fitted with a linear regression model. These improvements were incorporated into the classic SDP and applied to the case study in Ertan reservoir, China. The results show that the transition probability matrix can be more easily and accurately obtained by the proposed copula function based method than conventional methods based on the observed or synthetic streamflow series, and the reservoir operation benefit can also be increased.
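A copula-based transition probability matrix can be sketched by discretizing the marginal probability scale into classes and computing the copula mass in each cell. The example assumes a Clayton copula, one common Archimedean family; the abstract does not name the three families used, and the parameter `theta`, the number of classes, and the function names are hypothetical.

```python
def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v) for theta > 0."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def transition_matrix(n_classes, theta):
    """P(next inflow in class j | current inflow in class i), equal-probability classes."""
    edges = [k / n_classes for k in range(n_classes + 1)]
    matrix = []
    for i in range(n_classes):
        u1, u2 = edges[i], edges[i + 1]
        row = []
        for j in range(n_classes):
            v1, v2 = edges[j], edges[j + 1]
            # Copula mass of the rectangle (u1, u2] x (v1, v2].
            mass = (clayton_cdf(u2, v2, theta) - clayton_cdf(u1, v2, theta)
                    - clayton_cdf(u2, v1, theta) + clayton_cdf(u1, v1, theta))
            row.append(mass / (u2 - u1))
        matrix.append(row)
    return matrix

for row in transition_matrix(n_classes=3, theta=2.0):
    print([round(p, 3) for p in row])
```

Because the classes are defined on the probability scale, the marginal inflow distributions of adjacent periods enter only when class boundaries are mapped back to flow values.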
Quantum cryptographic system with reduced data loss

DOEpatents

Lo, H.K.; Chau, H.F.

1998-03-24

A secure method for distributing a random cryptographic key with reduced data loss is disclosed. Traditional quantum key distribution systems employ similar probabilities for the different communication modes and thus reject at least half of the transmitted data. The invention substantially reduces the amount of discarded data (those that are encoded and decoded in different communication modes e.g. using different operators) in quantum key distribution without compromising security by using significantly different probabilities for the different communication modes. Data is separated into various sets according to the actual operators used in the encoding and decoding process and the error rate for each set is determined individually. The invention increases the key distribution rate of the BB84 key distribution scheme proposed by Bennett and Brassard in 1984. Using the invention, the key distribution rate increases with the number of quantum signals transmitted and can be doubled asymptotically. 23 figs.
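The effect of biased mode probabilities on discarded data can be checked with elementary arithmetic. The sketch assumes two communication modes and that sender and receiver each choose the first mode independently with the same probability p. This is an illustrative simplification, and the function names are hypothetical.

```python
def sifted_fraction(p):
    """Chance that sender and receiver independently pick the same mode."""
    return p * p + (1.0 - p) * (1.0 - p)

def minority_mode_count(p, n_signals):
    """Expected number of signals where both parties used the rarer mode."""
    q = min(p, 1.0 - p)
    return n_signals * q * q

for p in (0.5, 0.8, 0.95):
    kept = sifted_fraction(p)
    print(p, round(kept, 4), round(minority_mode_count(p, 1_000_000)))
```

With p = 0.5 half of the data are kept, matching the abstract's statement that at least half are rejected. As p approaches 1 the kept fraction approaches 1, which is the asymptotic doubling. The rarer-mode set shrinks at the same time, so more transmitted signals are needed before its error rate can be estimated reliably.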
Quantum cryptographic system with reduced data loss

DOEpatents

Lo, Hoi-Kwong; Chau, Hoi Fung

1998-01-01

A secure method for distributing a random cryptographic key with reduced data loss. Traditional quantum key distribution systems employ similar probabilities for the different communication modes and thus reject at least half of the transmitted data. The invention substantially reduces the amount of discarded data (those that are encoded and decoded in different communication modes e.g. using different operators) in quantum key distribution without compromising security by using significantly different probabilities for the different communication modes. Data is separated into various sets according to the actual operators used in the encoding and decoding process and the error rate for each set is determined individually. The invention increases the key distribution rate of the BB84 key distribution scheme proposed by Bennett and Brassard in 1984. Using the invention, the key distribution rate increases with the number of quantum signals transmitted and can be doubled asymptotically.
Maritime Search and Rescue via Multiple Coordinated UAS

DTIC Science & Technology

2016-01-01

partitioning method uses the underlying probability distribution assumptions to place that probability near the geometric center of the partitions. There...During partitioning the known locations are accommodated, but the unaccounted for objects are placed into geometrically unfavorable conditions. The...Zeitlin, A.D.: UAS Sense and Avoid Development - the Challenges of Technology, Standards, and Certification. Aerospace Sciences Meeting including
Uncertainty in determining extreme precipitation thresholds

NASA Astrophysics Data System (ADS)

Liu, Bingjun; Chen, Junfan; Chen, Xiaohong; Lian, Yanqing; Wu, Lili

2013-10-01

Extreme precipitation events are rare and occur mostly on a relatively small and local scale, which makes it difficult to set the thresholds for extreme precipitation in a large basin. Based on long-term daily precipitation data from 62 observation stations in the Pearl River Basin, this study assessed the applicability of the non-parametric, parametric, and detrended fluctuation analysis (DFA) methods in determining the extreme precipitation threshold (EPT) and the certainty of the EPTs from each method. The analyses show that the non-parametric absolute critical value method is easy to use but unable to reflect differences in the spatial distribution of rainfall. The non-parametric percentile method can account for the spatial distribution of precipitation, but its threshold value is sensitive to the size of the rainfall data series and depends on the choice of percentile, which makes it difficult to determine reasonable threshold values for a large basin. The parametric method can provide the most apt description of extreme precipitation by fitting extreme precipitation distributions with probability distribution functions; however, the selection of probability distribution functions, the goodness-of-fit tests, and the size of the rainfall data series can greatly affect the fitting accuracy. In contrast to the non-parametric and parametric methods, which are unable to provide EPTs with certainty, the DFA method, although it involves complicated computational processes, has proven to be the most appropriate method, able to provide a unique set of EPTs for a large basin with uneven spatio-temporal precipitation distribution.
The consistency between the spatial distribution of DFA-based thresholds with the annual average precipitation, the coefficient of variation (CV), and the coefficient of skewness (CS) for the daily precipitation further proves that EPTs determined by the DFA method are more reasonable and applicable for the Pearl River Basin.
New approach in bivariate drought duration and severity analysis

NASA Astrophysics Data System (ADS)

Montaseri, Majid; Amirataee, Babak; Rezaie, Hossein

2018-04-01

Copula functions have been widely applied as an advanced technique to create the joint probability distribution of drought duration and severity. The approach to data collection, as well as the amount and dispersion of the data series, can have a significant impact on creating such a joint probability distribution using copulas. Traditional analyses have usually taken an Unconnected Drought Runs (UDR) approach to droughts. In other words, droughts with different durations are treated as independent of each other. Emphasis on this data collection method omits the actual potential of short-term extreme droughts located within a long-term UDR. Meanwhile, the traditional method often faces significant gaps in the drought data series. However, a long-term UDR can be approached as a combination of short-term Connected Drought Runs (CDR). Therefore, this study aims to systematically evaluate the UDR and CDR procedures in investigations of the joint probability of drought duration and severity. For this purpose, rainfall data (1971-2013) from 24 rain gauges in the Lake Urmia basin, Iran, were used. Also, seven common univariate marginal distributions and seven types of bivariate copulas were examined. Compared to the traditional approach, the results demonstrated a significant comparative advantage of the new approach. These advantages led to identification of the correct copula function, more accurate estimation of the copula parameter, more realistic estimation of joint/conditional probabilities of drought duration and severity, and a significant reduction in modeling uncertainty.
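The duration and severity pairs that feed such copula models are usually extracted with run theory: a drought run is a maximal stretch of consecutive periods in which a drought index stays below a threshold. The sketch below shows only that basic extraction, which corresponds most closely to treating each run as one unconnected event. It does not implement the paper's UDR/CDR comparison, and the index series, the threshold, and the function name are hypothetical.

```python
def drought_runs(index, threshold=0.0):
    """Return (duration, severity) for each run of values below the threshold.

    Severity is the cumulative deficit below the threshold over the run.
    """
    runs, duration, severity = [], 0, 0.0
    for value in index:
        if value < threshold:
            duration += 1
            severity += threshold - value
        elif duration:
            runs.append((duration, severity))
            duration, severity = 0, 0.0
    if duration:  # close a run that reaches the end of the record
        runs.append((duration, severity))
    return runs

# Hypothetical monthly drought index values.
series = [0.5, -1.0, -0.5, 0.2, -2.0, -1.5, -0.5, 0.8, 0.1, -0.25]
print(drought_runs(series))
```

Each tuple can then serve as one observation for fitting marginal distributions and a bivariate copula.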
Velocity distributions among colliding asteroids

NASA Technical Reports Server (NTRS)

Bottke, William F., Jr.; Nolan, Michael C.; Greenberg, Richard; Kolvoord, Robert A.

1994-01-01

The probability distribution for impact velocities between two given asteroids is wide, non-Gaussian, and often contains spikes according to our new method of analysis in which each possible orbital geometry for collision is weighted according to its probability. An average value would give a good representation only if the distribution were smooth and narrow. Therefore, the complete velocity distribution we obtain for various asteroid populations differs significantly from published histograms of average velocities. For all pairs among the 682 asteroids in the main-belt with D greater than 50 km, we find that our computed velocity distribution is much wider than previously computed histograms of average velocities. In this case, the most probable impact velocity is approximately 4.4 km/sec, compared with the mean impact velocity of 5.3 km/sec. For cases of a single asteroid (e.g., Gaspra or Ida) relative to an impacting population, the distribution we find yields lower velocities than previously reported by others. The width of these velocity distributions implies that mean impact velocities must be used with caution when calculating asteroid collisional lifetimes or crater-size distributions. Since the most probable impact velocities are lower than the mean, disruption events may occur less frequently than previously estimated. However, this disruption rate may be balanced somewhat by an apparent increase in the frequency of high-velocity impacts between asteroids. These results have implications for issues such as asteroidal disruption rates, the amount/type of impact ejecta available for meteoritical delivery to the Earth, and the geology and evolution of specific asteroids like Gaspra.
Learning Probabilities From Random Observables in High Dimensions: The Maximum Entropy Distribution and Others

NASA Astrophysics Data System (ADS)

Obuchi, Tomoyuki; Cocco, Simona; Monasson, Rémi

2015-11-01

We consider the problem of learning a target probability distribution over a set of N binary variables from the knowledge of the expectation values (with this target distribution) of M observables, drawn uniformly at random. The space of all probability distributions compatible with these M expectation values within some fixed accuracy, called version space, is studied. We introduce a biased measure over the version space, which gives a boost increasing exponentially with the entropy of the distributions and with an arbitrary inverse 'temperature' Γ. The choice of Γ allows us to interpolate smoothly between the unbiased measure over all distributions in the version space (Γ = 0) and the pointwise measure concentrated at the maximum entropy distribution (Γ → ∞). Using the replica method we compute the volume of the version space and other quantities of interest, such as the distance R between the target distribution and the center-of-mass distribution over the version space, as functions of α = (log M)/N and Γ for large N. Phase transitions at critical values of α are found, corresponding to qualitative improvements in the learning of the target distribution and to the decrease of the distance R. However, for fixed α the distance R does not vary with Γ, which means that the maximum entropy distribution is not closer to the target distribution than any other distribution compatible with the observable values. Our results are confirmed by Monte Carlo sampling of the version space for small system sizes (N ≤ 10).
Habitat suitability criteria via parametric distributions: estimation, model selection and uncertainty

USGS Publications Warehouse

Som, Nicholas A.; Goodman, Damon H.; Perry, Russell W.; Hardy, Thomas B.

2016-01-01

Previous methods for constructing univariate habitat suitability criteria (HSC) curves have ranged from professional judgement to kernel-smoothed density functions or combinations thereof. We present a new method of generating HSC curves that applies probability density functions as the mathematical representation of the curves. Compared with previous approaches, benefits of our method include (1) estimation of probability density function parameters directly from raw data, (2) quantitative methods for selecting among several candidate probability density functions, and (3) concise methods for expressing estimation uncertainty in the HSC curves. We demonstrate our method with a thorough example using data collected on the depth of water used by juvenile Chinook salmon (Oncorhynchus tshawytscha) in the Klamath River of northern California and southern Oregon. All R code needed to implement our example is provided in the appendix. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.
A statistical treatment of bioassay pour fractions

NASA Astrophysics Data System (ADS)

Barengoltz, Jack; Hughes, David

A bioassay is a method for estimating the number of bacterial spores on a spacecraft surface for the purpose of demonstrating compliance with planetary protection (PP) requirements (Ref. 1). The details of the process may be seen in the appropriate PP document (e.g., for NASA, Ref. 2). In general, the surface is mechanically sampled with a damp sterile swab or wipe. The completion of the process is colony formation in a growth medium in a plate (Petri dish); the colonies are counted. Consider a set of samples from randomly selected, known areas of one spacecraft surface, for simplicity. One may calculate the mean and standard deviation of the bioburden density, which is the ratio of counts to area sampled. The standard deviation represents an estimate of the variation from place to place of the true bioburden density commingled with the precision of the individual sample counts. The accuracy of individual sample results depends on the equipment used, the collection method, and the culturing method. One aspect that greatly influences the result is the pour fraction, which is the quantity of fluid added to the plates divided by the total fluid used in extracting spores from the sampling equipment. In an analysis of a single sample’s counts due to the pour fraction, one seeks to answer the question: What is the probability that, if a certain number of spores are counted with a known pour fraction, there are an additional number of spores in the part of the rinse not poured? This is given for specific values by the binomial distribution density, where detection (of culturable spores) is success and the probability of success is the pour fraction. A special summation over the binomial distribution, equivalent to adding for all possible values of the true total number of spores, is performed. This distribution, when normalized, will almost yield the desired quantity. It is the probability that the additional number of spores does not exceed a certain value.
Of course, for a desired value of uncertainty, one must invert the calculation. However, this probability of finding exactly the number of spores in the poured part is correct only in the case where all values of the true number of spores greater than or equal to the adjusted count are equally probable. This is not realistic, of course, but the result can only overestimate the uncertainty. So it is useful. In probability speak, one has the conditional probability given any true total number of spores. Therefore one must multiply it by the probability of each possible true count, before the summation. If the counts for a sample set (of which this is one sample) are available, one may use the calculated variance and the normal probability distribution. In this approach, one assumes a normal distribution and neglects the contribution from spatial variation. The former is a common assumption. The latter can only add to the conservatism (over estimate the number of spores at some level of confidence). A more straightforward approach is to assume a Poisson probability distribution for the measured total sample set counts, and use the product of the number of samples and the mean number of counts per sample as the mean of the Poisson distribution. It is necessary to set the total count to 1 in the Poisson distribution when actual total count is zero. Finally, even when the planetary protection requirements for spore burden refer only to the mean values, they require an adjustment for pour fraction and method efficiency (a PP specification based on independent data). The adjusted mean values are a 50/50 proposition (e.g., the probability of the true total counts in the sample set exceeding the estimate is 0.50). However, this is highly unconservative when the total counts are zero. No adjustment to the mean values occurs for either pour fraction or efficiency. The recommended approach is once again to set the total counts to 1, but now applied to the mean values. 
Then one may apply the corrections to the revised counts. It can be shown by the methods developed in this work that this change is usually conservative enough to increase the level of confidence in the estimate to 0.5. 1. NASA. (2005) Planetary protection provisions for robotic extraterrestrial missions. NPR 8020.12C, April 2005, National Aeronautics and Space Administration, Washington, DC. 2. NASA. (2010) Handbook for the Microbiological Examination of Space Hardware, NASA-HDBK-6022, National Aeronautics and Space Administration, Washington, DC.
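The single-sample calculation described in this abstract can be sketched as follows, under the abstract's own simplification that every true total at or above the count is equally probable. With k counted spores and pour fraction f, the binomial term for a true total n is C(n, k) f^k (1 - f)^(n - k); summing over all n ≥ k gives 1/f, so the normalized probability of m additional spores is C(k + m, k) f^(k + 1) (1 - f)^m. The function names and example values are illustrative assumptions, not taken from the paper.

```python
from math import comb

def prob_additional(m, k, f):
    """P(exactly m uncounted spores | k counted, pour fraction f).

    Assumes a flat prior on the true total number of spores (n >= k),
    which is the simplifying case discussed in the abstract.
    """
    return comb(k + m, k) * f ** (k + 1) * (1.0 - f) ** m

def prob_at_most(m_max, k, f):
    """P(number of uncounted spores <= m_max)."""
    return sum(prob_additional(m, k, f) for m in range(m_max + 1))

def smallest_bound(k, f, confidence):
    """Invert the calculation: smallest m whose cumulative probability reaches `confidence`."""
    m = 0
    while prob_at_most(m, k, f) < confidence:
        m += 1
    return m

# Hypothetical example: zero colonies counted, 80% of the rinse poured.
bound_95 = smallest_bound(k=0, f=0.8, confidence=0.95)
```

Inverting the cumulative sum, as `smallest_bound` does, corresponds to the step the abstract describes as inverting the calculation for a desired level of uncertainty.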

Probability theory versus simulation of petroleum potential in play analysis

USGS Publications Warehouse

Crovelli, R.A.

1987-01-01

An analytic probabilistic methodology for resource appraisal of undiscovered oil and gas resources in play analysis is presented. This play-analysis methodology is a geostochastic system for petroleum resource appraisal in explored as well as frontier areas. An objective was to replace an existing Monte Carlo simulation method in order to increase the efficiency of the appraisal process. Underlying the two methods is a single geologic model which considers both the uncertainty of the presence of the assessed hydrocarbon and its amount if present. The results of the model are resource estimates of crude oil, nonassociated gas, dissolved gas, and gas for a geologic play in terms of probability distributions. The analytic method is based upon conditional probability theory and a closed form solution of all means and standard deviations, along with the probabilities of occurrence. © 1987 J.C. Baltzer A.G., Scientific Publishing Company.
Radar prediction of absolute rain fade distributions for earth-satellite paths and general methods for extrapolation of fade statistics to other locations

NASA Technical Reports Server (NTRS)

Goldhirsh, J.

1982-01-01

The first absolute rain fade distribution method described establishes absolute fade statistics at a given site by means of a sampled radar data base. The second method extrapolates absolute fade statistics from one location to another, given simultaneously measured fade and rain rate statistics at the former. Both methods employ similar conditional fade statistic concepts and long term rain rate distributions. Probability deviations in the 2-19% range, with an 11% average, were obtained upon comparison of measured and predicted levels at given attenuations. The extrapolation of fade distributions to other locations at 28 GHz showed very good agreement with measured data at three sites located in the continental temperate region.
Robust optimization based upon statistical theory.

PubMed

Sobotta, B; Söhn, M; Alber, M

2010-08-01

Organ movement is still the biggest challenge in cancer treatment despite advances in online imaging. Due to the resulting geometric uncertainties, the delivered dose cannot be predicted precisely at treatment planning time. Consequently, all associated dose metrics (e.g., EUD and maxDose) are random variables with a patient-specific probability distribution. The method that the authors propose makes these distributions the basis of the optimization and evaluation process. The authors start from a model of motion derived from patient-specific imaging. On a multitude of geometry instances sampled from this model, a dose metric is evaluated. The resulting pdf of this dose metric is termed outcome distribution. The approach optimizes the shape of the outcome distribution based on its mean and variance. This is in contrast to the conventional optimization of a nominal value (e.g., PTV EUD) computed on a single geometry instance. The mean and variance allow for an estimate of the expected treatment outcome along with the residual uncertainty. Besides being applicable to the target, the proposed method also seamlessly includes the organs at risk (OARs). The likelihood that a given value of a metric is reached in the treatment is predicted quantitatively. This information reveals potential hazards that may occur during the course of the treatment, thus helping the expert to find the right balance between the risk of insufficient normal tissue sparing and the risk of insufficient tumor control. By feeding this information to the optimizer, outcome distributions can be obtained where the probability of exceeding a given OAR maximum and that of falling short of a given target goal can be minimized simultaneously. The method is applicable to any source of residual motion uncertainty in treatment delivery. Any model that quantifies organ movement and deformation in terms of probability distributions can be used as basis for the algorithm. 
Thus, it can generate dose distributions that are robust against interfraction and intrafraction motion alike, effectively removing the need for indiscriminate safety margins.
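A minimal sketch of the sampling idea in this abstract: draw geometry instances from a motion model, evaluate a dose metric on each, and summarize the resulting outcome distribution by its mean and variance. Everything concrete here is a hypothetical stand-in: the one-dimensional Gaussian shift, the toy dose metric, and the 57 Gy goal are assumptions for illustration, not the authors' model.

```python
import random
import statistics

def toy_dose_metric(shift_mm):
    """Hypothetical dose metric (Gy): full dose within 5 mm of the plan, linear fall-off beyond."""
    excess = max(0.0, abs(shift_mm) - 5.0)
    return 60.0 * max(0.0, 1.0 - excess / 10.0)

def outcome_distribution(n_samples=10000, sigma_mm=4.0, seed=0):
    """Evaluate the metric on geometry instances sampled from an assumed Gaussian motion model."""
    rng = random.Random(seed)
    return [toy_dose_metric(rng.gauss(0.0, sigma_mm)) for _ in range(n_samples)]

samples = outcome_distribution()
mean_dose = statistics.fmean(samples)        # expected treatment outcome
var_dose = statistics.pvariance(samples)     # residual uncertainty
p_shortfall = sum(s < 57.0 for s in samples) / len(samples)  # chance of missing an assumed goal
```

An optimizer in the spirit of the abstract would adjust the plan so that `mean_dose` and `var_dose` move in the desired direction, rather than tuning a single nominal value.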
Neyman Pearson detection of K-distributed random variables

NASA Astrophysics Data System (ADS)

Tucker, J. Derek; Azimi-Sadjadi, Mahmood R.

2010-04-01

In this paper a new detection method for sonar imagery is developed in K-distributed background clutter. The equation for the log-likelihood is derived and compared to the corresponding counterparts derived for the Gaussian and Rayleigh assumptions. Test results of the proposed method on a data set of synthetic underwater sonar images are also presented. This database contains images with targets of different shapes inserted into backgrounds generated using a correlated K-distributed model. Results illustrating the effectiveness of the K-distributed detector are presented in terms of probability of detection, false alarm, and correct classification rates for various bottom clutter scenarios.
Complete Numerical Solution of the Diffusion Equation of Random Genetic Drift

PubMed Central

Zhao, Lei; Yue, Xingye; Waxman, David

2013-01-01

A numerical method is presented to solve the diffusion equation for the random genetic drift that occurs at a single unlinked locus with two alleles. The method was designed to conserve probability, and the resulting numerical solution represents a probability distribution whose total probability is unity. We describe solutions of the diffusion equation whose total probability is unity as complete. Thus the numerical method introduced in this work produces complete solutions, and such solutions have the property that whenever fixation and loss can occur, they are automatically included within the solution. This feature demonstrates that the diffusion approximation can describe not only internal allele frequencies, but also the boundary frequencies zero and one. The numerical approach presented here constitutes a single inclusive framework from which to perform calculations for random genetic drift. It has a straightforward implementation, allowing it to be applied to a wide variety of problems, including those with time-dependent parameters, such as changing population sizes. As tests and illustrations of the numerical method, it is used to determine: (i) the probability density and time-dependent probability of fixation for a neutral locus in a population of constant size; (ii) the probability of fixation in the presence of selection; and (iii) the probability of fixation in the presence of selection and demographic change, the latter in the form of a changing population size. PMID:23749318
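The notion of a complete solution, one whose total probability stays at unity with fixation and loss included, can be illustrated with a much simpler substitute technique: the discrete neutral Wright-Fisher Markov chain, iterated exactly. This is not the authors' diffusion-equation solver; the population size, starting count, and number of generations below are arbitrary assumptions.

```python
from math import comb

def wright_fisher_step(dist):
    """Advance the distribution of allele counts by one generation (neutral Wright-Fisher)."""
    n = len(dist) - 1  # number of gene copies in the population
    new = [0.0] * (n + 1)
    for i, p_i in enumerate(dist):
        if p_i == 0.0:
            continue
        x = i / n  # current allele frequency
        for j in range(n + 1):
            # Binomial sampling of j copies in the next generation
            new[j] += p_i * comb(n, j) * x ** j * (1.0 - x) ** (n - j)
    return new

copies = 20
dist = [0.0] * (copies + 1)
dist[5] = 1.0  # start with 5 of 20 copies (frequency 0.25)
for _ in range(2000):
    dist = wright_fisher_step(dist)

total = sum(dist)               # stays at 1: probability is conserved
p_loss, p_fix = dist[0], dist[-1]
```

Because the boundary states 0 and 20 are part of the state vector, loss and fixation are carried along automatically, and for a neutral allele the fixation probability approaches the initial frequency.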
A contemporary approach to the problem of determining physical parameters according to the results of measurements

NASA Technical Reports Server (NTRS)

Elyasberg, P. Y.

1979-01-01

The shortcomings of the classical approach are set forth, and the newer methods resulting from these shortcomings are explained. The problem was approached with the assumption that the probabilities of error were known, as well as without knowledge of the distribution of the probabilities of error. The advantages of the newer approach are discussed.
Markov Chain Monte Carlo estimation of species distributions: a case study of the swift fox in western Kansas

USGS Publications Warehouse

Sargeant, Glen A.; Sovada, Marsha A.; Slivinski, Christiane C.; Johnson, Douglas H.

2005-01-01

Accurate maps of species distributions are essential tools for wildlife research and conservation. Unfortunately, biologists often are forced to rely on maps derived from observed occurrences recorded opportunistically during observation periods of variable length. Spurious inferences are likely to result because such maps are profoundly affected by the duration and intensity of observation and by methods used to delineate distributions, especially when detection is uncertain. We conducted a systematic survey of swift fox (Vulpes velox) distribution in western Kansas, USA, and used Markov chain Monte Carlo (MCMC) image restoration to rectify these problems. During 1997–1999, we searched 355 townships (ca. 93 km²) 1–3 times each for an average cost of $7,315 per year and achieved a detection rate (probability of detecting swift foxes, if present, during a single search) of 0.69 (95% Bayesian confidence interval [BCI] = [0.60, 0.77]). Our analysis produced an estimate of the underlying distribution, rather than a map of observed occurrences, that reflected the uncertainty associated with estimates of model parameters. To evaluate our results, we analyzed simulated data with similar properties. Results of our simulations suggest negligible bias and good precision when probabilities of detection on ≥1 survey occasions (cumulative probabilities of detection) exceed 0.65. Although the use of MCMC image restoration has been limited by theoretical and computational complexities, alternatives do not possess the same advantages. Image models accommodate uncertain detection, do not require spatially independent data or a census of map units, and can be used to estimate species distributions directly from observations without relying on habitat covariates or parameters that must be estimated subjectively.
These features facilitate economical surveys of large regions, the detection of temporal trends in distribution, and assessments of landscape-level relations between species and habitats. Requirements for the use of MCMC image restoration include study areas that can be partitioned into regular grids of mapping units, spatially contagious species distributions, reliable methods for identifying target species, and cumulative probabilities of detection ≥0.65.
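The relation between the single-search detection rate and the cumulative probability of detection can be sketched in a few lines. The sketch assumes that repeated searches of a township are independent and share the same detection probability, which is an assumption made here and not something the abstract states.

```python
def cumulative_detection(p_single, n_searches):
    """Probability of at least one detection in n independent searches, given presence."""
    return 1.0 - (1.0 - p_single) ** n_searches

# Reported single-search rate of 0.69, for the 1-3 searches per township
cumulative = {n: cumulative_detection(0.69, n) for n in (1, 2, 3)}
meets_threshold = {n: p >= 0.65 for n, p in cumulative.items()}
```

Under that assumption even a single search at the reported rate sits above the 0.65 cumulative threshold named in the abstract, and two searches bring the cumulative probability to about 0.90.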
TU-AB-BRB-01: Coverage Evaluation and Probabilistic Treatment Planning as a Margin Alternative

DOE Office of Scientific and Technical Information (OSTI.GOV)

Siebers, J.

The accepted clinical method to accommodate targeting uncertainties inherent in fractionated external beam radiation therapy is to utilize GTV-to-CTV and CTV-to-PTV margins during the planning process to design a PTV-conformal static dose distribution on the planning image set. Ideally, margins are selected to ensure a high (e.g. >95%) target coverage probability (CP) in spite of inherent inter- and intra-fractional positional variations, tissue motions, and initial contouring uncertainties. Robust optimization techniques, also known as probabilistic treatment planning techniques, explicitly incorporate the dosimetric consequences of targeting uncertainties by including CP evaluation into the planning optimization process along with coverage-based planning objectives. The treatment planner no longer needs to use PTV and/or PRV margins; instead robust optimization utilizes probability distributions of the underlying uncertainties in conjunction with CP-evaluation for the underlying CTVs and OARs to design an optimal treated volume. This symposium will describe CP-evaluation methods as well as various robust planning techniques including use of probability-weighted dose distributions, probability-weighted objective functions, and coverage optimized planning. Methods to compute and display the effect of uncertainties on dose distributions will be presented. The use of robust planning to accommodate inter-fractional setup uncertainties, organ deformation, and contouring uncertainties will be examined as will its use to accommodate intra-fractional organ motion. Clinical examples will be used to inter-compare robust and margin-based planning, highlighting advantages of robust-plans in terms of target and normal tissue coverage. Robust-planning limitations as uncertainties approach zero and as the number of treatment fractions becomes small will be presented, as well as the factors limiting clinical implementation of robust planning.
Learning Objectives: To understand robust-planning as a clinical alternative to using margin-based planning. To understand conceptual differences between uncertainty and predictable motion. To understand fundamental limitations of the PTV concept that probabilistic planning can overcome. To understand the major contributing factors to target and normal tissue coverage probability. To understand the similarities and differences of various robust planning techniques. To understand the benefits and limitations of robust planning techniques.
TU-AB-BRB-03: Coverage-Based Treatment Planning to Accommodate Organ Deformable Motions and Contouring Uncertainties for Prostate Treatment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, H.

The accepted clinical method to accommodate targeting uncertainties inherent in fractionated external beam radiation therapy is to utilize GTV-to-CTV and CTV-to-PTV margins during the planning process to design a PTV-conformal static dose distribution on the planning image set. Ideally, margins are selected to ensure a high (e.g. >95%) target coverage probability (CP) in spite of inherent inter- and intra-fractional positional variations, tissue motions, and initial contouring uncertainties. Robust optimization techniques, also known as probabilistic treatment planning techniques, explicitly incorporate the dosimetric consequences of targeting uncertainties by including CP evaluation into the planning optimization process along with coverage-based planning objectives. The treatment planner no longer needs to use PTV and/or PRV margins; instead robust optimization utilizes probability distributions of the underlying uncertainties in conjunction with CP-evaluation for the underlying CTVs and OARs to design an optimal treated volume. This symposium will describe CP-evaluation methods as well as various robust planning techniques including use of probability-weighted dose distributions, probability-weighted objective functions, and coverage optimized planning. Methods to compute and display the effect of uncertainties on dose distributions will be presented. The use of robust planning to accommodate inter-fractional setup uncertainties, organ deformation, and contouring uncertainties will be examined as will its use to accommodate intra-fractional organ motion. Clinical examples will be used to inter-compare robust and margin-based planning, highlighting advantages of robust-plans in terms of target and normal tissue coverage. Robust-planning limitations as uncertainties approach zero and as the number of treatment fractions becomes small will be presented, as well as the factors limiting clinical implementation of robust planning.
Learning Objectives: To understand robust-planning as a clinical alternative to using margin-based planning. To understand conceptual differences between uncertainty and predictable motion. To understand fundamental limitations of the PTV concept that probabilistic planning can overcome. To understand the major contributing factors to target and normal tissue coverage probability. To understand the similarities and differences of various robust planning techniques. To understand the benefits and limitations of robust planning techniques.
Expected Utility Distributions for Flexible, Contingent Execution

NASA Technical Reports Server (NTRS)

Bresina, John L.; Washington, Richard

2000-01-01

This paper presents a method for using expected utility distributions in the execution of flexible, contingent plans. A utility distribution maps the possible start times of an action to the expected utility of the plan suffix starting with that action. The contingent plan encodes a tree of possible courses of action and includes flexible temporal constraints and resource constraints. When execution reaches a branch point, the eligible option with the highest expected utility at that point in time is selected. The utility distributions make this selection sensitive to the runtime context, yet still efficient. Our approach uses predictions of action duration uncertainty as well as expectations of resource usage and availability to determine when an action can execute and with what probability. Execution windows and probabilities inevitably change as execution proceeds, but such changes do not invalidate the cached utility distributions; thus, dynamic updating of utility information is minimized.
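The branch-point rule in this abstract (pick the eligible option with the highest expected utility at the current time) can be sketched as a lookup into cached, time-indexed utility tables. The `Option` class, the step-function representation of a utility distribution, and the example actions are all hypothetical choices made for illustration; the paper's actual data structures are not given in the abstract.

```python
from bisect import bisect_right
from dataclasses import dataclass

@dataclass
class Option:
    name: str
    window: tuple      # (earliest, latest) permitted start time
    times: list        # ascending breakpoints of the cached utility table
    utilities: list    # expected utility of the plan suffix from each breakpoint onward

    def eligible(self, t):
        return self.window[0] <= t <= self.window[1]

    def expected_utility(self, t):
        # Step-function lookup: use the last breakpoint at or before t
        i = bisect_right(self.times, t) - 1
        return self.utilities[max(i, 0)]

def choose_branch(options, t):
    """Return the eligible option with the highest expected utility at time t, or None."""
    candidates = [o for o in options if o.eligible(t)]
    return max(candidates, key=lambda o: o.expected_utility(t), default=None)

options = [
    Option("drive_to_rock", (0, 50), [0, 20, 40], [10.0, 6.0, 1.0]),
    Option("image_horizon", (10, 80), [10, 60], [5.0, 2.0]),
]
early = choose_branch(options, 15)
late = choose_branch(options, 45)
```

Because the tables are indexed by start time, the same cached values serve whether the branch point is reached early or late, which is the efficiency argument the abstract makes.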
On fitting the Pareto Levy distribution to stock market index data: Selecting a suitable cutoff value

NASA Astrophysics Data System (ADS)

Coronel-Brizio, H. F.; Hernández-Montoya, A. R.

2005-08-01

The so-called Pareto-Levy or power-law distribution has been successfully used as a model to describe probabilities associated with extreme variations of stock market indexes worldwide. The selection of the threshold parameter from empirical data and, consequently, the determination of the exponent of the distribution is often done using a simple graphical method based on a log-log scale, where a power-law probability plot shows a straight line with slope equal to the exponent of the power-law distribution. This procedure can be considered subjective, particularly with regard to the choice of the threshold or cutoff parameter. In this work, a more objective procedure based on a statistical measure of discrepancy between the empirical and the Pareto-Levy distribution is presented. The technique is illustrated for data sets from the New York Stock Exchange (DJIA) and the Mexican Stock Market (IPC).
Peelle's pertinent puzzle using the Monte Carlo technique

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kawano, Toshihiko; Talou, Patrick; Burr, Thomas

2009-01-01

We try to understand the long-standing problem of Peelle's Pertinent Puzzle (PPP) using the Monte Carlo technique. We allow the probability density functions to take any form so that the impact of the distribution can be assessed, and obtain the least-squares solution directly from numerical simulations. We found that the standard least squares method gives the correct answer if a weighting function is properly provided. Results from numerical simulations show that the correct answer of PPP is 1.1 ± 0.25 if the common error is multiplicative. The thought-provoking answer of 0.88 is also correct, if the common error is additive, and if the error is proportional to the measured values. The least squares method correctly gives us the most probable case, where the additive component has a negative value. Finally, the standard method fails for PPP due to a distorted (non Gaussian) joint distribution.
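The answer of 0.88 mentioned in this abstract can be reproduced with ordinary generalized least squares. The sketch below uses the commonly quoted statement of the puzzle, which the abstract itself does not spell out and is therefore an assumption here: two measurements, 1.5 and 1.0, each with a 10% independent error and a fully correlated 20% common error, with the covariance built from the measured values. It does not reproduce the authors' Monte Carlo analysis.

```python
def gls_common_mean(values, cov):
    """Generalized least-squares estimate of a common mean from two correlated measurements."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Components of V^-1 applied to the vector of ones
    w0 = (d - b) / det
    w1 = (a - c) / det
    norm = w0 + w1
    estimate = (w0 * values[0] + w1 * values[1]) / norm
    return estimate, (1.0 / norm) ** 0.5

measurements = [1.5, 1.0]   # assumed puzzle inputs
independent = 0.10          # relative independent error
common = 0.20               # relative fully correlated error

cov = [
    [
        (independent * mi) ** 2 * (i == j) + (common * mi) * (common * mj)
        for j, mj in enumerate(measurements)
    ]
    for i, mi in enumerate(measurements)
]
estimate, sigma = gls_common_mean(measurements, cov)
```

The estimate comes out near 0.88, below both measurements, because the first measurement receives a negative weight once the covariance is scaled by the measured values.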
Visualization of the operational space of edge-localized modes through low-dimensional embedding of probability distributions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shabbir, A., E-mail: aqsa.shabbir@ugent.be; Noterdaeme, J. M.; Max-Planck-Institut für Plasmaphysik, Garching D-85748

2014-11-15

Information visualization aimed at facilitating human perception is an important tool for the interpretation of experiments on the basis of complex multidimensional data characterizing the operational space of fusion devices. This work describes a method for visualizing the operational space on a two-dimensional map and applies it to the discrimination of type I and type III edge-localized modes (ELMs) from a series of carbon-wall ELMy discharges at JET. The approach accounts for stochastic uncertainties that play an important role in fusion data sets, by modeling measurements with probability distributions in a metric space. The method is aimed at contributing to physical understanding of ELMs as well as their control. Furthermore, it is a general method that can be applied to the modeling of various other plasma phenomena as well.
Estimation of descriptive statistics for multiply censored water quality data

USGS Publications Warehouse

Helsel, Dennis R.; Cohn, Timothy A.

1988-01-01

This paper extends the work of Gilliom and Helsel (1986) on procedures for estimating descriptive statistics of water quality data that contain “less than” observations. Previously, procedures were evaluated when only one detection limit was present. Here we investigate the performance of estimators for data that have multiple detection limits. Probability plotting and maximum likelihood methods perform substantially better than simple substitution procedures now commonly in use. Therefore simple substitution procedures (e.g., substitution of the detection limit) should be avoided. Probability plotting methods are more robust than maximum likelihood methods to misspecification of the parent distribution and their use should be encouraged in the typical situation where the parent distribution is unknown. When utilized correctly, less than values frequently contain nearly as much information for estimating population moments and quantiles as would the same observations had the detection limit been below them.
Comparison of probability statistics for automated ship detection in SAR imagery

NASA Astrophysics Data System (ADS)

Henschel, Michael D.; Rey, Maria T.; Campbell, J. W. M.; Petrovic, D.

1998-12-01

This paper discusses the initial results of a recent operational trial of the Ocean Monitoring Workstation's (OMW) ship detection algorithm, which is essentially a Constant False Alarm Rate filter applied to Synthetic Aperture Radar data. The choice of probability distribution and methodologies for calculating scene specific statistics are discussed in some detail. An empirical basis for the choice of probability distribution used is discussed. We compare the results using a l-look, K-distribution function with various parameter choices and methods of estimation. As a special case of sea clutter statistics the application of a χ²-distribution is also discussed. Comparisons are made with reference to RADARSAT data collected during the Maritime Command Operation Training exercise conducted in Atlantic Canadian waters in June 1998. Reference is also made to previously collected statistics. The OMW is a commercial software suite that provides modules for automated vessel detection, oil spill monitoring, and environmental monitoring. This work has been undertaken to fine tune the OMW algorithms, with special emphasis on the false alarm rate of each algorithm.
Estimation of submarine mass failure probability from a sequence of deposits with age dates

USGS Publications Warehouse

Geist, Eric L.; Chaytor, Jason D.; Parsons, Thomas E.; ten Brink, Uri S.

2013-01-01

The empirical probability of submarine mass failure is quantified from a sequence of dated mass-transport deposits. Several different techniques are described to estimate the parameters for a suite of candidate probability models. The techniques, previously developed for analyzing paleoseismic data, include maximum likelihood and Type II (Bayesian) maximum likelihood methods derived from renewal process theory and Monte Carlo methods. The estimated mean return time from these methods, unlike estimates from a simple arithmetic mean of the center age dates and standard likelihood methods, includes the effects of age-dating uncertainty and of open time intervals before the first and after the last event. The likelihood techniques are evaluated using Akaike’s Information Criterion (AIC) and Akaike’s Bayesian Information Criterion (ABIC) to select the optimal model. The techniques are applied to mass transport deposits recorded in two Integrated Ocean Drilling Program (IODP) drill sites located in the Ursa Basin, northern Gulf of Mexico. Dates of the deposits were constrained by regional bio- and magnetostratigraphy from a previous study. Results of the analysis indicate that submarine mass failures in this location occur primarily according to a Poisson process in which failures are independent and return times follow an exponential distribution. However, some of the model results suggest that submarine mass failures may occur quasiperiodically at one of the sites (U1324). The suite of techniques described in this study provides quantitative probability estimates of submarine mass failure occurrence, for any number of deposits and age uncertainty distributions.
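The Poisson case mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a homogeneous Poisson process observed over a fixed window, so the open intervals before the first and after the last event enter through the window length. The ages, the window bounds, and the function names are hypothetical, and age-dating uncertainty (which the study does treat) is ignored here.

```python
import math

def poisson_rate_mle(event_ages, window_start, window_end):
    """MLE of the event rate for a homogeneous Poisson process.

    The likelihood for n events in a window of length T is
    rate**n * exp(-rate * T), so the open intervals at both ends
    of the record are included through T.
    """
    n = len(event_ages)
    total_time = window_end - window_start
    rate = n / total_time
    log_lik = n * math.log(rate) - rate * total_time
    aic = 2 * 1 - 2 * log_lik  # one free parameter (the rate)
    return rate, log_lik, aic

def naive_mean_return_time(event_ages):
    """Arithmetic mean of closed inter-event gaps (ignores open intervals)."""
    ages = sorted(event_ages)
    gaps = [b - a for a, b in zip(ages, ages[1:])]
    return sum(gaps) / len(gaps)

# Hypothetical deposit ages in ka, observed over a 0-70 ka window.
ages_ka = [10.0, 22.0, 31.0, 45.0, 58.0]
rate, log_lik, aic = poisson_rate_mle(ages_ka, 0.0, 70.0)
print("mean return time with open intervals:", 1.0 / rate)
print("naive mean of closed gaps:", naive_mean_return_time(ages_ka))
print("AIC of the exponential model:", aic)
```

With these made-up ages the naive gap mean is shorter than the window-based estimate, which is the direction of bias one would expect when the open intervals are dropped.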
Encoding of low-quality DNA profiles as genotype probability matrices for improved profile comparisons, relatedness evaluation and database searches.

PubMed

Ryan, K; Williams, D Gareth; Balding, David J

2016-11-01

Many DNA profiles recovered from crime scene samples are of a quality that does not allow them to be searched against, nor entered into, databases. We propose a method for the comparison of profiles arising from two DNA samples, one or both of which can have multiple donors and be affected by low DNA template or degraded DNA. We compute likelihood ratios to evaluate the hypothesis that the two samples have a common DNA donor, and hypotheses specifying the relatedness of two donors. Our method uses a probability distribution for the genotype of the donor of interest in each sample. This distribution can be obtained from a statistical model, or we can exploit the ability of trained human experts to assess genotype probabilities, thus extracting much information that would be discarded by standard interpretation rules. Our method is compatible with established methods in simple settings, but is more widely applicable and can make better use of information than many current methods for the analysis of mixed-source, low-template DNA profiles. It can accommodate uncertainty arising from relatedness instead of or in addition to uncertainty arising from noisy genotyping. We describe a computer program GPMDNA, available under an open source licence, to calculate LRs using the method presented in this paper. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Probabilistic performance estimators for computational chemistry methods: The empirical cumulative distribution function of absolute errors

NASA Astrophysics Data System (ADS)

Pernot, Pascal; Savin, Andreas

2018-06-01

Benchmarking studies in computational chemistry use reference datasets to assess the accuracy of a method through error statistics. The commonly used error statistics, such as the mean signed and mean unsigned errors, do not inform end-users on the expected amplitude of prediction errors attached to these methods. We show that, the distributions of model errors being neither normal nor zero-centered, these error statistics cannot be used to infer prediction error probabilities. To overcome this limitation, we advocate for the use of more informative statistics, based on the empirical cumulative distribution function of unsigned errors, namely, (1) the probability for a new calculation to have an absolute error below a chosen threshold and (2) the maximal amplitude of errors one can expect with a chosen high confidence level. Those statistics are also shown to be well suited for benchmarking and ranking studies. Moreover, the standard error on all benchmarking statistics depends on the size of the reference dataset. Systematic publication of these standard errors would be very helpful to assess the statistical reliability of benchmarking conclusions.
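The two statistics advocated in the abstract above can be sketched directly from the empirical cumulative distribution function of absolute errors. This is a minimal illustration only: the error values are invented, the function names are hypothetical, and the quantile uses a simple order-statistic rule rather than any particular estimator from the paper.

```python
import math

def prob_below(errors, threshold):
    """Empirical probability that the absolute error is below `threshold`."""
    abs_err = [abs(e) for e in errors]
    return sum(1 for a in abs_err if a < threshold) / len(abs_err)

def abs_error_quantile(errors, level=0.95):
    """Smallest observed |error| whose ECDF value is at least `level`."""
    abs_err = sorted(abs(e) for e in errors)
    rank = max(math.ceil(level * len(abs_err)), 1)  # 1-based order statistic
    return abs_err[rank - 1]

# Hypothetical signed errors of a method against reference data.
errors = [-0.2, 0.1, 0.4, -1.5, 0.3, -0.1, 0.05, 2.0, -0.6, 0.25]
print("P(|error| < 0.5):", prob_below(errors, 0.5))
print("90% quantile of |error|:", abs_error_quantile(errors, 0.9))
```

Both numbers depend on the size of the reference set, which is why the abstract also asks for standard errors to be reported alongside them.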
Improved Measures of Integrated Information

PubMed Central

Tegmark, Max

2016-01-01

Although there is growing interest in measuring integrated information in computational and cognitive systems, current methods for doing so in practice are computationally unfeasible. Existing and novel integration measures are investigated and classified by various desirable properties. A simple taxonomy of Φ-measures is presented where they are each characterized by their choice of factorization method (5 options), choice of probability distributions to compare (3 × 4 options) and choice of measure for comparing probability distributions (7 options). When requiring the Φ-measures to satisfy a minimum of attractive properties, these hundreds of options reduce to a mere handful, some of which turn out to be identical. Useful exact and approximate formulas are derived that can be applied to real-world data from laboratory experiments without posing unreasonable computational demands. PMID:27870846

A New Self-Constrained Inversion Method of Potential Fields Based on Probability Tomography

NASA Astrophysics Data System (ADS)

Sun, S.; Chen, C.; WANG, H.; Wang, Q.

2014-12-01

The self-constrained inversion method of potential fields uses a priori information self-extracted from potential field data. Unlike external a priori information, the self-extracted information generally consists of parameters derived exclusively from the analysis of the gravity and magnetic data (Paoletti et al., 2013). Here we develop a new self-constrained inversion method based on probability tomography. Probability tomography needs neither a priori information nor large inversion matrix operations. Moreover, its result can describe the sources entirely and clearly, especially those whose distribution is complex and irregular. Therefore, we attempt to use the a priori information extracted from the probability tomography results to constrain the inversion for physical properties. Magnetic anomaly data are taken as an example in this work. The probability tomography result of the magnetic total field anomaly (ΔΤ) shows a smoother distribution than the anomalous source and cannot display the source edges exactly. However, the gradients of ΔΤ have higher resolution than ΔΤ in their own directions, and this characteristic is also present in their probability tomography results. So we use some rules to combine the probability tomography results of ∂ΔΤ⁄∂x, ∂ΔΤ⁄∂y and ∂ΔΤ⁄∂z into a new result which is used for extracting a priori information, and then incorporate the information into the model objective function as spatial weighting functions to invert the final magnetic susceptibility. Synthetic magnetic examples with and without the a priori information extracted from the probability tomography results were compared; the results show that the former are more concentrated and resolve the source body edges better. Finally, the method is applied to an iron mine in China with field-measured ΔΤ data and performs well.
References: Paoletti, V., Ialongo, S., Florio, G., Fedi, M. & Cella, F., 2013. Self-constrained inversion of potential fields, Geophys J Int.
This research is supported by the Fundamental Research Funds for Institute for Geophysical and Geochemical Exploration, Chinese Academy of Geological Sciences (Grant Nos. WHS201210 and WHS201211).
Models of multidimensional discrete distribution of probabilities of random variables in information systems

NASA Astrophysics Data System (ADS)

Gromov, Yu Yu; Minin, Yu V.; Ivanova, O. G.; Morozova, O. N.

2018-03-01

Multidimensional discrete probability distributions of independent random variables were obtained. Their one-dimensional counterparts are widely used in probability theory. Generating functions of those multidimensional distributions were also obtained.
Transmuted of Rayleigh Distribution with Estimation and Application on Noise Signal

NASA Astrophysics Data System (ADS)

Ahmed, Suhad; Qasim, Zainab

2018-05-01

This paper deals with transforming the one-parameter Rayleigh distribution into a transmuted probability distribution by introducing a new parameter (λ), since the studied distribution is needed to represent signal data distributions and failure data models. The transmuted parameter λ, with |λ| ≤ 1, is estimated together with the original parameter (θ) by the method of moments and by maximum likelihood, using different sample sizes (n = 25, 50, 75, 100), and the estimation results are compared using the mean square error (MSE).
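A transmuted distribution with |λ| ≤ 1 is commonly built with the quadratic rank transmutation map, F(x) = (1 + λ)G(x) − λG(x)², where G is the baseline CDF. The sketch below assumes that construction and a Rayleigh baseline parameterised as G(x) = 1 − exp(−x²/(2θ²)); the abstract does not state the parameterisation, so both are assumptions, and the function names are hypothetical.

```python
import math

def rayleigh_cdf(x, theta):
    """Baseline Rayleigh CDF with scale theta (assumed parameterisation)."""
    return 1.0 - math.exp(-x * x / (2.0 * theta * theta))

def transmuted_cdf(x, theta, lam):
    """Quadratic rank transmutation of the Rayleigh CDF, |lam| <= 1."""
    g = rayleigh_cdf(x, theta)
    return (1.0 + lam) * g - lam * g * g

def transmuted_quantile(u, theta, lam):
    """Invert the transmuted CDF for 0 <= u < 1.

    Solves lam*G**2 - (1 + lam)*G + u = 0 for the root G in [0, 1],
    then inverts the Rayleigh CDF.
    """
    if abs(lam) < 1e-12:
        g = u  # lam = 0 recovers the plain Rayleigh distribution
    else:
        disc = (1.0 + lam) ** 2 - 4.0 * lam * u
        g = ((1.0 + lam) - math.sqrt(disc)) / (2.0 * lam)
    return theta * math.sqrt(-2.0 * math.log(1.0 - g))

# Median of the distribution for a few values of the transmutation parameter.
for lam in (-1.0, 0.0, 1.0):
    print(lam, transmuted_quantile(0.5, theta=2.0, lam=lam))
```

The quantile function also gives a simple way to simulate samples of size n by feeding it uniform random numbers, which is one way the estimators could be compared by MSE.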
Dealing with non-unique and non-monotonic response in particle sizing instruments

NASA Astrophysics Data System (ADS)

Rosenberg, Phil

2017-04-01

A number of instruments used as de-facto standards for measuring particle size distributions are actually incapable of uniquely determining the size of an individual particle. This is due to non-unique or non-monotonic response functions. Optical particle counters have a non-monotonic response due to oscillations in the Mie response curves, especially for large aerosol and small cloud droplets. Scanning mobility particle sizers respond identically to two particles where the ratio of particle size to particle charge is approximately the same. Images of two differently sized cloud or precipitation particles taken by an optical array probe can have similar dimensions or shadowed area depending upon where they are in the imaging plane. A number of methods exist to deal with these issues, including assuming that positive and negative errors cancel, smoothing response curves, integrating regions in measurement space before conversion to size space, and matrix inversion. Matrix inversion (also called kernel inversion) has the advantage that it determines the size distribution which best matches the observations, given specific information about the instrument (a matrix which specifies the probability that a particle of a given size will be measured in a given instrument size bin). In this way it maximises use of the information in the measurements. However, this technique can be confused by poor counting statistics, which can cause erroneous results and negative concentrations. Also, an effective method for propagating uncertainties is yet to be published or routinely implemented. Here we present a new alternative which overcomes these issues. 
We use Bayesian methods to determine the probability that a given size distribution is correct given a set of instrument data and then we use Markov Chain Monte Carlo methods to sample this many dimensional probability distribution function to determine the expectation and (co)variances - hence providing a best guess and an uncertainty for the size distribution which includes contributions from the non-unique response curve, counting statistics and can propagate calibration uncertainties.
Probabilistic Models For Earthquakes With Large Return Periods In Himalaya Region

NASA Astrophysics Data System (ADS)

Chaudhary, Chhavi; Sharma, Mukat Lal

2017-12-01

Determination of the frequency of large earthquakes is of paramount importance for seismic risk assessment, as large events contribute a significant fraction of the total deformation, and these long-return-period events with low probability of occurrence are not easily captured by classical distributions. Generally, with a small catalogue, these larger events follow a different distribution function from the smaller and intermediate events. It is thus of special importance to use statistical methods that analyse as closely as possible the range of extreme values, or the tail of the distribution, in addition to the main distribution. The generalised Pareto distribution family is widely used for modelling events that exceed a specified threshold value. The Pareto, Truncated Pareto, and Tapered Pareto are special cases of the generalised Pareto family. In this work, the probability of earthquake occurrence has been estimated using the Pareto, Truncated Pareto, and Tapered Pareto distributions. As a case study, the Himalaya, whose orogeny generates large earthquakes and which is one of the most active zones of the world, has been considered. The whole Himalayan region has been divided into five seismic source zones according to seismotectonics and clustering of events. Estimated probabilities of occurrence of earthquakes have also been compared with the modified Gutenberg-Richter distribution and the characteristic recurrence distribution. The statistical analysis reveals that the Tapered Pareto distribution better describes seismicity for the seismic source zones in comparison to the other distributions considered in the present study.
Confidence Intervals for True Scores Using the Skew-Normal Distribution

ERIC Educational Resources Information Center

Garcia-Perez, Miguel A.

2010-01-01

A recent comparative analysis of alternative interval estimation approaches and procedures has shown that confidence intervals (CIs) for true raw scores determined with the Score method--which uses the normal approximation to the binomial distribution--have actual coverage probabilities that are closest to their nominal level. It has also recently…
Discriminating between Light- and Heavy-Tailed Distributions with Limit Theorem.

PubMed

Burnecki, Krzysztof; Wylomanska, Agnieszka; Chechkin, Aleksei

2015-01-01

In this paper we propose an algorithm to distinguish between light- and heavy-tailed probability laws underlying random datasets. The idea of the algorithm, which is visual and easy to implement, is to check whether the underlying law belongs to the domain of attraction of the Gaussian or non-Gaussian stable distribution by examining its rate of convergence. The method makes it possible to discriminate between stable and various non-stable distributions. The test can differentiate between distributions which appear the same according to the standard Kolmogorov-Smirnov test. In particular, it helps to distinguish between stable and Student's t probability laws as well as between the stable and tempered stable laws, cases which are considered in the literature to be very cumbersome. Finally, we illustrate the procedure on plasma data to identify cases with the so-called L-H transition.
Discriminating between Light- and Heavy-Tailed Distributions with Limit Theorem

PubMed Central

Chechkin, Aleksei

2015-01-01

In this paper we propose an algorithm to distinguish between light- and heavy-tailed probability laws underlying random datasets. The idea of the algorithm, which is visual and easy to implement, is to check whether the underlying law belongs to the domain of attraction of the Gaussian or non-Gaussian stable distribution by examining its rate of convergence. The method makes it possible to discriminate between stable and various non-stable distributions. The test can differentiate between distributions which appear the same according to the standard Kolmogorov–Smirnov test. In particular, it helps to distinguish between stable and Student’s t probability laws as well as between the stable and tempered stable laws, cases which are considered in the literature to be very cumbersome. Finally, we illustrate the procedure on plasma data to identify cases with the so-called L-H transition. PMID:26698863
A robust method to forecast volcanic ash clouds

USGS Publications Warehouse

Denlinger, Roger P.; Pavolonis, Mike; Sieglaff, Justin

2012-01-01

Ash clouds emanating from volcanic eruption columns often form trails of ash extending thousands of kilometers through the Earth's atmosphere, disrupting air traffic and posing a significant hazard to air travel. To mitigate such hazards, the community charged with reducing flight risk must accurately assess risk of ash ingestion for any flight path and provide robust forecasts of volcanic ash dispersal. In response to this need, a number of different transport models have been developed for this purpose and applied to recent eruptions, providing a means to assess uncertainty in forecasts. Here we provide a framework for optimal forecasts and their uncertainties given any model and any observational data. This involves random sampling of the probability distributions of input (source) parameters to a transport model and iteratively running the model with different inputs, each time assessing the predictions that the model makes about ash dispersal by direct comparison with satellite data. The results of these comparisons are embodied in a likelihood function whose maximum corresponds to the minimum misfit between model output and observations. Bayes theorem is then used to determine a normalized posterior probability distribution and from that a forecast of future uncertainty in ash dispersal. The nature of ash clouds in heterogeneous wind fields creates a strong maximum likelihood estimate in which most of the probability is localized to narrow ranges of model source parameters. This property is used here to accelerate probability assessment, producing a method to rapidly generate a prediction of future ash concentrations and their distribution based upon assimilation of satellite data as well as model and data uncertainties. 
Applying this method to the recent eruption of Eyjafjallajökull in Iceland, we show that the 3 and 6 h forecasts of ash cloud location probability encompassed the location of observed satellite-determined ash cloud loads, providing an efficient means to assess all of the hazards associated with these ash clouds.
Star Cluster Properties in Two LEGUS Galaxies Computed with Stochastic Stellar Population Synthesis Models

NASA Astrophysics Data System (ADS)

Krumholz, Mark R.; Adamo, Angela; Fumagalli, Michele; Wofford, Aida; Calzetti, Daniela; Lee, Janice C.; Whitmore, Bradley C.; Bright, Stacey N.; Grasha, Kathryn; Gouliermis, Dimitrios A.; Kim, Hwihyun; Nair, Preethi; Ryon, Jenna E.; Smith, Linda J.; Thilker, David; Ubeda, Leonardo; Zackrisson, Erik

2015-10-01

We investigate a novel Bayesian analysis method, based on the Stochastically Lighting Up Galaxies (slug) code, to derive the masses, ages, and extinctions of star clusters from integrated light photometry. Unlike many analysis methods, slug correctly accounts for incomplete initial mass function (IMF) sampling, and returns full posterior probability distributions rather than simply probability maxima. We apply our technique to 621 visually confirmed clusters in two nearby galaxies, NGC 628 and NGC 7793, that are part of the Legacy Extragalactic UV Survey (LEGUS). LEGUS provides Hubble Space Telescope photometry in the NUV, U, B, V, and I bands. We analyze the sensitivity of the derived cluster properties to choices of prior probability distribution, evolutionary tracks, IMF, metallicity, treatment of nebular emission, and extinction curve. We find that slug's results for individual clusters are insensitive to most of these choices, but that the posterior probability distributions we derive are often quite broad, and sometimes multi-peaked and quite sensitive to the choice of priors. In contrast, the properties of the cluster population as a whole are relatively robust against all of these choices. We also compare our results from slug to those derived with a conventional non-stochastic fitting code, Yggdrasil. We show that slug's stochastic models are generally a better fit to the observations than the deterministic ones used by Yggdrasil. However, the overall properties of the cluster populations recovered by both codes are qualitatively similar.
Statistics of single unit responses in the human medial temporal lobe: A sparse and overdispersed code

NASA Astrophysics Data System (ADS)

Magyar, Andrew

The recent discovery of cells that respond to purely conceptual features of the environment (particular people, landmarks, objects, etc.) in the human medial temporal lobe (MTL) has raised many questions about the nature of the neural code in humans. The goal of this dissertation is to develop a novel statistical method based upon maximum likelihood regression which will then be applied to these experiments in order to produce a quantitative description of the coding properties of the human MTL. In general, the method is applicable to any experiment in which a sequence of stimuli is presented to an organism while the binary responses of a large number of cells are recorded in parallel. The central concept underlying the approach is the total probability that a neuron responds to a random stimulus, called the neuronal sparsity. The model then estimates the distribution of response probabilities across the population of cells. Applying the method to single-unit recordings from the human medial temporal lobe, estimates of the sparsity distributions are acquired in four regions: the hippocampus, the entorhinal cortex, the amygdala, and the parahippocampal cortex. The resulting distributions are found to be sparse (a large fraction of cells with a low response probability) and highly non-uniform, with a large proportion of ultra-sparse neurons that possess a very low response probability, and a smaller population of cells which respond much more frequently. Ramifications of the results are discussed in relation to the sparse coding hypothesis, and comparisons are made between the statistics of the human medial temporal lobe cells and place cells observed in the rodent hippocampus.
Protein single-model quality assessment by feature-based probability density functions.

PubMed

Cao, Renzhi; Cheng, Jianlin

2016-04-04

Protein quality assessment (QA) has played an important role in protein structure prediction. We developed a novel single-model quality assessment method, Qprob. Qprob calculates the absolute error for each protein feature value against the true quality scores (i.e. GDT-TS scores) of protein structural models, and uses them to estimate its probability density distribution for quality assessment. Qprob has been blindly tested on the 11th Critical Assessment of Techniques for Protein Structure Prediction (CASP11) as the MULTICOM-NOVEL server. The official CASP result shows that Qprob ranks as one of the top single-model QA methods. In addition, Qprob makes contributions to our protein tertiary structure predictor MULTICOM, which is officially ranked 3rd out of 143 predictors. The good performance shows that Qprob is good at assessing the quality of models of hard targets. These results demonstrate that this new probability density distribution based method is effective for protein single-model quality assessment and is useful for protein structure prediction. The webserver of Qprob is available at: http://calla.rnet.missouri.edu/qprob/. The software is now freely available in the web server of Qprob.
Latin hypercube approach to estimate uncertainty in ground water vulnerability

USGS Publications Warehouse

Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.

2007-01-01

A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.
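The sampling step described in the abstract above can be sketched as follows. This is a toy illustration, not the authors' model: it assumes a one-variable logistic regression, independent normal distributions for the two coefficients (model error) and for the explanatory variable (data error), and invented means and standard deviations. All names are hypothetical.

```python
import math
import random
from statistics import NormalDist

def latin_hypercube(n_samples, n_dims, rng):
    """Points in the unit hypercube with one draw per stratum in each dimension."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of the n equal-width strata, then shuffle
        # so that strata are paired at random across dimensions.
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

def propagate_uncertainty(n_samples=1000, seed=0):
    """Return (2.5%, 50%, 97.5%) percentiles of the predicted probability."""
    rng = random.Random(seed)
    # Hypothetical input distributions: intercept, slope, explanatory variable.
    dists = (NormalDist(-2.0, 0.3), NormalDist(0.8, 0.1), NormalDist(1.5, 0.2))
    probs = []
    for point in latin_hypercube(n_samples, len(dists), rng):
        # Map each stratified uniform value through the inverse CDF.
        b0, b1, x = (d.inv_cdf(min(max(u, 1e-12), 1.0 - 1e-12))
                     for d, u in zip(dists, point))
        probs.append(1.0 / (1.0 + math.exp(-(b0 + b1 * x))))
    probs.sort()
    last = n_samples - 1
    return (probs[min(int(0.025 * n_samples), last)],
            probs[n_samples // 2],
            probs[min(int(0.975 * n_samples), last)])

low, median, high = propagate_uncertainty()
print("95% prediction interval:", (low, high), "median:", median)
```

Stratifying each input guarantees that the tails of every distribution are sampled even at modest sample sizes, which is the usual reason for preferring LHS over plain random sampling in this kind of propagation.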
Reducing Interpolation Artifacts for Mutual Information Based Image Registration

PubMed Central

Soleimani, H.; Khosravifard, M.A.

2011-01-01

Medical image registration methods which use mutual information as a similarity measure have been improved in recent decades. Mutual information is a basic concept of information theory which indicates the dependency of two random variables (or two images). In order to evaluate the mutual information of two images, their joint probability distribution is required. Several interpolation methods, such as Partial Volume (PV) and bilinear, are used to estimate the joint probability distribution. Both of these methods yield some artifacts in the mutual information function. The Partial Volume-Hanning window (PVH) and Generalized Partial Volume (GPV) methods were introduced to remove such artifacts. In this paper we show that the acceptable performance of these methods is not due to their kernel function but to the number of pixels incorporated in the interpolation. Since using more pixels requires a more complex and time-consuming interpolation process, we propose a new interpolation method which uses only four pixels (the same as PV and bilinear interpolation) and removes most of the artifacts. Experimental results for the registration of Computed Tomography (CT) images show the superiority of the proposed scheme. PMID:22606673
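The dependence of mutual information on the joint probability distribution can be made concrete with a minimal sketch. It estimates the joint distribution from a plain joint histogram of two already-aligned, discretised images given as flat sequences; no interpolation (PV, bilinear, or the proposed scheme) is performed, and the function name and pixel values are hypothetical.

```python
import math
from collections import Counter

def mutual_information(img_a, img_b):
    """Mutual information (in bits) from the joint histogram of two images.

    `img_a` and `img_b` are equal-length sequences of discrete intensities,
    one entry per overlapping pixel.
    """
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of pixels")
    n = len(img_a)
    joint = Counter(zip(img_a, img_b))  # joint histogram counts
    count_a = Counter(img_a)            # marginal counts of image A
    count_b = Counter(img_b)            # marginal counts of image B
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        # p_ab / (p_a * p_b) written with counts to limit rounding.
        mi += p_ab * math.log2(c * n / (count_a[a] * count_b[b]))
    return mi

# Toy 2x2 "images" flattened to lists.
reference = [0, 0, 1, 1]
print(mutual_information(reference, [0, 0, 1, 1]))  # identical images
print(mutual_information(reference, [0, 1, 0, 1]))  # statistically independent
```

In a registration loop this quantity would be re-evaluated for each candidate transform, and it is at that stage that the choice of interpolation shapes the joint histogram.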
Probabilistic Structural Analysis Methods (PSAM) for select space propulsion system structural components

NASA Technical Reports Server (NTRS)

Cruse, T. A.

1987-01-01

The objective is the development of several modular structural analysis packages capable of predicting the probabilistic response distribution for key structural variables such as maximum stress, natural frequencies, transient response, etc. The structural analysis packages are to include stochastic modeling of loads, material properties, geometry (tolerances), and boundary conditions. The solution is to be in terms of the cumulative probability of exceedance distribution (CDF) and confidence bounds. Two methods of probability modeling are to be included as well as three types of structural models - probabilistic finite-element method (PFEM); probabilistic approximate analysis methods (PAAM); and probabilistic boundary element methods (PBEM). The purpose in doing probabilistic structural analysis is to provide the designer with a more realistic ability to assess the importance of uncertainty in the response of a high performance structure. Probabilistic Structural Analysis Method (PSAM) tools will estimate structural safety and reliability, while providing the engineer with information on the confidence that should be given to the predicted behavior. Perhaps most critically, the PSAM results will directly provide information on the sensitivity of the design response to those variables which are seen to be uncertain.
Probabilistic Structural Analysis Methods for select space propulsion system structural components (PSAM)

NASA Technical Reports Server (NTRS)

Cruse, T. A.; Burnside, O. H.; Wu, Y.-T.; Polch, E. Z.; Dias, J. B.

1988-01-01

The objective is the development of several modular structural analysis packages capable of predicting the probabilistic response distribution for key structural variables such as maximum stress, natural frequencies, transient response, etc. The structural analysis packages are to include stochastic modeling of loads, material properties, geometry (tolerances), and boundary conditions. The solution is to be in terms of the cumulative probability of exceedance distribution (CDF) and confidence bounds. Two methods of probability modeling are to be included as well as three types of structural models - probabilistic finite-element method (PFEM); probabilistic approximate analysis methods (PAAM); and probabilistic boundary element methods (PBEM). The purpose in doing probabilistic structural analysis is to provide the designer with a more realistic ability to assess the importance of uncertainty in the response of a high performance structure. Probabilistic Structural Analysis Method (PSAM) tools will estimate structural safety and reliability, while providing the engineer with information on the confidence that should be given to the predicted behavior. Perhaps most critically, the PSAM results will directly provide information on the sensitivity of the design response to those variables which are seen to be uncertain.
Assessment of Group Preferences and Group Uncertainty for Decision Making

DTIC Science & Technology

1976-06-01

the individuals. decision making, group judgments should be preferred to individual judgments if obtaining group judgments costs more... decision making group. IV. A. 3. Aggregation using conjugate distribution. Another procedure for combining individual probability judgments into a group... statisticized group; group decision making; group judgment; subjective probability; Delphi method; expected utility; nominal group. 20. ABSTRACT (Continue on
A statistical method for estimating rates of soil development and ages of geologic deposits: A design for soil-chronosequence studies

USGS Publications Warehouse

Switzer, P.; Harden, J.W.; Mark, R.K.

1988-01-01

A statistical method for estimating rates of soil development in a given region based on calibration from a series of dated soils is used to estimate ages of soils in the same region that are not dated directly. The method is designed specifically to account for sampling procedures and uncertainties that are inherent in soil studies. Soil variation and measurement error, uncertainties in calibration dates and their relation to the age of the soil, and the limited number of dated soils are all considered. Maximum likelihood (ML) is employed to estimate a parametric linear calibration curve, relating soil development to time or age on suitably transformed scales. Soil variation on a geomorphic surface of a certain age is characterized by replicate sampling of soils on each surface; such variation is assumed to have a Gaussian distribution. The age of a geomorphic surface is described by older and younger bounds. This technique allows age uncertainty to be characterized by either a Gaussian distribution or by a triangular distribution using minimum, best-estimate, and maximum ages. The calibration curve is taken to be linear after suitable (in certain cases logarithmic) transformations, if required, of the soil parameter and age variables. Soil variability, measurement error, and departures from linearity are described in a combined fashion using Gaussian distributions with variances particular to each sampled geomorphic surface and the number of sample replicates. Uncertainty in age of a geomorphic surface used for calibration is described using three parameters by one of two methods. In the first method, upper and lower ages are specified together with a coverage probability; this specification is converted to a Gaussian distribution with the appropriate mean and variance. 
In the second method, "absolute" older and younger ages are specified together with a most probable age; this specification is converted to an asymmetric triangular distribution with mode at the most probable age. The statistical variability of the ML-estimated calibration curve is assessed by a Monte Carlo method in which simulated data sets repeatedly are drawn from the distributional specification; calibration parameters are reestimated for each such simulation in order to assess their statistical variability. Several examples are used for illustration. The age of undated soils in a related setting may be estimated from the soil data using the fitted calibration curve. A second simulation to assess age estimate variability is described and applied to the examples. © 1988 International Association for Mathematical Geology.
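The two age-uncertainty specifications and the Monte Carlo refitting described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the surface data in `SURFACES` are hypothetical, the Gaussian conversion assumes bounds symmetric about the mean, and ordinary least squares on log10(age) stands in for the maximum likelihood fit.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def gaussian_from_bounds(lower, upper, coverage):
    """First method: turn (lower, upper, coverage) into a Gaussian mean and sd.

    Assumes the bounds sit symmetrically about the mean (illustrative choice).
    """
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    return 0.5 * (lower + upper), (upper - lower) / (2.0 * z)


def ols_fit(xs, ys):
    """Least-squares slope and intercept (a stand-in for the ML calibration fit)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical calibration surfaces:
# (youngest age, most probable age, oldest age, mean soil index, sd of soil index)
SURFACES = [
    (8e3, 1e4, 1.5e4, 10.0, 1.0),
    (3e4, 4e4, 7e4, 16.0, 1.5),
    (1e5, 1.3e5, 2e5, 21.0, 2.0),
    (4e5, 6e5, 8e5, 28.0, 2.0),
]


def simulate_slopes(n_sim=500, seed=0):
    """Redraw ages and soil values, refit the line, and collect the slopes."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_sim):
        # Second method: asymmetric triangular age with mode at the most probable age.
        log_ages = [math.log10(rng.triangular(lo, hi, mode))
                    for lo, mode, hi, _, _ in SURFACES]
        # Soil variation on each surface is taken to be Gaussian.
        index = [rng.gauss(m, s) for _, _, _, m, s in SURFACES]
        slopes.append(ols_fit(log_ages, index)[0])
    return slopes


slopes = simulate_slopes()
age_mean, age_sd = gaussian_from_bounds(80e3, 120e3, 0.95)
print(f"calibration slope: mean={mean(slopes):.2f}, sd={stdev(slopes):.2f}")
print(f"Gaussian age spec: mean={age_mean:.0f}, sd={age_sd:.0f}")
```

The spread of `slopes` plays the role of the statistical variability of the calibration parameters; a second loop of the same kind could propagate that spread into age estimates for undated soils.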
Mean Excess Function as a method of identifying sub-exponential tails: Application to extreme daily rainfall

NASA Astrophysics Data System (ADS)

Nerantzaki, Sofia; Papalexiou, Simon Michael

2017-04-01

Identifying precisely the distribution tail of a geophysical variable is difficult, or even impossible. First, the tail is the part of the distribution for which we have the least empirical information available; second, a universally accepted definition of tail does not and cannot exist; and third, a tail may change over time due to long-term changes. Unfortunately, the tail is the most important part of the distribution as it dictates the estimates of exceedance probabilities or return periods. Fortunately, based on their tail behavior, probability distributions can be generally categorized into two major families, i.e., sub-exponentials (heavy-tailed) and hyper-exponentials (light-tailed). This study aims to update the Mean Excess Function (MEF), providing a useful tool to assess which type of tail better describes empirical data. The MEF is based on the mean value of a variable over a threshold and results in a zero-slope regression line when applied to the Exponential distribution. Here, we construct slope confidence intervals for the Exponential distribution as functions of sample size. The validation of the method using Monte Carlo techniques on four theoretical distributions covering major tail cases (Pareto type II, Log-normal, Weibull and Gamma) revealed that it performs well, especially for large samples. Finally, the method is used to investigate the behavior of daily rainfall extremes; thousands of rainfall records were examined, from all over the world and with sample sizes over 100 years, revealing that heavy-tailed distributions describe rainfall extremes more accurately.
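The idea of a zero-slope mean excess line for the Exponential distribution can be illustrated with a short sketch. The threshold grid, the sample sizes, and the use of a plain least-squares slope are assumptions made here for illustration; the confidence intervals constructed in the study are not reproduced.

```python
import random


def mean_excess(sample, threshold):
    """Average amount by which values above the threshold exceed it."""
    excess = [x - threshold for x in sample if x > threshold]
    return sum(excess) / len(excess)


def mef_slope(sample, n_thresholds=20, max_quantile=0.9):
    """Least-squares slope of the empirical mean excess function.

    Thresholds are empirical quantiles below max_quantile (an illustrative choice).
    """
    xs = sorted(sample)
    last = len(xs) - 1
    thresholds = [xs[int(max_quantile * i / n_thresholds * last)]
                  for i in range(n_thresholds)]
    mef = [mean_excess(xs, u) for u in thresholds]
    mu, me = sum(thresholds) / n_thresholds, sum(mef) / n_thresholds
    sxx = sum((u - mu) ** 2 for u in thresholds)
    sxy = sum((u - mu) * (e - me) for u, e in zip(thresholds, mef))
    return sxy / sxx


rng = random.Random(1)
exp_sample = [rng.expovariate(1.0) for _ in range(20000)]
# Pareto type II (Lomax) with shape 3: paretovariate(3) shifted to start at 0.
lomax_sample = [rng.paretovariate(3.0) - 1.0 for _ in range(20000)]

slope_exp = mef_slope(exp_sample)
slope_lomax = mef_slope(lomax_sample)
print(f"exponential slope ~ {slope_exp:.3f}, Pareto II slope ~ {slope_lomax:.3f}")
```

A slope near zero is consistent with an exponential tail, while a clearly positive slope points toward a heavy (sub-exponential) tail.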
Mean, covariance, and effective dimension of stochastic distributed delay dynamics

NASA Astrophysics Data System (ADS)

René, Alexandre; Longtin, André

2017-11-01

Dynamical models are often required to incorporate both delays and noise. However, the inherently infinite-dimensional nature of delay equations makes formal solutions to stochastic delay differential equations (SDDEs) challenging. Here, we present an approach, similar in spirit to the analysis of functional differential equations, but based on finite-dimensional matrix operators. This results in a method for obtaining both transient and stationary solutions that is directly amenable to computation, and applicable to first order differential systems with either discrete or distributed delays. With fewer assumptions on the system's parameters than other current solution methods and no need to be near a bifurcation, we decompose the solution to a linear SDDE with arbitrary distributed delays into natural modes, in effect the eigenfunctions of the differential operator, and show that relatively few modes can suffice to approximate the probability density of solutions. Thus, we are led to conclude that noise makes these SDDEs effectively low dimensional, which opens the possibility of practical definitions of probability densities over their solution space.

Target Tracking Using SePDAF under Ambiguous Angles for Distributed Array Radar.

PubMed

Long, Teng; Zhang, Honggang; Zeng, Tao; Chen, Xinliang; Liu, Quanhua; Zheng, Le

2016-09-09

Distributed array radar can improve radar detection capability and measurement accuracy. However, it suffers cyclic ambiguity in its angle estimates according to the spatial Nyquist sampling theorem, since the large sparse array is undersampled. Consequently, the state estimation accuracy and track validity probability degrade when the ambiguous angles are directly used for target tracking. This paper proposes a second probability data association filter (SePDAF)-based tracking method for distributed array radar. Firstly, the target motion model and radar measurement model are built. Secondly, the fusion result of each radar's estimation is fed to the extended Kalman filter (EKF) to complete the first filtering. Thirdly, taking this result as prior knowledge and associating it with the array-processed ambiguous angles, the SePDAF is applied to accomplish the second filtering, thereby achieving a highly accurate and stable trajectory with relatively low computational complexity. Moreover, the azimuth filtering accuracy is improved dramatically and the position filtering accuracy also improves. Finally, simulations illustrate the effectiveness of the proposed method.
Quantifying Safety Margin Using the Risk-Informed Safety Margin Characterization (RISMC)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grabaskas, David; Bucknor, Matthew; Brunett, Acacia

2015-04-26

The Risk-Informed Safety Margin Characterization (RISMC), developed by Idaho National Laboratory as part of the Light-Water Reactor Sustainability Project, utilizes a probabilistic safety margin comparison between a load and capacity distribution, rather than a deterministic comparison between two values, as is usually done in best-estimate plus uncertainty analyses. The goal is to determine the failure probability, or in other words, the probability of the system load equaling or exceeding the system capacity. While this method has been used in pilot studies, there has been little work conducted investigating the statistical significance of the resulting failure probability. In particular, it is difficult to determine how many simulations are necessary to properly characterize the failure probability. This work uses classical (frequentist) statistics and confidence intervals to examine the impact in statistical accuracy when the number of simulations is varied. Two methods are proposed to establish confidence intervals related to the failure probability established using a RISMC analysis. The confidence interval provides information about the statistical accuracy of the method utilized to explore the uncertainty space, and offers a quantitative method to gauge the increase in statistical accuracy due to performing additional simulations.
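A rough sketch of how a simulated failure probability and its confidence interval depend on the number of simulations is given below. The Gaussian load and capacity distributions and their parameters are hypothetical, and the Wilson score interval is one common frequentist choice used here for illustration; the abstract does not say which two interval methods the authors propose.

```python
import math
import random
from statistics import NormalDist

# Hypothetical load and capacity distributions as (mean, standard deviation).
LOAD = (100.0, 10.0)
CAPACITY = (120.0, 12.0)


def wilson_interval(k, n, confidence=0.95):
    """Wilson score interval for a binomial proportion k / n."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p = k / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def failure_probability(n_sim, seed=0):
    """Estimate P(load >= capacity) by simulation, with a confidence interval."""
    rng = random.Random(seed)
    failures = sum(rng.gauss(*LOAD) >= rng.gauss(*CAPACITY) for _ in range(n_sim))
    return failures / n_sim, wilson_interval(failures, n_sim)


# Exact value for this Gaussian toy case, for comparison only.
P_EXACT = NormalDist().cdf(
    (LOAD[0] - CAPACITY[0]) / math.hypot(LOAD[1], CAPACITY[1]))

p_small, ci_small = failure_probability(100)
p_large, ci_large = failure_probability(10000)
print(f"n=100:   p={p_small:.3f}, CI=({ci_small[0]:.3f}, {ci_small[1]:.3f})")
print(f"n=10000: p={p_large:.3f}, CI=({ci_large[0]:.3f}, {ci_large[1]:.3f})")
print(f"exact:   p={P_EXACT:.3f}")
```

Comparing the two interval widths gives a direct, quantitative sense of what additional simulations buy in statistical accuracy.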
Powerlaw: a Python package for analysis of heavy-tailed distributions.

PubMed

Alstott, Jeff; Bullmore, Ed; Plenz, Dietmar

2014-01-01

Power laws are theoretically interesting probability distributions that are also frequently used to describe empirical data. In recent years, effective statistical methods for fitting power laws have been developed, but appropriate use of these techniques requires significant programming and statistical insight. In order to greatly decrease the barriers to using good statistical methods for fitting power law distributions, we developed the powerlaw Python package. This software package provides easy commands for basic fitting and statistical analysis of distributions. Notably, it also seeks to support a variety of user needs by being exhaustive in the options available to the user. The source code is publicly available and easily extensible.
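For orientation, the core estimate behind fitting a continuous power law can be written in a few lines. The sketch below is not the powerlaw package's code or API; it is the standard maximum likelihood estimator for the exponent with a known lower cutoff, and `fit_power_law_alpha` is a hypothetical helper name.

```python
import math
import random


def fit_power_law_alpha(data, xmin):
    """Maximum likelihood exponent for p(x) ~ x**(-alpha), x >= xmin.

    Returns (alpha, approximate standard error, number of tail points).
    Choosing xmin from the data is a separate step not covered here.
    """
    tail = [x for x in data if x >= xmin]
    n = len(tail)
    alpha = 1.0 + n / sum(math.log(x / xmin) for x in tail)
    return alpha, (alpha - 1.0) / math.sqrt(n), n


rng = random.Random(42)
# paretovariate(1.5) has survival function x**-1.5, i.e. density exponent 2.5.
sample = [rng.paretovariate(1.5) for _ in range(20000)]
alpha_hat, alpha_se, n_tail = fit_power_law_alpha(sample, xmin=1.0)
print(f"alpha = {alpha_hat:.3f} +/- {alpha_se:.3f} from {n_tail} points")
```

A full analysis would also estimate the lower cutoff and compare the power law against alternatives such as the lognormal, which is the kind of task the package is described as automating.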
Measurement of 240Pu Angular Momentum Dependent Fission Probabilities Using the (α ,α') Reaction

NASA Astrophysics Data System (ADS)

Koglin, Johnathon; Burke, Jason; Fisher, Scott; Jovanovic, Igor

2017-09-01

The surrogate reaction method often lacks the theoretical framework and necessary experimental data to constrain models, especially when reconciling differences in angular momentum state between the desired and surrogate reactions. In this work, dual arrays of silicon telescope particle identification detectors and photovoltaic (solar) cell fission fragment detectors have been used to measure the fission probability of the 240Pu(α ,α' f) reaction - a surrogate for the 239Pu(n , f) reaction - and fission fragment angular distributions. Fission probability measurements were performed at a beam energy of 35.9(2) MeV at eleven scattering angles from 40° to 140° in 10° intervals and at nuclear excitation energies up to 16 MeV. Fission fragment angular distributions were measured in six bins from 4.5 MeV to 8.0 MeV and fit to expected distributions dependent on the vibrational and rotational excitations at the saddle point. In this way, the contributions to the total fission probability from specific states of K angular momentum projection on the symmetry axis are extracted. A sizable data collection is presented to be considered when constraining microscopic cross section calculations.
Predicting the cosmological constant with the scale-factor cutoff measure

DOE Office of Scientific and Technical Information (OSTI.GOV)

De Simone, Andrea; Guth, Alan H.; Salem, Michael P.

2008-09-15

It is well known that anthropic selection from a landscape with a flat prior distribution of cosmological constant Λ gives a reasonable fit to observation. However, a realistic model of the multiverse has a physical volume that diverges with time, and the predicted distribution of Λ depends on how the spacetime volume is regulated. A very promising method of regulation uses a scale-factor cutoff, which avoids a number of serious problems that arise in other approaches. In particular, the scale-factor cutoff avoids the 'youngness problem' (high probability of living in a much younger universe) and the 'Q and G catastrophes' (high probability for the primordial density contrast Q and gravitational constant G to have extremely large or small values). We apply the scale-factor cutoff measure to the probability distribution of Λ, considering both positive and negative values. The results are in good agreement with observation. In particular, the scale-factor cutoff strongly suppresses the probability for values of Λ that are more than about 10 times the observed value. We also discuss qualitatively the prediction for the density parameter Ω, indicating that with this measure there is a possibility of detectable negative curvature.
A Bayesian approach to modeling 2D gravity data using polygon states

NASA Astrophysics Data System (ADS)

Titus, W. J.; Titus, S.; Davis, J. R.

2015-12-01

We present a Bayesian Markov chain Monte Carlo (MCMC) method for the 2D gravity inversion of a localized subsurface object with constant density contrast. Our models have four parameters: the density contrast, the number of vertices in a polygonal approximation of the object, an upper bound on the ratio of the perimeter squared to the area, and the vertices of a polygon container that bounds the object. Reasonable parameter values can be estimated prior to inversion using a forward model and geologic information. In addition, we assume that the field data have a common random uncertainty that lies between two bounds but that it has no systematic uncertainty. Finally, we assume that there is no uncertainty in the spatial locations of the measurement stations. For any set of model parameters, we use MCMC methods to generate an approximate probability distribution of polygons for the object. We then compute various probability distributions for the object, including the variance between the observed and predicted fields (an important quantity in the MCMC method), the area, the center of area, and the occupancy probability (the probability that a spatial point lies within the object). In addition, we compare probabilities of different models using parallel tempering, a technique which also mitigates trapping in local optima that can occur in certain model geometries. We apply our method to several synthetic data sets generated from objects of varying shape and location. We also analyze a natural data set collected across the Rio Grande Gorge Bridge in New Mexico, where the object (i.e. the air below the bridge) is known and the canyon is approximately 2D. Although there are many ways to view results, the occupancy probability proves quite powerful. We also find that the choice of the container is important. In particular, large containers should be avoided, because the more closely a container confines the object, the better the predictions match properties of the object.
Probability Density Functions of Observed Rainfall in Montana

NASA Technical Reports Server (NTRS)

Larsen, Scott D.; Johnson, L. Ronald; Smith, Paul L.

1995-01-01

The question of whether a rain rate probability density function (PDF) can vary uniformly between precipitation events is examined. Image analysis on large samples of radar echoes is possible because of advances in technology. The data provided by such an analysis easily allow development of distributions of radar reflectivity factor (and by extension rain rate). Finding a PDF becomes a matter of finding a function that describes the curve approximating the resulting distributions. Ideally, one PDF would exist for all cases, or many PDFs would exist that have the same functional form with only systematic variations in parameters (such as size or shape). Satisfying either of these cases will validate the theoretical basis of the Area Time Integral (ATI). Using the method of moments and Elderton's curve selection criteria, the Pearson Type 1 equation was identified as a potential fit for 89 percent of the observed distributions. Further analysis indicates that the Type 1 curve does approximate the shape of the distributions but quantitatively does not produce a great fit.
Normal probability plots with confidence.

PubMed

Chantarangsi, Wanpen; Liu, Wei; Bretz, Frank; Kiatsupaibul, Seksan; Hayter, Anthony J; Wan, Fang

2015-01-01

Normal probability plots are widely used as a statistical tool for assessing whether an observed simple random sample is drawn from a normally distributed population. The users, however, have to judge subjectively, if no objective rule is provided, whether the plotted points fall close to a straight line. In this paper, we focus on how a normal probability plot can be augmented by intervals for all the points so that, if the population distribution is normal, then all the points should fall into the corresponding intervals simultaneously with probability 1-α. These simultaneous 1-α probability intervals therefore provide an objective means to judge whether the plotted points fall close to the straight line: the plotted points fall close to the straight line if and only if all the points fall into the corresponding intervals. The powers of several normal probability plot based (graphical) tests and the most popular nongraphical Anderson-Darling and Shapiro-Wilk tests are compared by simulation. Based on this comparison, recommendations are given in Section 3 on which graphical tests should be used in what circumstances. An example is provided to illustrate the methods. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
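One simple way to obtain simultaneous intervals for a normal probability plot is to calibrate a constant-width band by simulation, as sketched below. This is an illustrative construction, not necessarily any of the graphical tests compared in the paper; the Blom plotting positions, the band shape, and the function names are assumptions.

```python
import random
from statistics import NormalDist, mean, stdev


def standardized_order_stats(sample):
    """Sorted sample after subtracting the mean and dividing by the sd."""
    m, s = mean(sample), stdev(sample)
    return sorted((x - m) / s for x in sample)


def normal_scores(n):
    """Approximate expected normal order statistics (Blom plotting positions)."""
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def simultaneous_half_width(n, alpha=0.05, n_sim=500, seed=0):
    """Half-width c such that, under normality, all n points lie within
    score +/- c simultaneously with probability of about 1 - alpha."""
    rng = random.Random(seed)
    scores = normal_scores(n)
    max_dev = []
    for _ in range(n_sim):
        z = standardized_order_stats([rng.gauss(0.0, 1.0) for _ in range(n)])
        max_dev.append(max(abs(a - b) for a, b in zip(z, scores)))
    max_dev.sort()
    return max_dev[min(n_sim - 1, int((1 - alpha) * n_sim))]


def passes_band(sample, half_width):
    """True if every plotted point falls inside its interval."""
    z = standardized_order_stats(sample)
    return all(abs(a - b) <= half_width
               for a, b in zip(z, normal_scores(len(sample))))


n = 200
half_width = simultaneous_half_width(n)
rng = random.Random(7)
exp_sample = [rng.expovariate(1.0) for _ in range(n)]
print(f"half-width = {half_width:.3f}")
print("exponential sample inside band:", passes_band(exp_sample, half_width))
```

Because the band is calibrated on the maximum deviation across all points, the accept/reject rule is "all points inside" rather than a point-by-point judgment.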
Integrated seismic stochastic inversion and multi-attributes to delineate reservoir distribution: Case study MZ fields, Central Sumatra Basin

NASA Astrophysics Data System (ADS)

Haris, A.; Novriyani, M.; Suparno, S.; Hidayat, R.; Riyanto, A.

2017-07-01

This study presents the integration of seismic stochastic inversion and multi-attribute analysis for delineating the reservoir distribution in terms of lithology and porosity in the formation within the depth interval between the Top Sihapas and Top Pematang. The method used is a stochastic inversion integrated with multi-attribute seismic analysis by applying a Probabilistic Neural Network (PNN). Stochastic methods are used to predict the sandstone probability map from impedance varied over 50 realizations, which produces a good probability estimate. Stochastic seismic inversion is more interpretive because it directly gives the value of the property. Our experiment shows that the AI of stochastic inversion provides more diverse uncertainty, so that the probability value will be close to the actual values. The produced AI is then used as input to a multi-attribute analysis, which is used to predict the gamma ray, density and porosity logs. To determine the number of attributes to use, a stepwise regression algorithm is applied. The resulting attributes are used in the PNN process. The PNN method is chosen because it has the best correlation among the neural network methods. Finally, we interpret the products of the multi-attribute analysis, in the form of pseudo-gamma ray volume, density volume and pseudo-porosity volume, to delineate the reservoir distribution. Our interpretation shows that a structural trap is identified in the southeastern part of the study area, along the anticline.
Assignment of functional activations to probabilistic cytoarchitectonic areas revisited.

PubMed

Eickhoff, Simon B; Paus, Tomas; Caspers, Svenja; Grosbras, Marie-Helene; Evans, Alan C; Zilles, Karl; Amunts, Katrin

2007-07-01

Probabilistic cytoarchitectonic maps in standard reference space provide a powerful tool for the analysis of structure-function relationships in the human brain. While these microstructurally defined maps have already been successfully used in the analysis of somatosensory, motor or language functions, several conceptual issues in the analysis of structure-function relationships still demand further clarification. In this paper, we demonstrate the principal approaches for anatomical localisation of functional activations based on probabilistic cytoarchitectonic maps by exemplary analysis of an anterior parietal activation evoked by visual presentation of hand gestures. After consideration of the conceptual basis and implementation of volume or local maxima labelling, we comment on some potential interpretational difficulties, limitations and caveats that could be encountered. Extending and supplementing these methods, we then propose a supplementary approach for quantification of structure-function correspondences based on distribution analysis. This approach relates the cytoarchitectonic probabilities observed at a particular functionally defined location to the areal specific null distribution of probabilities across the whole brain (i.e., the full probability map). Importantly, this method avoids the need for a unique classification of voxels to a single cortical area and may increase the comparability between results obtained for different areas. Moreover, as distribution-based labelling quantifies the "central tendency" of an activation with respect to anatomical areas, it will, in combination with the established methods, allow an advanced characterisation of the anatomical substrates of functional activations. Finally, the advantages and disadvantages of the various methods are discussed, focussing on the question of which approach is most appropriate for a particular situation.
Weighing Clinical Evidence Using Patient Preferences: An Application of Probabilistic Multi-Criteria Decision Analysis.

PubMed

Broekhuizen, Henk; IJzerman, Maarten J; Hauber, A Brett; Groothuis-Oudshoorn, Catharina G M

2017-03-01

The need for patient engagement has been recognized by regulatory agencies, but there is no consensus about how to operationalize this. One approach is the formal elicitation and use of patient preferences for weighing clinical outcomes. The aim of this study was to demonstrate how patient preferences can be used to weigh clinical outcomes when both preferences and clinical outcomes are uncertain by applying a probabilistic value-based multi-criteria decision analysis (MCDA) method. Probability distributions were used to model random variation and parameter uncertainty in preferences, and parameter uncertainty in clinical outcomes. The posterior value distributions and rank probabilities for each treatment were obtained using Monte-Carlo simulations. The probability of achieving the first rank is the probability that a treatment represents the highest value to patients. We illustrated our methodology for a simplified case on six HIV treatments. Preferences were modeled with normal distributions and clinical outcomes were modeled with beta distributions. The treatment value distributions showed the rank order of treatments according to patients and illustrate the remaining decision uncertainty. This study demonstrated how patient preference data can be used to weigh clinical evidence using MCDA. The model takes into account uncertainty in preferences and clinical outcomes. The model can support decision makers during the aggregation step of the MCDA process and provides a first step toward preference-based personalized medicine, yet requires further testing regarding its appropriate use in real-world settings.
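The Monte Carlo step that produces rank probabilities can be sketched as follows. All numbers are hypothetical: three treatments and two criteria stand in for the six HIV treatments of the case study, and truncating negative weight draws at zero and renormalizing is an assumption made here, not a detail taken from the abstract.

```python
import random

# Hypothetical Beta(a, b) parameters for each treatment's clinical outcomes.
TREATMENTS = {
    "A": {"efficacy": (80, 20), "tolerability": (70, 30)},
    "B": {"efficacy": (60, 40), "tolerability": (85, 15)},
    "C": {"efficacy": (50, 50), "tolerability": (50, 50)},
}
# Hypothetical normal (mean, sd) for the patient preference weights.
WEIGHTS = {"efficacy": (0.6, 0.1), "tolerability": (0.4, 0.1)}


def first_rank_probabilities(n_sim=5000, seed=0):
    """Fraction of simulations in which each treatment has the highest value."""
    rng = random.Random(seed)
    wins = dict.fromkeys(TREATMENTS, 0)
    for _ in range(n_sim):
        # Draw preference weights, truncate at zero, and normalize to sum to 1.
        raw = {c: max(rng.gauss(mu, sd), 0.0) for c, (mu, sd) in WEIGHTS.items()}
        total = sum(raw.values()) or 1.0
        # Weighted sum of sampled clinical outcomes gives each treatment's value.
        values = {
            name: sum(raw[c] / total * rng.betavariate(*params[c]) for c in raw)
            for name, params in TREATMENTS.items()
        }
        wins[max(values, key=values.get)] += 1
    return {name: count / n_sim for name, count in wins.items()}


rank_probs = first_rank_probabilities()
for name, p in rank_probs.items():
    print(f"P({name} ranks first) = {p:.3f}")
```

A first-rank probability well below 1 for the leading treatment is one way to express the remaining decision uncertainty the abstract refers to.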
A computational framework to empower probabilistic protein design

PubMed Central

Fromer, Menachem; Yanover, Chen

2008-01-01

Motivation: The task of engineering a protein to perform a target biological function is known as protein design. A commonly used paradigm casts this functional design problem as a structural one, assuming a fixed backbone. In probabilistic protein design, positional amino acid probabilities are used to create a random library of sequences to be simultaneously screened for biological activity. Clearly, certain choices of probability distributions will be more successful in yielding functional sequences. However, since the number of sequences is exponential in protein length, computational optimization of the distribution is difficult. Results: In this paper, we develop a computational framework for probabilistic protein design following the structural paradigm. We formulate the distribution of sequences for a structure using the Boltzmann distribution over their free energies. The corresponding probabilistic graphical model is constructed, and we apply belief propagation (BP) to calculate marginal amino acid probabilities. We test this method on a large structural dataset and demonstrate the superiority of BP over previous methods. Nevertheless, since the results obtained by BP are far from optimal, we thoroughly assess the paradigm using high-quality experimental data. We demonstrate that, for small scale sub-problems, BP attains identical results to those produced by exact inference on the paradigmatic model. However, quantitative analysis shows that the distributions predicted significantly differ from the experimental data. These findings, along with the excellent performance we observed using BP on the smaller problems, suggest potential shortcomings of the paradigm. We conclude with a discussion of how it may be improved in the future. Contact: fromer@cs.huji.ac.il PMID:18586717
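To make the Boltzmann formulation concrete, the sketch below computes positional amino acid marginals by exact enumeration on a toy problem, which is feasible only because the sequence space is tiny. The three-letter alphabet, the energy terms, and the temperature are hypothetical; belief propagation, which the paper applies to realistic sizes, is not implemented here.

```python
import itertools
import math

ALPHABET = "AKL"  # toy alphabet; real designs use 20 amino acids
# Hypothetical single-position energies for a 3-residue design.
SINGLE = [
    {"A": 0.0, "K": 1.0, "L": 0.5},
    {"A": 0.8, "K": 0.0, "L": 0.4},
    {"A": 0.3, "K": 0.9, "L": 0.0},
]


def pair_energy(a, b):
    """Hypothetical pairwise term: penalize identical neighbouring residues."""
    return 0.7 if a == b else 0.0


def energy(seq):
    """Total energy: single-position terms plus neighbour interactions."""
    e = sum(SINGLE[i][aa] for i, aa in enumerate(seq))
    e += sum(pair_energy(seq[i], seq[i + 1]) for i in range(len(seq) - 1))
    return e


def boltzmann_marginals(kT=0.6):
    """Positional marginals under p(seq) proportional to exp(-E(seq) / kT)."""
    seqs = list(itertools.product(ALPHABET, repeat=len(SINGLE)))
    weights = [math.exp(-energy(s) / kT) for s in seqs]
    z = sum(weights)  # partition function
    marginals = [dict.fromkeys(ALPHABET, 0.0) for _ in SINGLE]
    for s, w in zip(seqs, weights):
        for i, aa in enumerate(s):
            marginals[i][aa] += w / z
    return marginals


marginals = boltzmann_marginals()
for i, m in enumerate(marginals):
    print(i, {aa: round(p, 3) for aa, p in m.items()})
```

The number of sequences grows exponentially with length, which is why exact enumeration only serves as a reference for small sub-problems.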
VizieR Online Data Catalog: Proper motions of PM2000 open clusters (Krone-Martins+, 2010)

NASA Astrophysics Data System (ADS)

Krone-Martins, A.; Soubiran, C.; Ducourant, C.; Teixeira, R.; Le Campion, J. F.

2010-04-01

We present lists of proper-motions and kinematic membership probabilities in the region of 49 open clusters or possible open clusters. The stellar proper motions were taken from the Bordeaux PM2000 catalogue. The segregation between cluster and field stars and the assignment of membership probabilities was accomplished by applying a fully automated method based on parametrisations for the probability distribution functions and genetic algorithm optimisation heuristics associated with a derivative-based hill climbing algorithm for the likelihood optimization. (3 data files).
Probability elicitation to inform early health economic evaluations of new medical technologies: a case study in heart failure disease management.

PubMed

Cao, Qi; Postmus, Douwe; Hillege, Hans L; Buskens, Erik

2013-06-01

Early estimates of the commercial headroom available to a new medical device can assist producers of health technology in making appropriate product investment decisions. The purpose of this study was to illustrate how this quantity can be captured probabilistically by combining probability elicitation with early health economic modeling. The technology considered was a novel point-of-care testing device in heart failure disease management. First, we developed a continuous-time Markov model to represent the patients' disease progression under the current care setting. Next, we identified the model parameters that are likely to change after the introduction of the new device and interviewed three cardiologists to capture the probability distributions of these parameters. Finally, we obtained the probability distribution of the commercial headroom available per measurement by propagating the uncertainty in the model inputs to uncertainty in modeled outcomes. For a willingness-to-pay value of €10,000 per life-year, the median headroom available per measurement was €1.64 (interquartile range €0.05-€3.16) when the measurement frequency was assumed to be daily. In the subsequently conducted sensitivity analysis, this median value increased to a maximum of €57.70 for different combinations of the willingness-to-pay threshold and the measurement frequency. Probability elicitation can successfully be combined with early health economic modeling to obtain the probability distribution of the headroom available to a new medical technology. Subsequently feeding this distribution into a product investment evaluation method enables stakeholders to make more informed decisions regarding to which markets a currently available product prototype should be targeted. Copyright © 2013. Published by Elsevier Inc.
Heterogeneous Data Fusion Method to Estimate Travel Time Distributions in Congested Road Networks

PubMed Central

Lam, William H. K.; Li, Qingquan

2017-01-01

Travel times in congested urban road networks are highly stochastic. Provision of travel time distribution information, including both mean and variance, can be very useful for travelers to make reliable path choice decisions to ensure higher probability of on-time arrival. To this end, a heterogeneous data fusion method is proposed to estimate travel time distributions by fusing heterogeneous data from point and interval detectors. In the proposed method, link travel time distributions are first estimated from point detector observations. The travel time distributions of links without point detectors are imputed based on their spatial correlations with links that have point detectors. The estimated link travel time distributions are then fused with path travel time distributions obtained from the interval detectors using Dempster-Shafer evidence theory. Based on fused path travel time distribution, an optimization technique is further introduced to update link travel time distributions and their spatial correlations. A case study was performed using real-world data from Hong Kong and showed that the proposed method obtained accurate and robust estimations of link and path travel time distributions in congested road networks. PMID:29210978
Heterogeneous Data Fusion Method to Estimate Travel Time Distributions in Congested Road Networks.

PubMed

Shi, Chaoyang; Chen, Bi Yu; Lam, William H K; Li, Qingquan

2017-12-06

Travel times in congested urban road networks are highly stochastic. Provision of travel time distribution information, including both mean and variance, can be very useful for travelers to make reliable path choice decisions to ensure higher probability of on-time arrival. To this end, a heterogeneous data fusion method is proposed to estimate travel time distributions by fusing heterogeneous data from point and interval detectors. In the proposed method, link travel time distributions are first estimated from point detector observations. The travel time distributions of links without point detectors are imputed based on their spatial correlations with links that have point detectors. The estimated link travel time distributions are then fused with path travel time distributions obtained from the interval detectors using Dempster-Shafer evidence theory. Based on fused path travel time distribution, an optimization technique is further introduced to update link travel time distributions and their spatial correlations. A case study was performed using real-world data from Hong Kong and showed that the proposed method obtained accurate and robust estimations of link and path travel time distributions in congested road networks.
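The fusion step rests on Dempster's rule of combination, which can be sketched generically as below. The two-state frame ("fast", "slow") and the mass values are hypothetical; how the study maps travel time distributions onto mass functions is not described in the abstract.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dicts keyed by frozenset) with Dempster's rule.

    Returns the fused mass function and the conflict mass K.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}, conflict


FAST, SLOW = frozenset({"fast"}), frozenset({"slow"})
EITHER = FAST | SLOW  # ignorance: mass not committed to either state

# Hypothetical evidence from a point detector and from an interval detector.
m_point = {FAST: 0.6, SLOW: 0.1, EITHER: 0.3}
m_interval = {FAST: 0.5, SLOW: 0.2, EITHER: 0.3}

fused, conflict = dempster_combine(m_point, m_interval)
print(f"conflict K = {conflict:.2f}")
for subset, mass in fused.items():
    print(sorted(subset), round(mass, 4))
```

Mass left on `EITHER` represents ignorance rather than evidence for either state, which is what distinguishes this rule from a plain Bayesian update.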
Sample size guidelines for fitting a lognormal probability distribution to censored most probable number data with a Markov chain Monte Carlo method.

PubMed

Williams, Michael S; Cao, Yong; Ebel, Eric D

2013-07-15

Levels of pathogenic organisms in food and water have steadily declined in many parts of the world. A consequence of this reduction is that the proportion of samples that test positive for the most contaminated product-pathogen pairings has fallen to less than 0.1. While this is unequivocally beneficial to public health, datasets with very few enumerated samples present an analytical challenge because a large proportion of the observations are censored values. One application of particular interest to risk assessors is the fitting of a statistical distribution function to datasets collected at some point in the farm-to-table continuum. The fitted distribution forms an important component of an exposure assessment. A number of studies have compared different fitting methods and proposed lower limits on the proportion of samples where the organisms of interest are identified and enumerated, with the recommended lower limit of enumerated samples being 0.2. This recommendation may not be applicable to food safety risk assessments for a number of reasons, which include the development of new Bayesian fitting methods, the use of highly sensitive screening tests, and the generally larger sample sizes found in surveys of food commodities. This study evaluates the performance of a Markov chain Monte Carlo fitting method when used in conjunction with a screening test and enumeration of positive samples by the Most Probable Number technique. The results suggest that levels of contamination for common product-pathogen pairs, such as Salmonella on poultry carcasses, can be reliably estimated with the proposed fitting method and sample sizes in excess of 500 observations. The results do, however, demonstrate that simple guidelines for this application, such as the proportion of positive samples, cannot be provided. Published by Elsevier B.V.
A Dirichlet-Multinomial Bayes Classifier for Disease Diagnosis with Microbial Compositions.

PubMed

Gao, Xiang; Lin, Huaiying; Dong, Qunfeng

2017-01-01

Dysbiosis of microbial communities is associated with various human diseases, raising the possibility of using microbial compositions as biomarkers for disease diagnosis. We have developed a Bayes classifier by modeling microbial compositions with Dirichlet-multinomial distributions, which are widely used to model multicategorical count data with extra variation. The parameters of the Dirichlet-multinomial distributions are estimated from training microbiome data sets based on maximum likelihood. The posterior probability of a microbiome sample belonging to a disease or healthy category is calculated based on Bayes' theorem, using the likelihood values computed from the estimated Dirichlet-multinomial distribution, as well as a prior probability estimated from the training microbiome data set or previously published information on disease prevalence. When tested on real-world microbiome data sets, our method, called DMBC (for Dirichlet-multinomial Bayes classifier), shows better classification accuracy than the only existing Bayesian microbiome classifier based on a Dirichlet-multinomial mixture model and the popular random forest method. The advantage of DMBC is its built-in automatic feature selection, capable of identifying a subset of microbial taxa with the best classification accuracy between different classes of samples based on cross-validation. This unique ability enables DMBC to maintain and even improve its accuracy at modeling species-level taxa. The R package for DMBC is freely available at https://github.com/qunfengdong/DMBC. IMPORTANCE By incorporating prior information on disease prevalence, Bayes classifiers have the potential to estimate disease probability better than other common machine-learning methods. Thus, it is important to develop Bayes classifiers specifically tailored for microbiome data. 
Our method shows higher classification accuracy than the only existing Bayesian classifier and the popular random forest method, and thus provides an alternative option for using microbial compositions for disease diagnosis.
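The classification step described in this abstract can be sketched in a few lines of Python. This is only an illustration of Bayes' theorem with Dirichlet-multinomial likelihoods, not code from the DMBC package: the function names, the parameter vectors and the priors below are all hypothetical, and in the paper the parameters are fitted by maximum likelihood from training data.

```python
from math import lgamma, exp, log

def dm_log_likelihood(counts, alpha):
    """Log pmf of a Dirichlet-multinomial for one sample of taxon counts."""
    n = sum(counts)
    a0 = sum(alpha)
    # Multinomial coefficient: n! / prod(x_k!)
    ll = lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)
    # Ratio of Dirichlet normalizing constants
    ll += lgamma(a0) - lgamma(n + a0)
    ll += sum(lgamma(x + a) - lgamma(a) for x, a in zip(counts, alpha))
    return ll

def posterior(counts, alpha_by_class, prior_by_class):
    """Posterior class probabilities via Bayes' theorem (log-sum-exp for stability)."""
    log_joint = {
        c: log(prior_by_class[c]) + dm_log_likelihood(counts, alpha_by_class[c])
        for c in alpha_by_class
    }
    m = max(log_joint.values())
    z = sum(exp(v - m) for v in log_joint.values())
    return {c: exp(v - m) / z for c, v in log_joint.items()}

# Hypothetical parameters for three taxa (illustrative values only)
alphas = {"healthy": [5.0, 3.0, 2.0], "disease": [2.0, 3.0, 5.0]}
priors = {"healthy": 0.5, "disease": 0.5}
post = posterior([10, 6, 4], alphas, priors)
```

Replacing the equal priors with a published disease prevalence changes only the `priors` dictionary, which is the kind of prior information the abstract refers to.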
A risk-based multi-objective model for optimal placement of sensors in water distribution system

NASA Astrophysics Data System (ADS)

Naserizade, Sareh S.; Nikoo, Mohammad Reza; Montaseri, Hossein

2018-02-01

In this study, a new stochastic model based on Conditional Value at Risk (CVaR) and multi-objective optimization methods is developed for optimal placement of sensors in water distribution system (WDS). This model determines minimization of risk which is caused by simultaneous multi-point contamination injection in WDS using CVaR approach. The CVaR considers uncertainties of contamination injection in the form of probability distribution function and calculates low-probability extreme events. In this approach, extreme losses occur at tail of the losses distribution function. Four-objective optimization model based on NSGA-II algorithm is developed to minimize losses of contamination injection (through CVaR of affected population and detection time) and also minimize the two other main criteria of optimal placement of sensors including probability of undetected events and cost. Finally, to determine the best solution, Preference Ranking Organization METHod for Enrichment Evaluation (PROMETHEE), as a subgroup of Multi Criteria Decision Making (MCDM) approach, is utilized to rank the alternatives on the trade-off curve among objective functions. Also, sensitivity analysis is done to investigate the importance of each criterion on PROMETHEE results considering three relative weighting scenarios. The effectiveness of the proposed methodology is examined through applying it to Lamerd WDS in the southwestern part of Iran. The PROMETHEE suggests 6 sensors with suitable distribution that approximately cover all regions of WDS. Optimal values related to CVaR of affected population and detection time as well as probability of undetected events for the best optimal solution are equal to 17,055 persons, 31 mins and 0.045%, respectively. 
The obtained results of the proposed methodology in Lamerd WDS show applicability of CVaR-based multi-objective simulation-optimization model for incorporating the main uncertainties of contamination injection in order to evaluate extreme value of losses in WDS.
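For readers unfamiliar with CVaR, the following Python sketch shows a simple empirical estimate from a sample of simulated losses. It illustrates the risk measure only and is not the authors' optimization model; the function name and the tail-size convention (rounding the tail fraction to a whole number of samples) are assumptions made for this example.

```python
def var_cvar(losses, alpha=0.95):
    """Empirical Value at Risk and Conditional Value at Risk at level alpha."""
    ordered = sorted(losses)
    n = len(ordered)
    # Number of samples in the upper (1 - alpha) tail, at least one
    tail_size = max(1, round((1 - alpha) * n))
    tail = ordered[-tail_size:]
    var = tail[0]                  # smallest loss inside the tail
    cvar = sum(tail) / len(tail)   # mean loss given that the tail is reached
    return var, cvar

# Illustrative losses 1..100: the worst 5% are 96..100
var95, cvar95 = var_cvar(range(1, 101), alpha=0.95)
```

CVaR is never smaller than VaR at the same level, because it averages over the losses beyond the quantile rather than reporting the quantile itself.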
Nested Sampling for Bayesian Model Comparison in the Context of Salmonella Disease Dynamics

PubMed Central

Dybowski, Richard; McKinley, Trevelyan J.; Mastroeni, Pietro; Restif, Olivier

2013-01-01

Understanding the mechanisms underlying the observed dynamics of complex biological systems requires the statistical assessment and comparison of multiple alternative models. Although this has traditionally been done using maximum likelihood-based methods such as Akaike's Information Criterion (AIC), Bayesian methods have gained in popularity because they provide more informative output in the form of posterior probability distributions. However, comparison between multiple models in a Bayesian framework is made difficult by the computational cost of numerical integration over large parameter spaces. A new, efficient method for the computation of posterior probabilities has recently been proposed and applied to complex problems from the physical sciences. Here we demonstrate how nested sampling can be used for inference and model comparison in biological sciences. We present a reanalysis of data from experimental infection of mice with Salmonella enterica showing the distribution of bacteria in liver cells. In addition to confirming the main finding of the original analysis, which relied on AIC, our approach provides: (a) integration across the parameter space, (b) estimation of the posterior parameter distributions (with visualisations of parameter correlations), and (c) estimation of the posterior predictive distributions for goodness-of-fit assessments of the models. The goodness-of-fit results suggest that alternative mechanistic models and a relaxation of the quasi-stationary assumption should be considered. PMID:24376528

Identifying early-warning signals of critical transitions with strong noise by dynamical network markers

PubMed Central

Liu, Rui; Chen, Pei; Aihara, Kazuyuki; Chen, Luonan

2015-01-01

Identifying early-warning signals of a critical transition for a complex system is difficult, especially when the target system is constantly perturbed by big noise, which makes the traditional methods fail due to the strong fluctuations of the observed data. In this work, we show that the critical transition is not traditional state-transition but probability distribution-transition when the noise is not sufficiently small, which, however, is a ubiquitous case in real systems. We present a model-free computational method to detect the warning signals before such transitions. The key idea behind is a strategy: “making big noise smaller” by a distribution-embedding scheme, which transforms the data from the observed state-variables with big noise to their distribution-variables with small noise, and thus makes the traditional criteria effective because of the significantly reduced fluctuations. Specifically, increasing the dimension of the observed data by moment expansion that changes the system from state-dynamics to probability distribution-dynamics, we derive new data in a higher-dimensional space but with much smaller noise. Then, we develop a criterion based on the dynamical network marker (DNM) to signal the impending critical transition using the transformed higher-dimensional data. We also demonstrate the effectiveness of our method in biological, ecological and financial systems. PMID:26647650
Probability bounds analysis for nonlinear population ecology models.

PubMed

Enszer, Joshua A; Andrei Măceș, D; Stadtherr, Mark A

2015-09-01

Mathematical models in population ecology often involve parameters that are empirically determined and inherently uncertain, with probability distributions for the uncertainties not known precisely. Propagating such imprecise uncertainties rigorously through a model to determine their effect on model outputs can be a challenging problem. We illustrate here a method for the direct propagation of uncertainties represented by probability bounds through nonlinear, continuous-time, dynamic models in population ecology. This makes it possible to determine rigorous bounds on the probability that some specified outcome for a population is achieved, which can be a core problem in ecosystem modeling for risk assessment and management. Results can be obtained at a computational cost that is considerably less than that required by statistical sampling methods such as Monte Carlo analysis. The method is demonstrated using three example systems, with focus on a model of an experimental aquatic food web subject to the effects of contamination by ionic liquids, a new class of potentially important industrial chemicals. Copyright © 2015. Published by Elsevier Inc.
Bayesian Probability Theory

NASA Astrophysics Data System (ADS)

von der Linden, Wolfgang; Dose, Volker; von Toussaint, Udo

2014-06-01

Preface; Part I. Introduction: 1. The meaning of probability; 2. Basic definitions; 3. Bayesian inference; 4. Combinatorics; 5. Random walks; 6. Limit theorems; 7. Continuous distributions; 8. The central limit theorem; 9. Poisson processes and waiting times; Part II. Assigning Probabilities: 10. Transformation invariance; 11. Maximum entropy; 12. Qualified maximum entropy; 13. Global smoothness; Part III. Parameter Estimation: 14. Bayesian parameter estimation; 15. Frequentist parameter estimation; 16. The Cramer-Rao inequality; Part IV. Testing Hypotheses: 17. The Bayesian way; 18. The frequentist way; 19. Sampling distributions; 20. Bayesian vs frequentist hypothesis tests; Part V. Real World Applications: 21. Regression; 22. Inconsistent data; 23. Unrecognized signal contributions; 24. Change point problems; 25. Function estimation; 26. Integral equations; 27. Model selection; 28. Bayesian experimental design; Part VI. Probabilistic Numerical Techniques: 29. Numerical integration; 30. Monte Carlo methods; 31. Nested sampling; Appendixes; References; Index.
Stochastic models for the Trojan Y-Chromosome eradication strategy of an invasive species.

PubMed

Wang, Xueying; Walton, Jay R; Parshad, Rana D

2016-01-01

The Trojan Y-Chromosome (TYC) strategy, an autocidal genetic biocontrol method, has been proposed to eliminate invasive alien species. In this work, we develop a Markov jump process model for this strategy, and we verify that there is a positive probability for wild-type females going extinct within a finite time. Moreover, when sex-reversed Trojan females are introduced at a constant population size, we formulate a stochastic differential equation (SDE) model as an approximation to the proposed Markov jump process model. Using the SDE model, we investigate the probability distribution and expectation of the extinction time of wild-type females by solving Kolmogorov equations associated with these statistics. The results indicate how the probability distribution and expectation of the extinction time are shaped by the initial conditions and the model parameters.
Representation of complex probabilities and complex Gibbs sampling

NASA Astrophysics Data System (ADS)

Salcedo, Lorenzo Luis

2018-03-01

Complex weights appear in physics that are beyond a straightforward importance sampling treatment, as required in Monte Carlo calculations. This is the well-known sign problem. The complex Langevin approach amounts to effectively constructing a positive distribution on the complexified manifold that reproduces the expectation values of the observables through their analytical extension. Here we discuss the direct construction of such positive distributions, paying attention to their localization on the complexified manifold. Explicit localized representations are obtained for complex probabilities defined on Abelian and non-Abelian groups. The viability and performance of a complex version of the heat bath method, based on such representations, is analyzed.
An analytical approach to gravitational lensing by an ensemble of axisymmetric lenses

NASA Technical Reports Server (NTRS)

Lee, Man Hoi; Spergel, David N.

1990-01-01

The problem of gravitational lensing by an ensemble of identical axisymmetric lenses randomly distributed on a single lens plane is considered and a formal expression is derived for the joint probability density of finding shear and convergence at a random point on the plane. The amplification probability for a source can be accurately estimated from the distribution in shear and convergence. This method is applied to two cases: lensing by an ensemble of point masses and by an ensemble of objects with Gaussian surface mass density. There is no convergence for point masses whereas shear is negligible for wide Gaussian lenses.
Combining Probability Distributions of Wind Waves and Sea Level Variations to Assess Return Periods of Coastal Floods

NASA Astrophysics Data System (ADS)

Leijala, U.; Bjorkqvist, J. V.; Pellikka, H.; Johansson, M. M.; Kahma, K. K.

2017-12-01

Predicting the behaviour of the joint effect of sea level and wind waves is of great significance due to the major impact of flooding events in densely populated coastal regions. As mean sea level rises, the effect of sea level variations accompanied by the waves will be even more harmful in the future. The main challenge when evaluating the effect of waves and sea level variations is that long time series of both variables rarely exist. Wave statistics are also highly location-dependent, thus requiring wave buoy measurements and/or high-resolution wave modelling. As an initial approximation of the joint effect, the variables may be treated as independent random variables, to achieve the probability distribution of their sum. We present results of a case study based on three probability distributions: 1) wave run-up constructed from individual wave buoy measurements, 2) short-term sea level variability based on tide gauge data, and 3) mean sea level projections based on up-to-date regional scenarios. The wave measurements were conducted during 2012-2014 on the coast of city of Helsinki located in the Gulf of Finland in the Baltic Sea. The short-term sea level distribution contains the last 30 years (1986-2015) of hourly data from Helsinki tide gauge, and the mean sea level projections are scenarios adjusted for the Gulf of Finland. Additionally, we present a sensitivity test based on six different theoretical wave height distributions representing different wave behaviour in relation to sea level variations. As these wave distributions are merged with one common sea level distribution, we can study how the different shapes of the wave height distribution affect the distribution of the sum, and which one of the components is dominating under different wave conditions. 
As an outcome of the method, we obtain a probability distribution of the maximum elevation of the continuous water mass, which enables a flexible tool for evaluating different risk levels in the current and future climate.
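The independence approximation mentioned in this abstract means that the distribution of the sum is the convolution of the component distributions. A minimal Python sketch on a discretized height grid follows; the grid, the probability values and the function names are hypothetical and are not taken from the study.

```python
def convolve_pmfs(pmf_a, pmf_b):
    """PMF of A + B for independent A and B given on the same regular grid.

    Index i of each input stands for the value i * step; index k of the
    output stands for the summed value k * step.
    """
    out = [0.0] * (len(pmf_a) + len(pmf_b) - 1)
    for i, pa in enumerate(pmf_a):
        for j, pb in enumerate(pmf_b):
            out[i + j] += pa * pb
    return out

def exceedance(pmf, level_index):
    """Probability that the summed value reaches or exceeds a grid level."""
    return sum(pmf[level_index:])

# Illustrative distributions on a coarse grid (values are made up)
sea_level = [0.6, 0.3, 0.1]   # P(level = 0, 1, 2 steps)
wave_runup = [0.5, 0.5]       # P(run-up = 0, 1 step)
total = convolve_pmfs(sea_level, wave_runup)
```

Here `exceedance(total, 3)` gives the probability that the combined water level reaches the highest grid step, which is the kind of quantity a return-period estimate would be built from.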
Paleodemographic age-at-death distributions of two Mexican skeletal collections: a comparison of transition analysis and traditional aging methods.

PubMed

Bullock, Meggan; Márquez, Lourdes; Hernández, Patricia; Ruíz, Fernando

2013-09-01

Traditional methods of aging adult skeletons suffer from the problem of age mimicry of the reference collection, as described by Bocquet-Appel and Masset (1982). Transition analysis (Boldsen et al., 2002) is a method of aging adult skeletons that addresses the problem of age mimicry of the reference collection by allowing users to select an appropriate prior probability. In order to evaluate whether transition analysis results in significantly different age estimates for adults, the method was applied to skeletal collections from Postclassic Cholula and Contact-Period Xochimilco. The resulting age-at-death distributions were then compared with age-at-death distributions for the two populations constructed using traditional aging methods. Although the traditional aging methods result in age-at-death distributions with high young adult mortality and few individuals living past the age of 50, the age-at-death distributions constructed using transition analysis indicate that most individuals who lived into adulthood lived past the age of 50. Copyright © 2013 Wiley Periodicals, Inc.
Estimation and applications of size-based distributions in forestry

Treesearch

Jeffrey H. Gove

2003-01-01

Size-based distributions arise in several contexts in forestry and ecology. Simple power relationships (e.g., basal area and diameter at breast height) between variables are one such area of interest arising from a modeling perspective. Another, probability proportional to size sampling (PPS), is found in the most widely used methods for sampling standing or dead and...
Estimation and applications of size-biased distributions in forestry

Treesearch

Jeffrey H. Gove

2003-01-01

Size-biased distributions arise naturally in several contexts in forestry and ecology. Simple power relationships (e.g. basal area and diameter at breast height) between variables are one such area of interest arising from a modelling perspective. Another, probability proportional to size (PPS) sampling, is found in the most widely used methods for sampling standing or...
A method to compute SEU fault probabilities in memory arrays with error correction

NASA Technical Reports Server (NTRS)

Gercek, Gokhan

1994-01-01

With the increasing packing densities in VLSI technology, Single Event Upsets (SEU) due to cosmic radiation are becoming more of a critical issue in the design of space avionics systems. In this paper, a method is introduced to compute the fault (mishap) probability for a computer memory of size M words. It is assumed that a Hamming code is used for each word to provide single error correction. It is also assumed that every time a memory location is read, single errors are corrected. Memory is read at random, following a distribution that is assumed to be known. In such a scenario, a mishap is defined as two SEUs corrupting the same memory location prior to a read. The paper introduces a method to compute the overall mishap probability for the entire memory for a mission duration of T hours.
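To make the mishap definition concrete, here is a deliberately simplified Python sketch. It is not the paper's method: it assumes upsets arrive at each word as an independent Poisson process and that every word is read at a fixed interval, whereas the paper treats reads as random with a known distribution. The function and parameter names are hypothetical.

```python
from math import exp, expm1, log1p

def p_double_upset(rate_per_word_hour, hours_between_reads):
    """P(at least two upsets hit one word before its next read), Poisson arrivals."""
    x = rate_per_word_hour * hours_between_reads
    return 1.0 - exp(-x) * (1.0 + x)

def p_mission_mishap(words, mission_hours, rate_per_word_hour, hours_between_reads):
    """P(at least one mishap anywhere in memory during the mission)."""
    p = p_double_upset(rate_per_word_hour, hours_between_reads)
    intervals = words * mission_hours / hours_between_reads
    # 1 - (1 - p)**intervals, written to stay accurate when p is tiny
    return -expm1(intervals * log1p(-p))

# Illustrative numbers only: 1024 words, 100-hour mission
mishap = p_mission_mishap(1024, 100.0, 1e-3, 1.0)
```

Under these assumptions the per-interval probability scales roughly with the square of the upset rate times the read interval, which is why frequent reads with correction reduce the mishap probability so strongly.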
Evaluation of carotid plaque echogenicity based on the integral of the cumulative probability distribution using gray-scale ultrasound images.

PubMed

Huang, Xiaowei; Zhang, Yanling; Meng, Long; Abbott, Derek; Qian, Ming; Wong, Kelvin K L; Zheng, Rongqing; Zheng, Hairong; Niu, Lili

2017-01-01

Carotid plaque echogenicity is associated with the risk of cardiovascular events. Gray-scale median (GSM) of the ultrasound image of carotid plaques has been widely used as an objective method for evaluation of plaque echogenicity in patients with atherosclerosis. We proposed a computer-aided method to evaluate plaque echogenicity and compared its efficiency with GSM. One hundred and twenty-five carotid plaques (43 echo-rich, 35 intermediate, 47 echolucent) were collected from 72 patients in this study. The cumulative probability distribution curves were obtained based on statistics of the pixels in the gray-level images of plaques. The area under the cumulative probability distribution curve (AUCPDC) was calculated as its integral value to evaluate plaque echogenicity. The classification accuracy for three types of plaques is 78.4% (kappa value, κ = 0.673), when the AUCPDC is used for classifier training, whereas GSM is 64.8% (κ = 0.460). The receiver operating characteristic curves were produced to test the effectiveness of AUCPDC and GSM for the identification of echolucent plaques. The area under the curve (AUC) was 0.817 when AUCPDC was used for training the classifier, which is higher than that achieved using GSM (AUC = 0.746). Compared with GSM, the AUCPDC showed a borderline association with coronary heart disease (Spearman r = 0.234, p = 0.050). Our experimental results suggest that AUCPDC analysis is a promising method for evaluation of plaque echogenicity and predicting cardiovascular events in patients with plaques.
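The quantity described here, the area under the cumulative distribution of pixel gray levels, can be illustrated with a short Python sketch. The normalization used below (dividing by the number of gray levels so the result lies between 0 and 1) is an assumption for this example and may differ from the authors' definition; the function names are hypothetical.

```python
def cumulative_distribution(pixels, levels=256):
    """Cumulative probability of gray levels 0..levels-1 for a list of pixel values."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running / total)
    return cdf

def aucpdc(pixels, levels=256):
    """Area under the cumulative curve, normalized by the number of gray levels."""
    return sum(cumulative_distribution(pixels, levels)) / levels

def gray_scale_median(pixels):
    """Median gray level (upper median for even-sized input), for comparison."""
    ordered = sorted(pixels)
    return ordered[len(ordered) // 2]

dark_plaque = [10, 12, 15, 20, 30]        # made-up pixel values
bright_plaque = [150, 170, 180, 200, 220]
```

With this convention a darker (more echolucent) region yields a larger area, because its cumulative curve rises at low gray levels; unlike the median, the area responds to the whole shape of the distribution.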
Monte Carlo Method for Determining Earthquake Recurrence Parameters from Short Paleoseismic Catalogs: Example Calculations for California

USGS Publications Warehouse

Parsons, Tom

2008-01-01

Paleoearthquake observations often lack enough events at a given site to directly define a probability density function (PDF) for earthquake recurrence. Sites with fewer than 10-15 intervals do not provide enough information to reliably determine the shape of the PDF using standard maximum-likelihood techniques [e.g., Ellsworth et al., 1999]. In this paper I present a method that attempts to fit wide ranges of distribution parameters to short paleoseismic series. From repeated Monte Carlo draws, it becomes possible to quantitatively estimate most likely recurrence PDF parameters, and a ranked distribution of parameters is returned that can be used to assess uncertainties in hazard calculations. In tests on short synthetic earthquake series, the method gives results that cluster around the mean of the input distribution, whereas maximum likelihood methods return the sample means [e.g., NIST/SEMATECH, 2006]. For short series (fewer than 10 intervals), sample means tend to reflect the median of an asymmetric recurrence distribution, possibly leading to an overestimate of the hazard should they be used in probability calculations. Therefore a Monte Carlo approach may be useful for assessing recurrence from limited paleoearthquake records. Further, the degree of functional dependence among parameters like mean recurrence interval and coefficient of variation can be established. The method is described for use with time-independent and time-dependent PDFs, and results from 19 paleoseismic sequences on strike-slip faults throughout the state of California are given.
Monte Carlo method for determining earthquake recurrence parameters from short paleoseismic catalogs: Example calculations for California

USGS Publications Warehouse

Parsons, T.

2008-01-01

Paleoearthquake observations often lack enough events at a given site to directly define a probability density function (PDF) for earthquake recurrence. Sites with fewer than 10-15 intervals do not provide enough information to reliably determine the shape of the PDF using standard maximum-likelihood techniques (e.g., Ellsworth et al., 1999). In this paper I present a method that attempts to fit wide ranges of distribution parameters to short paleoseismic series. From repeated Monte Carlo draws, it becomes possible to quantitatively estimate most likely recurrence PDF parameters, and a ranked distribution of parameters is returned that can be used to assess uncertainties in hazard calculations. In tests on short synthetic earthquake series, the method gives results that cluster around the mean of the input distribution, whereas maximum likelihood methods return the sample means (e.g., NIST/SEMATECH, 2006). For short series (fewer than 10 intervals), sample means tend to reflect the median of an asymmetric recurrence distribution, possibly leading to an overestimate of the hazard should they be used in probability calculations. Therefore a Monte Carlo approach may be useful for assessing recurrence from limited paleoearthquake records. Further, the degree of functional dependence among parameters like mean recurrence interval and coefficient of variation can be established. The method is described for use with time-independent and time-dependent PDFs, and results from 19 paleoseismic sequences on strike-slip faults throughout the state of California are given.
Mixture distributions of wind speed in the UAE

NASA Astrophysics Data System (ADS)

Shin, J.; Ouarda, T.; Lee, T. S.

2013-12-01

Wind speed probability distribution is commonly used to estimate potential wind energy. The 2-parameter Weibull distribution has been most widely used to characterize the distribution of wind speed. However, it is unable to properly model wind speed regimes when wind speed distribution presents bimodal and kurtotic shapes. Several studies have concluded that the Weibull distribution should not be used for frequency analysis of wind speed without investigation of wind speed distribution. Due to these mixture distributional characteristics of wind speed data, the application of mixture distributions should be further investigated in the frequency analysis of wind speed. A number of studies have investigated the potential wind energy in different parts of the Arabian Peninsula. Mixture distributional characteristics of wind speed were detected from some of these studies. Nevertheless, mixture distributions have not been employed for wind speed modeling in the Arabian Peninsula. In order to improve our understanding of wind energy potential in the Arabian Peninsula, mixture distributions should be tested for the frequency analysis of wind speed. The aim of the current study is to assess the suitability of mixture distributions for the frequency analysis of wind speed in the UAE. Hourly mean wind speed data at 10-m height from 7 stations were used in the current study. The Weibull and Kappa distributions were employed as representatives of the conventional non-mixture distributions. Ten mixture distributions were constructed by mixing four probability distributions: the Normal, Gamma, Weibull and Extreme value type-one (EV-1) distributions. Three parameter estimation methods (the Expectation Maximization algorithm, the Least Squares method and the Meta-Heuristic Maximum Likelihood (MHML) method) were employed to estimate the parameters of the mixture distributions. 
In order to compare the goodness-of-fit of tested distributions and parameter estimation methods for sample wind data, the adjusted coefficient of determination, Bayesian Information Criterion (BIC) and Chi-squared statistics were computed. Results indicate that MHML presents the best performance of parameter estimation for the used mixture distributions. In most of the employed 7 stations, mixture distributions give the best fit. When the wind speed regime shows mixture distributional characteristics, most of these regimes present the kurtotic statistical characteristic. Particularly, applications of mixture distributions for these stations show a significant improvement in explaining the whole wind speed regime. In addition, the Weibull-Weibull mixture distribution presents the best fit for the wind speed data in the UAE.
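As a rough illustration of the Weibull-Weibull mixture and of one of the goodness-of-fit criteria named above, the following Python sketch evaluates a two-component mixture density and a BIC value. The parameter values are invented for the example, the function names are hypothetical, and no fitting procedure (EM, least squares or MHML) is implemented here.

```python
from math import exp, log

def weibull_pdf(v, shape, scale):
    """Two-parameter Weibull density; taken as 0 for v <= 0 (a simplification at v = 0)."""
    if v <= 0:
        return 0.0
    z = v / scale
    return (shape / scale) * z ** (shape - 1) * exp(-(z ** shape))

def mixture_pdf(v, weight, comp1, comp2):
    """Density of a two-component mixture; comp1 and comp2 are (shape, scale) pairs."""
    return weight * weibull_pdf(v, *comp1) + (1 - weight) * weibull_pdf(v, *comp2)

def log_likelihood(speeds, weight, comp1, comp2):
    """Sum of log densities over observed wind speeds (all must be positive)."""
    return sum(log(mixture_pdf(v, weight, comp1, comp2)) for v in speeds)

def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate a preferred model."""
    return n_params * log(n_obs) - 2.0 * log_lik

# Made-up wind speeds (m/s) and parameters, for illustration only
speeds = [2.1, 3.4, 4.0, 5.2, 7.8, 9.5, 10.1, 11.3]
ll = log_likelihood(speeds, 0.4, (2.0, 5.0), (3.0, 10.0))
score = bic(ll, n_params=5, n_obs=len(speeds))
```

A two-component Weibull mixture has five free parameters (two shapes, two scales and one weight), so BIC penalizes it more heavily than a single Weibull, and the mixture is preferred only if the gain in likelihood outweighs that penalty.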
Probability evolution method for exit location distribution

NASA Astrophysics Data System (ADS)

Zhu, Jinjie; Chen, Zhen; Liu, Xianbin

2018-03-01

The exit problem in the framework of the large deviation theory has been a hot topic in the past few decades. The most probable escape path in the weak-noise limit has been clarified by the Freidlin-Wentzell action functional. However, noise in real physical systems cannot be arbitrarily small, while noise with finite strength may induce nontrivial phenomena, such as noise-induced shift and noise-induced saddle-point avoidance. Traditional Monte Carlo simulation of noise-induced escape will take exponentially large time as noise approaches zero. The majority of the time is wasted on the uninteresting wandering around the attractors. In this paper, a new method is proposed to decrease the escape simulation time by an exponentially large factor by introducing a series of interfaces and by applying the reinjection on them. This method can be used to calculate the exit location distribution. It is verified by examining two classical examples and is compared with theoretical predictions. The results show that the method performs well for weak noise but may induce certain deviations for large noise. Finally, some possible ways to improve our method are discussed.
Application of Monte Carlo Method for Evaluation of Uncertainties of ITS-90 by Standard Platinum Resistance Thermometer

NASA Astrophysics Data System (ADS)

Palenčár, Rudolf; Sopkuliak, Peter; Palenčár, Jakub; Ďuriš, Stanislav; Suroviak, Emil; Halaj, Martin

2017-06-01

Evaluation of uncertainties of the temperature measurement by standard platinum resistance thermometer calibrated at the defining fixed points according to ITS-90 is a problem that can be solved in different ways. The paper presents a procedure based on the propagation of distributions using the Monte Carlo method. The procedure employs generation of pseudo-random numbers for the input variables of resistances at the defining fixed points, supposing the multivariate Gaussian distribution for input quantities. This allows taking into account the correlations among resistances at the defining fixed points. Assumption of Gaussian probability density function is acceptable, with respect to the several sources of uncertainties of resistances. In the case of uncorrelated resistances at the defining fixed points, the method is applicable to any probability density function. Validation of the law of propagation of uncertainty using the Monte Carlo method is presented on the example of specific data for 25 Ω standard platinum resistance thermometer in the temperature range from 0 to 660 °C. Using this example, we demonstrate suitability of the method by validation of its results.
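Propagation of distributions with correlated Gaussian inputs can be sketched briefly in Python. The example below propagates two correlated resistances through the resistance ratio W = R(T)/R(TPW) used in ITS-90 and compares the Monte Carlo result with the first-order law of propagation of uncertainty. The numerical values, the correlation coefficient and the function names are hypothetical, and the full ITS-90 deviation functions are not modelled.

```python
import random
from math import sqrt
from statistics import mean, stdev

def sample_correlated_pair(rng, mu1, u1, mu2, u2, rho):
    """Draw one pair from a bivariate Gaussian using a 2x2 Cholesky factor."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x1 = mu1 + u1 * z1
    x2 = mu2 + u2 * (rho * z1 + sqrt(1.0 - rho * rho) * z2)
    return x1, x2

def propagate_ratio(mu_t, u_t, mu_tpw, u_tpw, rho, n=20000, seed=1):
    """Monte Carlo mean and standard uncertainty of W = R_t / R_tpw."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n):
        r_t, r_tpw = sample_correlated_pair(rng, mu_t, u_t, mu_tpw, u_tpw, rho)
        ratios.append(r_t / r_tpw)
    return mean(ratios), stdev(ratios)

def lpu_ratio(mu_t, u_t, mu_tpw, u_tpw, rho):
    """First-order law of propagation of uncertainty for the same ratio."""
    w = mu_t / mu_tpw
    rel_t, rel_tpw = u_t / mu_t, u_tpw / mu_tpw
    return w * sqrt(rel_t ** 2 + rel_tpw ** 2 - 2.0 * rho * rel_t * rel_tpw)

# Illustrative values in ohms (not data from the paper)
w_mean, w_unc = propagate_ratio(40.0, 2e-4, 25.0, 1e-4, 0.5)
w_unc_lpu = lpu_ratio(40.0, 2e-4, 25.0, 1e-4, 0.5)
```

For a nearly linear model such as this ratio the two uncertainty estimates should agree closely; a disagreement would suggest that the first-order approximation is inadequate for the model in question.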
Characterization of Cloud Water-Content Distribution

NASA Technical Reports Server (NTRS)

Lee, Seungwon

2010-01-01

The development of realistic cloud parameterizations for climate models requires accurate characterizations of subgrid distributions of thermodynamic variables. To this end, a software tool was developed to characterize cloud water-content distributions in climate-model sub-grid scales. This software characterizes distributions of cloud water content with respect to cloud phase, cloud type, precipitation occurrence, and geo-location using CloudSat radar measurements. It uses a statistical method called maximum likelihood estimation to estimate the probability density function of the cloud water content.
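Maximum likelihood estimation of a density can be illustrated with a distribution whose estimates have a closed form. The abstract does not say which distribution family the tool fits, so the lognormal family in this Python sketch is purely an assumption, as are the function names and sample values.

```python
from math import exp, log, pi, sqrt

def fit_lognormal_mle(samples):
    """Closed-form maximum likelihood estimates (mu, sigma) of a lognormal density."""
    logs = [log(x) for x in samples]
    n = len(logs)
    mu = sum(logs) / n
    # The MLE of sigma divides by n, not n - 1
    sigma = sqrt(sum((v - mu) ** 2 for v in logs) / n)
    return mu, sigma

def lognormal_pdf(x, mu, sigma):
    """Lognormal probability density at x > 0."""
    return exp(-((log(x) - mu) ** 2) / (2.0 * sigma ** 2)) / (x * sigma * sqrt(2.0 * pi))

# Made-up water-content values (arbitrary units), for illustration only
water_content = [0.05, 0.08, 0.12, 0.20, 0.35, 0.50]
mu_hat, sigma_hat = fit_lognormal_mle(water_content)
```

Fitting a separate parameter pair for each cloud phase, cloud type or region would give the kind of conditional characterization the abstract describes.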
Generalized ensemble theory with non-extensive statistics

NASA Astrophysics Data System (ADS)

Shen, Ke-Ming; Zhang, Ben-Wei; Wang, En-Ke

2017-12-01

The non-extensive canonical ensemble theory is reconsidered with the method of Lagrange multipliers by maximizing Tsallis entropy, with the constraint that the normalized term of Tsallis' q-average of physical quantities, the sum $\sum_j p_j^q$, is independent of the probability $p_i$ for Tsallis parameter q. The self-referential problem in the deduced probability and thermal quantities in non-extensive statistics is thus avoided, and thermodynamical relationships are obtained in a consistent and natural way. We also extend the study to the non-extensive grand canonical ensemble theory and obtain the q-deformed Bose-Einstein distribution as well as the q-deformed Fermi-Dirac distribution. The theory is further applied to the generalized Planck law to demonstrate the distinct behaviors of the various generalized q-distribution functions discussed in literature.
Survival curve estimation with dependent left truncated data using Cox's model.

PubMed

Mackenzie, Todd

2012-10-19

The Kaplan-Meier and closely related Lynden-Bell estimators are used to provide nonparametric estimation of the distribution of a left-truncated random variable. These estimators assume that the left-truncation variable is independent of the time-to-event. This paper proposes a semiparametric method for estimating the marginal distribution of the time-to-event that does not require independence. It models the conditional distribution of the time-to-event given the truncation variable using Cox's model for left truncated data, and uses inverse probability weighting. We report the results of simulations and illustrate the method using a survival study.
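For context, the baseline product-limit estimator for left-truncated data, which assumes the independence that this paper relaxes, can be sketched as follows. The sketch does not implement the proposed Cox-model method with inverse probability weighting; the function name, the risk-set convention (entry time less than or equal to the event time) and the data are assumptions for illustration.

```python
def product_limit(entries, times, events):
    """Product-limit survival estimate with delayed entry (left truncation).

    entries[i] is the time subject i came under observation, times[i] the
    event or censoring time, events[i] True if the event was observed.
    A subject is at risk at t when entries[i] <= t <= times[i].
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    surv, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for l, x in zip(entries, times) if l <= t <= x)
        deaths = sum(1 for x, e in zip(times, events) if e and x == t)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve

# Made-up data: the second subject enters late, at time 1.5
curve = product_limit([0.0, 1.5, 0.0], [1.0, 2.0, 3.0], [True, True, True])
```

With all entry times set to zero this reduces to the ordinary Kaplan-Meier estimate; late entrants shrink the early risk sets, which is why the estimate can be biased when entry time and event time are dependent.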

A Bayesian Method for Identifying Contaminated Detectors in Low-Level Alpha Spectrometers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Maclellan, Jay A.; Strom, Daniel J.; Joyce, Kevin E.

2011-11-02

Analyses used for radiobioassay and other radiochemical tests are normally designed to meet specified quality objectives, such as relative bias, precision, and minimum detectable activity (MDA). In the case of radiobioassay analyses for alpha emitting radionuclides, a major determiner of the process MDA is the instrument background. Alpha spectrometry detectors are often restricted to only a few counts over multi-day periods in order to meet required MDAs for nuclides such as plutonium-239 and americium-241. A detector background criterion is often set empirically based on experience, or frequentist or classical statistics are applied to the calculated background count necessary to meet a required MDA. An acceptance criterion for the detector background is set at the multiple of the estimated background standard deviation above the assumed mean that provides an acceptably small probability of observation if the mean and standard deviation estimate are correct. The major problem with this method is that the observed background counts used to estimate the mean, and thereby the standard deviation when a Poisson distribution is assumed, are often in the range of zero to three counts. At those expected count levels it is impossible to obtain a good estimate of the true mean from a single measurement. As an alternative, Bayesian statistical methods allow calculation of the expected detector background count distribution based on historical counts from new, uncontaminated detectors. This distribution can then be used to identify detectors showing an increased probability of contamination. The effect of varying the assumed range of background counts (i.e., the prior probability distribution) from new, uncontaminated detectors is discussed.
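One common way to turn historical low counts into an expected count distribution is the conjugate gamma-Poisson model, sketched below in Python. The abstract does not state which prior the authors use, so the gamma prior, its starting values and the function names here are assumptions made for illustration.

```python
from math import exp, lgamma, log

def update(shape, rate, historical_counts):
    """Gamma(shape, rate) prior on the Poisson mean, updated with one count per period."""
    return shape + sum(historical_counts), rate + len(historical_counts)

def predictive_pmf(k, shape, rate):
    """P(next count = k) when the Poisson mean has a Gamma(shape, rate) distribution."""
    log_p = (lgamma(shape + k) - lgamma(shape) - lgamma(k + 1)
             + shape * log(rate / (rate + 1.0)) - k * log(rate + 1.0))
    return exp(log_p)

def tail_probability(k_observed, shape, rate):
    """P(next count >= k_observed) for an uncontaminated detector."""
    return 1.0 - sum(predictive_pmf(j, shape, rate) for j in range(k_observed))

# Made-up background counts from new detectors, one per counting period
shape, rate = update(1.0, 1.0, [0, 1, 0, 2, 0, 1])
p_five_or_more = tail_probability(5, shape, rate)
```

A detector whose observed count has a very small tail probability under this predictive distribution would be a candidate for contamination; changing the starting `shape` and `rate` shows how sensitive that judgment is to the prior.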
Estimating the Effects of Detection Heterogeneity and Overdispersion on Trends Estimated from Avian Point Counts

EPA Science Inventory

Point counts are a common method for sampling avian distribution and abundance. Though methods for estimating detection probabilities are available, many analyses use raw counts and do not correct for detectability. We use a removal model of detection within an N-mixture approa...
Investigation of Dielectric Breakdown Characteristics for Double-break Vacuum Interrupter and Dielectric Breakdown Probability Distribution in Vacuum Interrupter

NASA Astrophysics Data System (ADS)

Shioiri, Tetsu; Asari, Naoki; Sato, Junichi; Sasage, Kosuke; Yokokura, Kunio; Homma, Mitsutaka; Suzuki, Katsumi

To investigate the reliability of equipment of vacuum insulation, a study was carried out to clarify breakdown probability distributions in vacuum gap. Further, a double-break vacuum circuit breaker was investigated for breakdown probability distribution. The test results show that the breakdown probability distribution of the vacuum gap can be represented by a Weibull distribution using a location parameter, which shows the voltage that permits a zero breakdown probability. The location parameter obtained from Weibull plot depends on electrode area. The shape parameter obtained from Weibull plot of vacuum gap was 10∼14, and is constant irrespective of the non-uniform field factor. The breakdown probability distribution after no-load switching can be represented by Weibull distribution using a location parameter. The shape parameter after no-load switching was 6∼8.5, and is constant, irrespective of gap length. This indicates that the scatter of breakdown voltage was increased by no-load switching. If the vacuum circuit breaker uses a double break, breakdown probability at low voltage becomes lower than single-break probability. Although potential distribution is a concern in the double-break vacuum circuit breaker, its insulation reliability is better than that of the single-break vacuum interrupter even if the bias of the vacuum interrupter's sharing voltage is taken into account.
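A Weibull distribution with a location parameter assigns zero breakdown probability to any voltage at or below that location. The sketch below shows the standard three-parameter Weibull cumulative distribution; the function name and the numerical values are hypothetical and are not taken from the study.

```python
import math

def weibull_breakdown_probability(voltage, location, scale, shape):
    """Three-parameter Weibull CDF: probability of breakdown at or below
    `voltage`. Below the location parameter the probability is zero."""
    if voltage <= location:
        return 0.0
    return 1.0 - math.exp(-(((voltage - location) / scale) ** shape))

# Hypothetical parameters (kV); shape 12 lies in the reported 10-14 range.
p_low = weibull_breakdown_probability(90.0, location=100.0, scale=50.0, shape=12.0)
p_mid = weibull_breakdown_probability(150.0, location=100.0, scale=50.0, shape=12.0)
```

A larger shape parameter gives a steeper curve, which is why the drop from 10∼14 to 6∼8.5 after no-load switching is read as increased scatter in breakdown voltage.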
Using a Betabinomial distribution to estimate the prevalence of adherence to physical activity guidelines among children and youth.

PubMed

Garriguet, Didier

2016-04-01

Estimates of the prevalence of adherence to physical activity guidelines in the population are generally the result of averaging individual probability of adherence based on the number of days people meet the guidelines and the number of days they are assessed. Given this number of active and inactive days (days assessed minus days active), the conditional probability of meeting the guidelines that has been used in the past is a Beta(1 + active days, 1 + inactive days) distribution assuming the probability p of a day being active is bounded by 0 and 1 and averages 50%. A change in the assumption about the distribution of p is required to better match the discrete nature of the data and to better assess the probability of adherence when the percentage of active days in the population differs from 50%. Using accelerometry data from the Canadian Health Measures Survey, the probability of adherence to physical activity guidelines is estimated using a conditional probability given the number of active and inactive days distributed as a Betabinomial(n, α + active days, β + inactive days) assuming that p is randomly distributed as Beta(α, β) where the parameters α and β are estimated by maximum likelihood. The resulting Betabinomial distribution is discrete. For children aged 6 or older, the probability of meeting physical activity guidelines 7 out of 7 days is similar to published estimates. For pre-schoolers, the Betabinomial distribution yields higher estimates of adherence to the guidelines than the Beta distribution, in line with the probability of being active on any given day. In estimating the probability of adherence to physical activity guidelines, the Betabinomial distribution has several advantages over the previously used Beta distribution. It is a discrete distribution and maximizes the richness of accelerometer data.
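The Betabinomial calculation described above can be sketched in a few lines. This is an illustrative reading of the abstract, not the author's code: the function names are hypothetical, and the Beta parameters are passed in rather than estimated by maximum likelihood.

```python
import math

def log_beta(x, y):
    """Logarithm of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def betabinom_pmf(k, n, a, b):
    """P(k active days out of n) when the daily probability p ~ Beta(a, b)."""
    return math.comb(n, k) * math.exp(log_beta(k + a, n - k + b) - log_beta(a, b))

def prob_adherence(active, inactive, alpha, beta, n=7, required=7):
    """Probability of at least `required` active days in a new n-day period,
    given observed active/inactive days and a Beta(alpha, beta) prior on p."""
    a, b = alpha + active, beta + inactive
    return sum(betabinom_pmf(k, n, a, b) for k in range(required, n + 1))

# Hypothetical child observed active on 5 of 6 assessed days, flat prior.
p_7_of_7 = prob_adherence(active=5, inactive=1, alpha=1.0, beta=1.0)
```

With alpha = beta = 1 and no observed days, the distribution over 0 to 7 active days is uniform, which is the "averages 50%" assumption the abstract argues should be relaxed.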
A Method to Estimate the Probability That Any Individual Cloud-to-Ground Lightning Stroke Was Within Any Radius of Any Point

NASA Technical Reports Server (NTRS)

Huddleston, Lisa L.; Roeder, William P.; Merceret, Francis J.

2010-01-01

A new technique has been developed to estimate the probability that a nearby cloud-to-ground lightning stroke was within a specified radius of any point of interest. This process uses the bivariate Gaussian distribution of probability density provided by the current lightning location error ellipse for the most likely location of a lightning stroke and integrates it to determine the probability that the stroke is inside any specified radius of any location, even if that location is not centered on or even within the location error ellipse. This technique is adapted from a method of calculating the probability of debris collision with spacecraft. Such a technique is important in spaceport processing activities because it allows engineers to quantify the risk of induced current damage to critical electronics due to nearby lightning strokes. This technique was tested extensively and is now in use by space launch organizations at Kennedy Space Center and Cape Canaveral Air Force Station.
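The integral of a bivariate Gaussian over a circle that need not be centered on the error ellipse can be approximated by sampling. The sketch below uses plain Monte Carlo as a stand-in for whatever integration scheme the authors used; the function name, the parameterization by semi-axis standard deviations, and the sample size are assumptions.

```python
import math
import random

def prob_within_radius(mean, sigma_major, sigma_minor, angle_rad,
                       point, radius, n_samples=100_000, seed=0):
    """Estimate P(stroke within `radius` of `point`) for a bivariate Gaussian
    location error with the given axis standard deviations and orientation."""
    rng = random.Random(seed)
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    hits = 0
    for _ in range(n_samples):
        # Sample along the ellipse axes, then rotate into map coordinates.
        u = rng.gauss(0.0, sigma_major)
        v = rng.gauss(0.0, sigma_minor)
        x = mean[0] + u * cos_a - v * sin_a
        y = mean[1] + u * sin_a + v * cos_a
        if math.hypot(x - point[0], y - point[1]) <= radius:
            hits += 1
    return hits / n_samples

# Hypothetical stroke at the origin, point of interest offset from the ellipse.
p_near = prob_within_radius((0.0, 0.0), 0.8, 0.3, math.radians(30),
                            point=(1.0, 0.5), radius=1.0)
```

For a circular error distribution centered on the point of interest the result can be checked against the closed form 1 - exp(-r²/(2σ²)).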
Entropy-based goodness-of-fit test: Application to the Pareto distribution

NASA Astrophysics Data System (ADS)

Lequesne, Justine

2013-08-01

Goodness-of-fit tests based on entropy have been introduced in [13] for testing normality. The maximum entropy distribution in a class of probability distributions defined by linear constraints induces a Pythagorean equality between the Kullback-Leibler information and an entropy difference. This allows one to propose a goodness-of-fit test for maximum entropy parametric distributions which is based on the Kullback-Leibler information. We will focus on the application of the method to the Pareto distribution. The power of the proposed test is computed through Monte Carlo simulation.
Estimating occupancy and abundance using aerial images with imperfect detection

USGS Publications Warehouse

Williams, Perry J.; Hooten, Mevin B.; Womble, Jamie N.; Bower, Michael R.

2017-01-01

Species distribution and abundance are critical population characteristics for efficient management, conservation, and ecological insight. Point process models are a powerful tool for modelling distribution and abundance, and can incorporate many data types, including count data, presence-absence data, and presence-only data. Aerial photographic images are a natural tool for collecting data to fit point process models, but aerial images do not always capture all animals that are present at a site. Methods for estimating detection probability for aerial surveys usually include collecting auxiliary data to estimate the proportion of time animals are available to be detected. We developed an approach for fitting point process models using an N-mixture model framework to estimate detection probability for aerial occupancy and abundance surveys. Our method uses multiple aerial images taken of animals at the same spatial location to provide temporal replication of sample sites. The intersection of the images provides multiple counts of individuals at different times. We examined this approach using both simulated and real data of sea otters (Enhydra lutris kenyoni) in Glacier Bay National Park, southeastern Alaska. Using our proposed methods, we estimated detection probability of sea otters to be 0.76, the same as visual aerial surveys that have been used in the past. Further, simulations demonstrated that our approach is a promising tool for estimating occupancy, abundance, and detection probability from aerial photographic surveys. Our methods can be readily extended to data collected using unmanned aerial vehicles, as technology and regulations permit. The generality of our methods for other aerial surveys depends on how well surveys can be designed to meet the assumptions of N-mixture models.
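Repeated counts at one site identify detection probability through the basic binomial-Poisson N-mixture likelihood, which sums over the unknown true abundance. The sketch below shows that likelihood for a single site only; it omits the point process component of the authors' model, and the function names, the truncation bound n_max, and the example counts are hypothetical.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability of abundance n with mean lam (lam > 0)."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def binom_pmf(y, n, p):
    """Probability of counting y of n animals with detection probability p."""
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)

def site_likelihood(counts, lam, p, n_max=200):
    """Marginal likelihood of replicate counts at one site, summing over the
    latent abundance N from max(counts) up to a truncation bound n_max."""
    total = 0.0
    for n in range(max(counts), n_max + 1):
        term = poisson_pmf(n, lam)
        for y in counts:
            term *= binom_pmf(y, n, p)
        total += term
    return total

# Hypothetical counts from three overlapping images of the same location.
lik = site_likelihood([7, 9, 8], lam=10.0, p=0.76)
```

With a single count per site the likelihood collapses to a Poisson with mean lam × p, so abundance and detection cannot be separated; the replicate images are what make the two parameters estimable.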
Bayesian analysis of multimodal data and brain imaging

NASA Astrophysics Data System (ADS)

Assadi, Amir H.; Eghbalnia, Hamid; Backonja, Miroslav; Wakai, Ronald T.; Rutecki, Paul; Haughton, Victor

2000-06-01

It is often the case that information about a process can be obtained using a variety of methods. Each method is employed because of specific advantages over the competing alternatives. An example in medical neuro-imaging is the choice between fMRI and MEG modes where fMRI can provide high spatial resolution in comparison to the superior temporal resolution of MEG. The combination of data from varying modes provides the opportunity to infer results that may not be possible by means of any one mode alone. We discuss a Bayesian and learning theoretic framework for enhanced feature extraction that is particularly suited to multi-modal investigations of massive data sets from multiple experiments. In the following Bayesian approach, acquired knowledge (information) regarding various aspects of the process are all directly incorporated into the formulation. This information can come from a variety of sources. In our case, it represents statistical information obtained from other modes of data collection. The information is used to train a learning machine to estimate a probability distribution, which is used in turn by a second machine as a prior, in order to produce a more refined estimation of the distribution of events. The computational demand of the algorithm is handled by proposing a distributed parallel implementation on a cluster of workstations that can be scaled to address real-time needs if required. We provide a simulation of these methods on a set of synthetically generated MEG and EEG data. We show how spatial and temporal resolutions improve by using prior distributions. The method on fMRI signals permits one to construct the probability distribution of the non-linear hemodynamics of the human brain (real data). These computational results are in agreement with biologically based measurements of other labs, as reported to us by researchers from UK. 
We also provide preliminary analysis involving multi-electrode cortical recording that accompanies behavioral data in pain experiments on freely moving mice subjected to moderate heat delivered by an electric bulb. Summary of new or breakthrough ideas: (1) A new method to estimate the probability distribution for measurement of nonlinear hemodynamics of the brain from multi-modal neuronal data. This is the first time that such an idea is tried, to our knowledge. (2) Breakthrough in improvement of time resolution of fMRI signals using (1) above.
Measurement of the top quark mass using template methods on dilepton events in pp̄ collisions at √s = 1.96 TeV

DOE Office of Scientific and Technical Information (OSTI.GOV)

Abulencia, A.; Acosta, D.; Adelman, Jahred A.

2006-02-01

The authors describe a measurement of the top quark mass from events produced in pp̄ collisions at a center-of-mass energy of 1.96 TeV, using the Collider Detector at Fermilab. They identify tt̄ candidates where both W bosons from the top quarks decay into leptons (eν, μν, or τν) from a data sample of 360 pb⁻¹. The top quark mass is reconstructed in each event separately by three different methods, which draw upon simulated distributions of the neutrino pseudorapidity, tt̄ longitudinal momentum, or neutrino azimuthal angle in order to extract probability distributions for the top quark mass. For each method, representative mass distributions, or templates, are constructed from simulated samples of signal and background events, and parameterized to form continuous probability density functions. A likelihood fit incorporating these parameterized templates is then performed on the data sample masses in order to derive a final top quark mass. Combining the three template methods, taking into account correlations in their statistical and systematic uncertainties, results in a top quark mass measurement of 170.1 ± 6.0 (stat.) ± 4.1 (syst.) GeV/c².
Probabilistic Analysis of a Composite Crew Module

NASA Technical Reports Server (NTRS)

Mason, Brian H.; Krishnamurthy, Thiagarajan

2011-01-01

An approach for conducting reliability-based analysis (RBA) of a Composite Crew Module (CCM) is presented. The goal is to identify and quantify the benefits of probabilistic design methods for the CCM and future space vehicles. The coarse finite element model from a previous NASA Engineering and Safety Center (NESC) project is used as the baseline deterministic analysis model to evaluate the performance of the CCM using a strength-based failure index. The first step in the probabilistic analysis process is the determination of the uncertainty distributions for key parameters in the model. Analytical data from water landing simulations are used to develop an uncertainty distribution, but such data were unavailable for other load cases. The uncertainty distributions for the other load scale factors and the strength allowables are generated based on assumed coefficients of variation. Probability of first-ply failure is estimated using three methods: the first order reliability method (FORM), Monte Carlo simulation, and conditional sampling. Results for the three methods were consistent. The reliability is shown to be driven by first-ply failure in one region of the CCM at the high altitude abort load set. The final predicted probability of failure is on the order of 10⁻¹¹ due to the conservative nature of the factors of safety on the deterministic loads.
Valid statistical inference methods for a case-control study with missing data.

PubMed

Tian, Guo-Liang; Zhang, Chi; Jiang, Xuejun

2018-04-01

The main objective of this paper is to derive the valid sampling distribution of the observed counts in a case-control study with missing data under the assumption of missing at random by employing the conditional sampling method and the mechanism augmentation method. The proposed sampling distribution, called the case-control sampling distribution, can be used to calculate the standard errors of the maximum likelihood estimates of parameters via the Fisher information matrix and to generate independent samples for constructing small-sample bootstrap confidence intervals. Theoretical comparisons of the new case-control sampling distribution with two existing sampling distributions exhibit a large difference. Simulations are conducted to investigate the influence of the three different sampling distributions on statistical inferences. One finding is that the conclusion by the Wald test for testing independency under the two existing sampling distributions could be completely different (even contradictory) from the Wald test for testing the equality of the success probabilities in control/case groups under the proposed distribution. A real cervical cancer data set is used to illustrate the proposed statistical methods.
Count distribution for mixture of two exponentials as renewal process duration with applications

NASA Astrophysics Data System (ADS)

Low, Yeh Ching; Ong, Seng Huat

2016-06-01

A count distribution is presented by considering a renewal process where the distribution of the duration is a finite mixture of exponential distributions. This distribution is able to model overdispersion, a feature often found in observed count data. The computation of the probabilities and the renewal function (expected number of renewals) is examined. Parameter estimation by the method of maximum likelihood is considered with applications of the count distribution to real frequency count data exhibiting overdispersion. It is shown that the mixture of exponentials count distribution fits overdispersed data better than the Poisson process and serves as an alternative to the gamma count distribution.
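Why a mixture of exponentials as the renewal duration produces overdispersed counts can be seen by simulation. The sketch below is illustrative only: it simulates the process rather than computing the count probabilities analytically as the paper does, and the mixture weight, rates, horizon, and function names are hypothetical.

```python
import random
import statistics

def sample_duration(rng, weight, rate1, rate2):
    """Draw a duration from a two-component mixture of exponentials."""
    rate = rate1 if rng.random() < weight else rate2
    return rng.expovariate(rate)

def renewal_count(rng, horizon, weight, rate1, rate2):
    """Number of renewals completed in the interval (0, horizon]."""
    elapsed, count = 0.0, 0
    while True:
        elapsed += sample_duration(rng, weight, rate1, rate2)
        if elapsed > horizon:
            return count
        count += 1

def dispersion_index(weight, rate1, rate2, horizon=10.0, n=5000, seed=1):
    """Variance-to-mean ratio of simulated counts (1 for a Poisson process)."""
    rng = random.Random(seed)
    counts = [renewal_count(rng, horizon, weight, rate1, rate2) for _ in range(n)]
    return statistics.variance(counts) / statistics.mean(counts)

mixed = dispersion_index(weight=0.5, rate1=0.5, rate2=5.0)    # mixture of rates
poisson = dispersion_index(weight=0.5, rate1=1.0, rate2=1.0)  # single rate
```

When both rates are equal the process is Poisson and the index stays near 1; unequal rates give durations with a coefficient of variation above 1, which pushes the variance of the counts above their mean.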
Exposure Models for the Prior Distribution in Bayesian Decision Analysis for Occupational Hygiene Decision Making

PubMed Central

Lee, Eun Gyung; Kim, Seung Won; Feigley, Charles E.; Harper, Martin

2015-01-01

This study introduces two semi-quantitative methods, Structured Subjective Assessment (SSA) and Control of Substances Hazardous to Health (COSHH) Essentials, in conjunction with two-dimensional Monte Carlo simulations for determining prior probabilities. Prior distribution using expert judgment was included for comparison. Practical applications of the proposed methods were demonstrated using personal exposure measurements of isoamyl acetate in an electronics manufacturing facility and of isopropanol in a printing shop. Applicability of these methods in real workplaces was discussed based on the advantages and disadvantages of each method. Although these methods could not be completely independent of expert judgments, this study demonstrated a methodological improvement in the estimation of the prior distribution for the Bayesian decision analysis tool. The proposed methods provide a logical basis for the decision process by considering determinants of worker exposure. PMID:23252451
A Statistical Study of the Mass Distribution of Neutron Stars

NASA Astrophysics Data System (ADS)

Cheng, Zheng; Zhang, Cheng-Min; Zhao, Yong-Heng; Wang, De-Hua; Pan, Yuan-Yue; Lei, Ya-Juan

2014-07-01

By reviewing the methods of mass measurements of neutron stars in four different kinds of systems, i.e., the high-mass X-ray binaries (HMXBs), low-mass X-ray binaries (LMXBs), double neutron star systems (DNSs) and neutron star-white dwarf (NS-WD) binary systems, we have collected the orbital parameters of 40 systems. By using the bootstrap method and the Monte Carlo method, we have rebuilt the likelihood probability curves of the measured masses of 46 neutron stars. The statistical analysis of the simulation results shows that the masses of neutron stars in the X-ray neutron star systems and those in the radio pulsar systems exhibit different distributions. Besides, the Bayes statistics of these four different kinds of systems yields the most-probable probability density distributions of these four kinds of systems to be (1.340 ± 0.230) M⊙, (1.505 ± 0.125) M⊙, (1.335 ± 0.055) M⊙ and (1.495 ± 0.225) M⊙, respectively. It is noteworthy that the masses of neutron stars in the HMXB and DNS systems are smaller than those in the other two kinds of systems by approximately 0.16 M⊙. This result is consistent with the theoretical model of the pulsar to be accelerated to the millisecond order of magnitude via accretion of approximately 0.2 M⊙. If the HMXBs and LMXBs are respectively taken to be the precursors of the BNS and NS-WD systems, then the influence of the accretion effect on the masses of neutron stars in the HMXB systems should be exceedingly small. Their mass distributions should be very close to the initial one during the formation of neutron stars. As for the LMXB and NS-WD systems, they should have already undergone the process of sufficient accretion, hence there arises a rather large deviation from the initial mass distribution.
Combined statistical analysis of landslide release and propagation

NASA Astrophysics Data System (ADS)

Mergili, Martin; Rohmaneo, Mohammad; Chu, Hone-Jay

2016-04-01

Statistical methods - often coupled with stochastic concepts - are commonly employed to relate areas affected by landslides with environmental layers, and to estimate spatial landslide probabilities by applying these relationships. However, such methods only concern the release of landslides, disregarding their motion. Conceptual models for mass flow routing are used for estimating landslide travel distances and possible impact areas. Automated approaches combining release and impact probabilities are rare. The present work attempts to fill this gap by a fully automated procedure combining statistical and stochastic elements, building on the open source GRASS GIS software: (1) The landslide inventory is subset into release and deposition zones. (2) We employ a traditional statistical approach to estimate the spatial release probability of landslides. (3) We back-calculate the probability distribution of the angle of reach of the observed landslides, employing the software tool r.randomwalk. One set of random walks is routed downslope from each pixel defined as release area. Each random walk stops when leaving the observed impact area of the landslide. (4) The cumulative probability function (cdf) derived in (3) is used as input to route a set of random walks downslope from each pixel in the study area through the DEM, assigning the probability gained from the cdf to each pixel along the path (impact probability). The impact probability of a pixel is defined as the average impact probability of all sets of random walks impacting a pixel. Further, the average release probabilities of the release pixels of all sets of random walks impacting a given pixel are stored along with the area of the possible release zone. (5) We compute the zonal release probability by increasing the release probability according to the size of the release zone - the larger the zone, the larger the probability that a landslide will originate from at least one pixel within this zone. 
We quantify this relationship by a set of empirical curves. (6) Finally, we multiply the zonal release probability with the impact probability in order to estimate the combined impact probability for each pixel. We demonstrate the model with a 167 km² study area in Taiwan, using an inventory of landslides triggered by the typhoon Morakot. Analyzing the model results leads us to a set of key conclusions: (i) The average composite impact probability over the entire study area corresponds well to the density of observed landslide pixels. Therefore we conclude that the method is valid in general, even though the concept of the zonal release probability bears some conceptual issues that have to be kept in mind. (ii) The parameters used as predictors cannot fully explain the observed distribution of landslides. The size of the release zone influences the composite impact probability to a larger degree than the pixel-based release probability. (iii) The prediction rate increases considerably when excluding the largest, deep-seated landslides from the analysis. We conclude that such landslides are mainly related to geological features hardly reflected in the predictor layers used.
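Steps (5) and (6) can be illustrated with a simplified stand-in. The authors scale the release probability with empirical curves; the sketch below instead assumes independent release pixels, which is a hypothetical simplification chosen only to show why a larger zone raises the probability of at least one release. Function names and values are illustrative.

```python
def zonal_release_probability(pixel_probs):
    """P(at least one pixel in the zone releases), assuming independence
    between pixels (a simplification, not the authors' empirical curves)."""
    none_release = 1.0
    for p in pixel_probs:
        none_release *= (1.0 - p)
    return 1.0 - none_release

def combined_impact_probability(pixel_probs, impact_probability):
    """Step (6): zonal release probability times impact probability."""
    return zonal_release_probability(pixel_probs) * impact_probability

# Hypothetical release zone of 20 pixels, each with release probability 0.02.
small_zone = zonal_release_probability([0.02] * 5)
large_zone = zonal_release_probability([0.02] * 20)
combined = combined_impact_probability([0.02] * 20, impact_probability=0.4)
```

Under this assumption the zonal probability grows quickly with zone size, consistent with conclusion (ii) that zone size can dominate the pixel-based release probability.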
Electron number probability distributions for correlated wave functions.

PubMed

Francisco, E; Martín Pendás, A; Blanco, M A

2007-03-07

Efficient formulas for computing the probability of finding exactly an integer number of electrons in an arbitrarily chosen volume are only known for single-determinant wave functions [E. Cances et al., Theor. Chem. Acc. 111, 373 (2004)]. In this article, an algebraic method is presented that extends these formulas to the case of multideterminant wave functions and any number of disjoint volumes. The derived expressions are applied to compute the probabilities within the atomic domains derived from the space partitioning based on the quantum theory of atoms in molecules. Results for a series of test molecules are presented, paying particular attention to the effects of electron correlation and of some numerical approximations on the computed probabilities.
Modeling stream fish distributions using interval-censored detection times.

PubMed

Ferreira, Mário; Filipe, Ana Filipa; Bardos, David C; Magalhães, Maria Filomena; Beja, Pedro

2016-08-01

Controlling for imperfect detection is important for developing species distribution models (SDMs). Occupancy-detection models based on the time needed to detect a species can be used to address this problem, but this is hindered when times to detection are not known precisely. Here, we extend the time-to-detection model to deal with detections recorded in time intervals and illustrate the method using a case study on stream fish distribution modeling. We collected electrofishing samples of six fish species across a Mediterranean watershed in Northeast Portugal. Based on a Bayesian hierarchical framework, we modeled the probability of water presence in stream channels, and the probability of species occupancy conditional on water presence, in relation to environmental and spatial variables. We also modeled time-to-first detection conditional on occupancy in relation to local factors, using modified interval-censored exponential survival models. Posterior distributions of occupancy probabilities derived from the models were used to produce species distribution maps. Simulations indicated that the modified time-to-detection model provided unbiased parameter estimates despite interval-censoring. There was a tendency for spatial variation in detection rates to be primarily influenced by depth and, to a lesser extent, stream width. Species occupancies were consistently affected by stream order, elevation, and annual precipitation. Bayesian P-values and AUCs indicated that all models had adequate fit and high discrimination ability, respectively. Mapping of predicted occupancy probabilities showed widespread distribution by most species, but uncertainty was generally higher in tributaries and upper reaches. The interval-censored time-to-detection model provides a practical solution to model occupancy-detection when detections are recorded in time intervals. 
This modeling framework is useful for developing SDMs while controlling for variation in detection rates, as it uses simple data that can be readily collected by field ecologists.
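The effect of recording detections in time intervals can be seen in the likelihood of a basic exponential time-to-detection occupancy model. The sketch below is a simplified version under stated assumptions: it ignores the water-presence layer and the covariates of the authors' hierarchical model, and the function names and parameter values are hypothetical.

```python
import math

def lik_detected(psi, rate, t_lo, t_hi):
    """Likelihood of a first detection in the interval (t_lo, t_hi]:
    the site is occupied and the exponential detection time falls inside."""
    return psi * (math.exp(-rate * t_lo) - math.exp(-rate * t_hi))

def lik_not_detected(psi, rate, t_max):
    """Likelihood of no detection by t_max: the site is unoccupied, or it is
    occupied but the detection time exceeds the survey length."""
    return (1.0 - psi) + psi * math.exp(-rate * t_max)

# Hypothetical survey of 15 minutes recorded in three 5-minute intervals.
psi, rate = 0.6, 0.2
interval_liks = [lik_detected(psi, rate, a, a + 5.0) for a in (0.0, 5.0, 10.0)]
total = sum(interval_liks) + lik_not_detected(psi, rate, 15.0)
```

The interval probabilities and the non-detection probability sum to one, so the interval-censored outcomes form a proper distribution over everything a survey can record.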
Bayesian image reconstruction - The pixon and optimal image modeling

NASA Technical Reports Server (NTRS)

Pina, R. K.; Puetter, R. C.

1993-01-01

In this paper we describe the optimal image model, maximum residual likelihood method (OptMRL) for image reconstruction. OptMRL is a Bayesian image reconstruction technique for removing point-spread function blurring. OptMRL uses both a goodness-of-fit criterion (GOF) and an 'image prior', i.e., a function which quantifies the a priori probability of the image. Unlike standard maximum entropy methods, which typically reconstruct the image on the data pixel grid, OptMRL varies the image model in order to find the optimal functional basis with which to represent the image. We show how an optimal basis for image representation can be selected and in doing so, develop the concept of the 'pixon' which is a generalized image cell from which this basis is constructed. By allowing both the image and the image representation to be variable, the OptMRL method greatly increases the volume of solution space over which the image is optimized. Hence the likelihood of the final reconstructed image is greatly increased. For the goodness-of-fit criterion, OptMRL uses the maximum residual likelihood probability distribution introduced previously by Pina and Puetter (1992). This GOF probability distribution, which is based on the spatial autocorrelation of the residuals, has the advantage that it ensures spatially uncorrelated image reconstruction residuals.
On the SIMS Ionization Probability of Organic Molecules.

PubMed

Popczun, Nicholas J; Breuer, Lars; Wucher, Andreas; Winograd, Nicholas

2017-06-01

The prospect of improved secondary ion yields for secondary ion mass spectrometry (SIMS) experiments drives innovation of new primary ion sources, instrumentation, and post-ionization techniques. The largest factor affecting secondary ion efficiency is believed to be the poor ionization probability (α⁺) of sputtered material, a value rarely measured directly, but estimated to be in some cases as low as 10⁻⁵. Our lab has developed a method for the direct determination of α⁺ in a SIMS experiment using laser post-ionization (LPI) to detect neutral molecular species in the sputtered plume for an organic compound. Here, we apply this method to coronene (C₂₄H₁₂), a polyaromatic hydrocarbon that exhibits strong molecular signal during gas-phase photoionization. A two-dimensional spatial distribution of sputtered neutral molecules is measured and presented. It is shown that the ionization probability of molecular coronene desorbed from a clean film under bombardment with 40 keV C₆₀ cluster projectiles is of the order of 10⁻³, with some remaining uncertainty arising from laser-induced fragmentation and possible differences in the emission velocity distributions of neutral and ionized molecules. In general, this work establishes a method to estimate the ionization efficiency of molecular species sputtered during a single bombardment event.
Information retrieval from wide-band meteorological data - An example

NASA Technical Reports Server (NTRS)

Adelfang, S. I.; Smith, O. E.

1983-01-01

The methods proposed by Smith and Adelfang (1981) and Smith et al. (1982) are used to calculate probabilities over rectangles and sectors of the gust magnitude-gust length plane; probabilities over the same regions are also calculated from the observed distributions and a comparison is also presented to demonstrate the accuracy of the statistical model. These and other statistical results are calculated from samples of Jimsphere wind profiles at Cape Canaveral. The results are presented for a variety of wavelength bands, altitudes, and seasons. It is shown that wind perturbations observed in Jimsphere wind profiles in various wavelength bands can be analyzed by using digital filters. The relationship between gust magnitude and gust length is modeled with the bivariate gamma distribution. It is pointed out that application of the model to calculate probabilities over specific areas of the gust magnitude-gust length plane can be useful in aerospace design.

Time-Varying Transition Probability Matrix Estimation and Its Application to Brand Share Analysis.

PubMed

Chiba, Tomoaki; Hino, Hideitsu; Akaho, Shotaro; Murata, Noboru

2017-01-01

In a product market or stock market, different products or stocks compete for the same consumers or purchasers. We propose a method to estimate the time-varying transition matrix of the product share using a multivariate time series of the product share. The method is based on the assumption that each of the observed time series of shares is a stationary distribution of the underlying Markov processes characterized by transition probability matrices. We estimate transition probability matrices for every observation under natural assumptions. We demonstrate, on a real-world dataset of the share of automobiles, that the proposed method can find intrinsic transition of shares. The resulting transition matrices reveal interesting phenomena, for example, the change in flows between TOYOTA group and GM group for the fiscal year where TOYOTA group's sales beat GM's sales, which is a reasonable scenario.
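The modelling assumption above, that each observed vector of shares is the stationary distribution of a Markov chain, can be made concrete by computing the stationary distribution of a given transition matrix. The sketch shows only this forward direction; the authors solve the inverse problem of estimating the matrices from the shares. The function name and the example matrix are hypothetical.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Stationary distribution of a row-stochastic matrix P by repeatedly
    applying pi <- pi P until the change falls below `tol`."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

# Hypothetical two-brand market: rows give where each brand's customers go.
P = [[0.9, 0.1],
     [0.2, 0.8]]
shares = stationary_distribution(P)
```

Many different transition matrices share the same stationary distribution, which is why the estimation problem in the abstract needs additional assumptions to pin down a matrix for each observation.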
Time-Varying Transition Probability Matrix Estimation and Its Application to Brand Share Analysis

PubMed Central

Chiba, Tomoaki; Akaho, Shotaro; Murata, Noboru

2017-01-01

In a product market or stock market, different products or stocks compete for the same consumers or purchasers. We propose a method to estimate the time-varying transition matrix of the product share using a multivariate time series of the product share. The method is based on the assumption that each of the observed time series of shares is a stationary distribution of the underlying Markov processes characterized by transition probability matrices. We estimate transition probability matrices for every observation under natural assumptions. We demonstrate, on a real-world dataset of the share of automobiles, that the proposed method can find intrinsic transition of shares. The resulting transition matrices reveal interesting phenomena, for example, the change in flows between TOYOTA group and GM group for the fiscal year where TOYOTA group’s sales beat GM’s sales, which is a reasonable scenario. PMID:28076383
The estimation of lower refractivity uncertainty from radar sea clutter using the Bayesian-MCMC method

NASA Astrophysics Data System (ADS)

Sheng, Zheng

2013-02-01

The estimation of lower atmospheric refractivity from radar sea clutter (RFC) is a complicated nonlinear optimization problem. This paper deals with the RFC problem in a Bayesian framework. It uses the unbiased Markov Chain Monte Carlo (MCMC) sampling technique, which can provide accurate posterior probability distributions of the estimated refractivity parameters by using an electromagnetic split-step fast Fourier transform terrain parabolic equation propagation model within a Bayesian inversion framework. In contrast to a global optimization algorithm, the Bayesian-MCMC method obtains not only approximate solutions but also the probability distributions of the solutions, that is, an uncertainty analysis of the solutions. The Bayesian-MCMC algorithm is applied to simulated and real radar sea-clutter data. Reference data are assumed to be simulation data, and refractivity profiles are obtained using a helicopter. The inversion algorithm is assessed (i) by comparing the estimated refractivity profiles with the assumed simulation data and the helicopter sounding data, and (ii) by examining the one-dimensional (1D) and two-dimensional (2D) posterior probability distributions of the solutions.
Dynamic probability control limits for risk-adjusted Bernoulli CUSUM charts.

PubMed

Zhang, Xiang; Woodall, William H

2015-11-10

The risk-adjusted Bernoulli cumulative sum (CUSUM) chart developed by Steiner et al. (2000) is an increasingly popular tool for monitoring clinical and surgical performance. In practice, however, the use of a fixed control limit for the chart leads to a quite variable in-control average run length performance for patient populations with different risk score distributions. To overcome this problem, we determine simulation-based dynamic probability control limits (DPCLs) patient-by-patient for the risk-adjusted Bernoulli CUSUM charts. By maintaining the probability of a false alarm at a constant level conditional on no false alarm for previous observations, our risk-adjusted CUSUM charts with DPCLs have consistent in-control performance at the desired level with approximately geometrically distributed run lengths. Our simulation results demonstrate that our method does not rely on any information or assumptions about the patients' risk distributions. The use of DPCLs for risk-adjusted Bernoulli CUSUM charts allows each chart to be designed for the corresponding particular sequence of patients for a surgeon or hospital. Copyright © 2015 John Wiley & Sons, Ltd.
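The chart described above accumulates a log-likelihood-ratio score for each patient. The sketch below shows the standard risk-adjusted Bernoulli CUSUM update for detecting a shift in the odds of an adverse event from 1 to `odds_ratio`. It uses a fixed control limit purely for illustration; the simulation-based dynamic probability control limits are not reproduced, and the outcome and risk values are invented.

```python
import math


def cusum_weight(y, p, odds_ratio=2.0):
    """Log-likelihood-ratio score for outcome y (1 = adverse event) of a
    patient with predicted risk p, testing odds ratio 1 against odds_ratio."""
    denom = 1.0 - p + odds_ratio * p
    return math.log(odds_ratio / denom) if y == 1 else math.log(1.0 / denom)


def risk_adjusted_cusum(outcomes, risks, limit, odds_ratio=2.0):
    """Return the CUSUM path and the index of the first signal (or None)."""
    s, path, signal_at = 0.0, [], None
    for t, (y, p) in enumerate(zip(outcomes, risks)):
        s = max(0.0, s + cusum_weight(y, p, odds_ratio))  # reflect at zero
        path.append(s)
        if signal_at is None and s > limit:
            signal_at = t
    return path, signal_at


# Invented patient sequence: outcomes and model-predicted risks.
outcomes = [0, 0, 1, 1, 1, 0, 1, 1]
risks = [0.05, 0.10, 0.08, 0.20, 0.05, 0.15, 0.10, 0.07]
path, signal_at = risk_adjusted_cusum(outcomes, risks, limit=2.0)
```

A dynamic limit would replace the constant `limit` with a value recomputed for each patient so that the conditional false-alarm probability stays constant.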
Expert Elicitations of 2100 Emission of CO2

NASA Astrophysics Data System (ADS)

Ho, Emily; Bosetti, Valentina; Budescu, David; Keller, Klaus; van Vuuren, Detlef

2017-04-01

Emission scenarios such as Shared Socioeconomic Pathways (SSPs) and Representative Concentration Pathways (RCPs) are used intensively for climate research (e.g. climate change projections) and policy analysis. While the range of these scenarios provides an indication of uncertainty, these scenarios are typically not associated with probability values. Some studies (e.g. Vuuren et al, 2007; Gillingham et al., 2015) took a different approach associating baseline emission pathways (conditionally) with probability distributions. This paper summarizes three studies where climate change experts were asked to conduct pair-wise comparisons of possible ranges of 2100 greenhouse gas emissions and rate the relative likelihood of the ranges. The elicitation was performed under two sets of assumptions: 1) a situation where no climate policies are introduced beyond the ones already in place (baseline scenario), and 2) a situation in which countries have ratified the voluntary policies in line with the long term target embedded in the 2015 Paris Agreement. These indirect relative judgments were used to construct subjective cumulative distribution functions. We show that by using a ratio scaling method that invokes relative likelihoods of scenarios, a subjective probability distribution can be derived for each expert that expresses their beliefs in the projected greenhouse gas emissions range in 2100. This method is shown to elicit stable estimates that require minimal adjustment and is relatively invariant to the partition of the domain of interest. Experts also rated the method as being easy and intuitive to use. We also report results of a study that allowed participants to choose their own ranges of greenhouse gas emissions to remove potential anchoring bias. We discuss the implications of the use of this method for facilitating comparison and communication of beliefs among diverse users of climate science research.
Standardized likelihood ratio test for comparing several log-normal means and confidence interval for the common mean.

PubMed

Krishnamoorthy, K; Oral, Evrim

2017-12-01

Standardized likelihood ratio test (SLRT) for testing the equality of means of several log-normal distributions is proposed. The properties of the SLRT and an available modified likelihood ratio test (MLRT) and a generalized variable (GV) test are evaluated by Monte Carlo simulation and compared. Evaluation studies indicate that the SLRT is accurate even for small samples, whereas the MLRT could be quite liberal for some parameter values, and the GV test is in general conservative and less powerful than the SLRT. Furthermore, a closed-form approximate confidence interval for the common mean of several log-normal distributions is developed using the method of variance estimate recovery, and compared with the generalized confidence interval with respect to coverage probabilities and precision. Simulation studies indicate that the proposed confidence interval is accurate and better than the generalized confidence interval in terms of coverage probabilities. The methods are illustrated using two examples.
Examples of measurement uncertainty evaluations in accordance with the revised GUM

NASA Astrophysics Data System (ADS)

Runje, B.; Horvatic, A.; Alar, V.; Medic, S.; Bosnjakovic, A.

2016-11-01

The paper presents examples of the evaluation of uncertainty components in accordance with the current and revised Guide to the expression of uncertainty in measurement (GUM). In accordance with the proposed revision of the GUM, a Bayesian approach was conducted for both type A and type B evaluations. The law of propagation of uncertainty (LPU) and the law of propagation of distributions applied through the Monte Carlo method (MCM) were used to evaluate associated standard uncertainties, expanded uncertainties and coverage intervals. Furthermore, the influence of a non-Gaussian dominant input quantity and an asymmetric distribution of the output quantity y on the evaluation of measurement uncertainty was analyzed. In the case when the coverage interval is not probabilistically symmetric, the coverage interval for the probability P is estimated from the experimental probability density function using the Monte Carlo method. Key highlights of the proposed revision of the GUM were analyzed through a set of examples.
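Propagation of distributions by Monte Carlo can be sketched in a few lines. The example below is an illustration under stated assumptions, not the paper's worked cases: the measurement model Y = X1 + X2, the input distributions, and the helper names `propagate` and `coverage_interval` are all hypothetical. It returns the probabilistically symmetric interval; a shortest-interval search would be needed for strongly asymmetric outputs.

```python
import random


def coverage_interval(samples, p=0.95):
    """Probabilistically symmetric coverage interval from Monte Carlo draws."""
    s = sorted(samples)
    n = len(s)
    k = int(round((1.0 - p) / 2.0 * n))  # number of draws trimmed per tail
    return s[k], s[n - 1 - k]


def propagate(model, input_samplers, n_draws=100_000, seed=0):
    """Draw every input quantity and push each draw through the model."""
    rng = random.Random(seed)
    return [model(*(draw(rng) for draw in input_samplers)) for _ in range(n_draws)]


# Hypothetical model Y = X1 + X2 with a dominant non-Gaussian input.
samplers = [
    lambda rng: rng.gauss(10.0, 0.1),    # Gaussian input quantity
    lambda rng: rng.uniform(-0.5, 0.5),  # dominant rectangular input quantity
]
y = propagate(lambda x1, x2: x1 + x2, samplers)
low, high = coverage_interval(y, 0.95)
```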
Space Object Collision Probability via Monte Carlo on the Graphics Processing Unit

NASA Astrophysics Data System (ADS)

Vittaldev, Vivek; Russell, Ryan P.

2017-09-01

Fast and accurate collision probability computations are essential for protecting space assets. Monte Carlo (MC) simulation is the most accurate but computationally intensive method. A Graphics Processing Unit (GPU) is used to parallelize the computation and reduce the overall runtime. Using MC techniques to compute the collision probability is common in literature as the benchmark. An optimized implementation on the GPU, however, is a challenging problem and is the main focus of the current work. The MC simulation takes samples from the uncertainty distributions of the Resident Space Objects (RSOs) at any time during a time window of interest and outputs the separations at closest approach. Therefore, any uncertainty propagation method may be used and the collision probability is automatically computed as a function of RSO collision radii. Integration using a fixed time step and a quartic interpolation after every Runge Kutta step ensures that no close approaches are missed. Two orders of magnitude speedups over a serial CPU implementation are shown, and speedups improve moderately with higher fidelity dynamics. The tool makes the MC approach tractable on a single workstation, and can be used as a final product, or for verifying surrogate and analytical collision probability methods.
Efficient Iris Recognition Based on Optimal Subfeature Selection and Weighted Subregion Fusion

PubMed Central

Deng, Ning

2014-01-01

In this paper, we propose three discriminative feature selection strategies and a weighted subregion matching method to improve the performance of iris recognition systems. Firstly, we introduce the process of feature extraction and representation based on scale invariant feature transformation (SIFT) in detail. Secondly, three strategies are described: an orientation probability distribution function (OPDF) based strategy to delete redundant feature keypoints, a magnitude probability distribution function (MPDF) based strategy to reduce the dimensionality of the feature elements, and a compound strategy combining OPDF and MPDF to further select the optimal subfeature. Thirdly, to make matching more effective, this paper proposes a novel matching method based on weighted sub-region matching fusion. Particle swarm optimization is used to speed up the search for the sub-region weights, and the weighted sub-region matching scores are then combined to generate the final decision. The experimental results, on three public and renowned iris databases (CASIA-V3 Interval, Lamp, and MMU-V1), demonstrate that our proposed methods outperform some of the existing methods in terms of correct recognition rate, equal error rate, and computation complexity. PMID:24683317
Bayesian Parameter Estimation for Heavy-Duty Vehicles

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miller, Eric; Konan, Arnaud; Duran, Adam

2017-03-28

Accurate vehicle parameters are valuable for design, modeling, and reporting. Estimating vehicle parameters can be a very time-consuming process requiring tightly-controlled experimentation. This work describes a method to estimate vehicle parameters such as mass, coefficient of drag/frontal area, and rolling resistance using data logged during standard vehicle operation. The method uses Monte Carlo to generate parameter sets, which are fed to a variant of the road load equation. Modeled road load is then compared to measured load to evaluate the probability of the parameter set. Acceptance of a proposed parameter set is determined using the probability ratio to the current state, so that the chain history will give a distribution of parameter sets. Compared to a single value, a distribution of possible values provides information on the quality of estimates and the range of possible parameter values. The method is demonstrated by estimating dynamometer parameters. Results confirm the method's ability to estimate reasonable parameter sets, and indicate an opportunity to increase the certainty of estimates through careful selection or generation of the test drive cycle.
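The acceptance rule described above (accept a proposed parameter set according to its probability ratio to the current state) is the core of a random-walk Metropolis sampler. The sketch below is a toy version under loud assumptions: the one-parameter load model `force = mass * accel + KNOWN_RESISTANCE_N`, the synthetic noise-free data, and every constant are hypothetical stand-ins for the paper's road load equation and logged data.

```python
import math
import random


def metropolis(log_prob, start, steps, n_iter, seed=0):
    """Random-walk Metropolis over a list of parameters."""
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_prob(current)
    chain = []
    for _ in range(n_iter):
        proposal = [x + rng.gauss(0.0, s) for x, s in zip(current, steps)]
        proposal_lp = log_prob(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(list(current))  # the chain history is the estimate
    return chain


# Hypothetical toy data: force = mass * accel + known resistance.
ACCELS = [0.2, 0.4, 0.6, 0.8, 1.0] * 4  # m/s^2
KNOWN_RESISTANCE_N = 1000.0
TRUE_MASS_KG = 15000.0
NOISE_SD_N = 200.0  # assumed measurement noise
MEASURED = [TRUE_MASS_KG * a + KNOWN_RESISTANCE_N for a in ACCELS]


def log_prob(params):
    """Gaussian log-likelihood of the measured loads with a flat prior on mass > 0."""
    mass = params[0]
    if mass <= 0:
        return float("-inf")
    sq = sum((f - (mass * a + KNOWN_RESISTANCE_N)) ** 2
             for a, f in zip(ACCELS, MEASURED))
    return -0.5 * sq / NOISE_SD_N ** 2


chain = metropolis(log_prob, start=[14000.0], steps=[100.0], n_iter=20000)
masses = [c[0] for c in chain[2000:]]  # discard burn-in
mean_mass = sum(masses) / len(masses)
```

The spread of `masses` gives the kind of quality information the abstract refers to; a drive cycle with a wider range of accelerations would narrow it.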
Generation of Stationary Non-Gaussian Time Histories with a Specified Cross-spectral Density

DOE PAGES

Smallwood, David O.

1997-01-01

The paper reviews several methods for the generation of stationary realizations of sampled time histories with non-Gaussian distributions and introduces a new method which can be used to control the cross-spectral density matrix and the probability density functions (pdfs) of the multiple input problem. Discussed first are two methods for the specialized case of matching the auto (power) spectrum, the skewness, and kurtosis using generalized shot noise and using polynomial functions. It is then shown that the skewness and kurtosis can also be controlled by the phase of a complex frequency domain description of the random process. The general case of matching a target probability density function using a zero memory nonlinear (ZMNL) function is then covered. Next, methods for generating vectors of random variables with a specified covariance matrix for a class of spherically invariant random vectors (SIRV) are discussed. Finally, the general case of matching the cross-spectral density matrix of a vector of inputs with non-Gaussian marginal distributions is presented.
Automated segmentation of linear time-frequency representations of marine-mammal sounds.

PubMed

Dadouchi, Florian; Gervaise, Cedric; Ioana, Cornel; Huillery, Julien; Mars, Jérôme I

2013-09-01

Many marine mammals produce highly nonlinear frequency modulations. Determining the time-frequency support of these sounds offers various applications, which include recognition, localization, and density estimation. This study introduces a low parameterized automated spectrogram segmentation method that is based on a theoretical probabilistic framework. In the first step, the background noise in the spectrogram is fitted with a Chi-squared distribution and thresholded using a Neyman-Pearson approach. In the second step, the number of false detections in time-frequency regions is modeled as a binomial distribution, and then through a Neyman-Pearson strategy, the time-frequency bins are gathered into regions of interest. The proposed method is validated on real data of large sequences of whistles from common dolphins, collected in the Bay of Biscay (France). The proposed method is also compared with two alternative approaches: the first is smoothing and thresholding of the spectrogram; the second is thresholding of the spectrogram followed by the use of morphological operators to gather the time-frequency bins and to remove false positives. This method is shown to increase the probability of detection for the same probability of false alarms.
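The two Neyman-Pearson steps described above can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the noise power in each bin is taken to be a scaled chi-squared variable with 2 degrees of freedom (which gives a closed-form threshold), and the function names and parameter values are hypothetical.

```python
import math


def chi2_2dof_threshold(p_false_alarm, noise_scale=1.0):
    """Threshold T with P(X > T) = p_false_alarm for X ~ noise_scale * chi2(2).
    Assumes 2 degrees of freedom, where the survival function is exp(-x / 2)."""
    return -2.0 * noise_scale * math.log(p_false_alarm)


def binomial_tail(k, n, p):
    """P(K >= k) for K ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_count_for_region(n_bins, p_bin, p_region):
    """Smallest number of above-threshold bins in a region of n_bins bins
    whose probability under noise alone is at most p_region."""
    for k in range(n_bins + 2):
        if binomial_tail(k, n_bins, p_bin) <= p_region:
            return k


threshold = chi2_2dof_threshold(0.01)                # step 1: per-bin threshold
min_count = min_count_for_region(25, 0.01, 1e-3)     # step 2: per-region count
```

A region of 25 bins would then be kept as a region of interest only if at least `min_count` of its bins exceed `threshold`.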
Efficient iris recognition based on optimal subfeature selection and weighted subregion fusion.

PubMed

Chen, Ying; Liu, Yuanning; Zhu, Xiaodong; He, Fei; Wang, Hongye; Deng, Ning

2014-01-01

In this paper, we propose three discriminative feature selection strategies and a weighted subregion matching method to improve the performance of iris recognition systems. Firstly, we introduce the process of feature extraction and representation based on scale invariant feature transformation (SIFT) in detail. Secondly, three strategies are described: an orientation probability distribution function (OPDF) based strategy to delete redundant feature keypoints, a magnitude probability distribution function (MPDF) based strategy to reduce the dimensionality of the feature elements, and a compound strategy combining OPDF and MPDF to further select the optimal subfeature. Thirdly, to make matching more effective, this paper proposes a novel matching method based on weighted sub-region matching fusion. Particle swarm optimization is used to speed up the search for the sub-region weights, and the weighted sub-region matching scores are then combined to generate the final decision. The experimental results, on three public and renowned iris databases (CASIA-V3 Interval, Lamp, and MMU-V1), demonstrate that our proposed methods outperform some of the existing methods in terms of correct recognition rate, equal error rate, and computation complexity.
Statistical deprojection of galaxy pairs

NASA Astrophysics Data System (ADS)

Nottale, Laurent; Chamaraux, Pierre

2018-06-01

Aims: The purpose of the present paper is to provide methods of statistical analysis of the physical properties of galaxy pairs. We perform this study to apply it later to catalogs of isolated pairs of galaxies, especially two new catalogs we recently constructed that contain ≈1000 and ≈13 000 pairs, respectively. We are particularly interested in the dynamics of those pairs, including the determination of their masses. Methods: We could not compute the dynamical parameters directly since the necessary data are incomplete. Indeed, we only have at our disposal one component of the intervelocity between the members, namely along the line of sight, and two components of their interdistance, i.e., the projection on the sky-plane. Moreover, we know only one point of each galaxy orbit. Hence we need statistical methods to find the probability distribution of 3D interdistances and 3D intervelocities from their projections; we refer to those methods by the term deprojection. Results: We proceed in two steps to determine and use the deprojection methods. First we derive the probability distributions expected for the various relevant projected quantities, namely the intervelocity v_z, the interdistance r_p, their ratio, and the product r_p v_z^2, which is involved in mass determination. In a second step, we propose various methods of deprojection of those parameters based on the previous analysis. We start from a histogram of the projected data and we apply inversion formulae to obtain the deprojected distributions; lastly, we test the methods by numerical simulations, which also allow us to determine the uncertainties involved.
Bivariate Rainfall and Runoff Analysis Using Shannon Entropy Theory

NASA Astrophysics Data System (ADS)

Rahimi, A.; Zhang, L.

2012-12-01

Rainfall-runoff analysis is a key component of many hydrological and hydraulic designs in which the dependence of rainfall and runoff needs to be studied. It is known that the convenient bivariate distributions are often unable to model the rainfall-runoff variables because they either constrain the range of the dependence or fix the form of the marginal distributions. Thus, this paper presents an approach to derive the entropy-based joint rainfall-runoff distribution using Shannon entropy theory. The derived distribution can model the full range of dependence and allows different specified marginals. The modeling and estimation proceed as follows: (i) univariate analysis of the marginal distributions, which includes two steps, (a) using a nonparametric statistics approach to detect modes and the underlying probability density, and (b) fitting the appropriate parametric probability density functions; (ii) definition of the constraints based on the univariate analysis and the dependence structure; (iii) derivation and validation of the entropy-based joint distribution. To validate the method, rainfall-runoff data were collected from the small agricultural experimental watersheds located in a semi-arid region near Riesel (Waco), Texas, maintained by the USDA. The results of the univariate analysis show that the rainfall variables follow the gamma distribution, whereas the runoff variables have a mixed structure and follow the mixed-gamma distribution. With this information, the entropy-based joint distribution is derived using the first moments, the first moments of logarithm-transformed rainfall and runoff, and the covariance between rainfall and runoff.
The results for the entropy-based joint distribution indicate that (1) the derived joint distribution successfully preserves the dependence between rainfall and runoff, and (2) the K-S goodness-of-fit tests confirm that the re-derived marginal distributions reproduce the underlying univariate probability densities, which further assures that the entropy-based joint rainfall-runoff distribution is satisfactorily derived. Overall, the study shows that Shannon entropy theory can be satisfactorily applied to model the dependence between rainfall and runoff. The study also shows that the entropy-based joint distribution is an appropriate approach to capture the dependence structure that cannot be captured by the convenient bivariate joint distributions.
Figure: Joint rainfall-runoff entropy-based PDF, with corresponding marginal PDFs and histograms, for the W12 watershed. Table: K-S test results and RMSE for the univariate distributions derived from the maximum-entropy-based joint probability distribution.
A brief introduction to probability.

PubMed

Di Paola, Gioacchino; Bertani, Alessandro; De Monte, Lavinia; Tuzzolino, Fabio

2018-02-01

The theory of probability has been debated for centuries: back in the 1600s, French mathematicians used the rules of probability to place and win bets. Subsequently, the knowledge of probability has significantly evolved and is now an essential tool for statistics. In this paper, the basic theoretical principles of probability will be reviewed, with the aim of facilitating the comprehension of statistical inference. After a brief general introduction on probability, we will review the concept of the "probability distribution", a function providing the probabilities of occurrence of the different possible outcomes of a categorical or continuous variable. Specific attention will be focused on the normal distribution, which is the most relevant distribution applied to statistical analysis.
Statistical computation of tolerance limits

NASA Technical Reports Server (NTRS)

Wheeler, J. T.

1993-01-01

Based on a new theory, two computer codes were developed specifically to calculate the exact statistical tolerance limits for normal distributions with unknown means and variances, for the one-sided and two-sided cases of the tolerance factor, k. The quantity k is defined equivalently in terms of the noncentral t-distribution by the probability equation. Two of the four mathematical methods employ the theory developed for the numerical simulation. Several algorithms for numerically integrating and iteratively root-solving the working equations are written to augment the program simulation. The program codes generate some tables of k's associated with the varying values of the proportion and sample size for each given probability to show accuracy obtained for small sample sizes.
Emergence and stability of intermediate open vesicles in disk-to-vesicle transitions.

PubMed

Li, Jianfeng; Zhang, Hongdong; Qiu, Feng; Shi, An-Chang

2013-07-01

The transition between two basic structures, a disk and an enclosed vesicle, of a finite membrane is studied by examining the minimum energy path (MEP) connecting these two states. The MEP is constructed using the string method applied to continuum elastic membrane models. The results reveal that, besides the commonly observed disk and vesicle, open vesicles (bowl-shaped vesicles or vesicles with a pore) can become stable or metastable shapes. The emergence, stability, and probability distribution of these open vesicles are analyzed. It is demonstrated that open vesicles can be stabilized by higher-order elastic energies. The estimated probability distribution of the different structures is in good agreement with available experiments.
A Method to Estimate the Probability That Any Individual Lightning Stroke Contacted the Surface Within Any Radius of Any Point

NASA Technical Reports Server (NTRS)

Huddleston, Lisa L.; Roeder, William; Merceret, Francis J.

2010-01-01

A technique has been developed to calculate the probability that any nearby lightning stroke is within any radius of any point of interest. In practice, this provides the probability that a nearby lightning stroke was within a key distance of a facility, rather than the error ellipses centered on the stroke. This process takes the current bivariate Gaussian distribution of probability density provided by the current lightning location error ellipse for the most likely location of a lightning stroke and integrates it to get the probability that the stroke is inside any specified radius. This new facility-centric technique will be much more useful to the space launch customers and may supersede the lightning error ellipse approach discussed in [5], [6].
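The integration described above can be sketched numerically. The example below assumes the error ellipse has already been converted to a bivariate Gaussian with standard deviations `sigma_x`, `sigma_y` and correlation `rho` (that conversion, the function name, and the grid sizes are assumptions), and integrates the density over a disk around the point of interest with a midpoint rule in polar coordinates.

```python
import math


def prob_within_radius(mean, sigma_x, sigma_y, rho, point, radius,
                       n_r=200, n_theta=360):
    """Probability that a bivariate Gaussian location with the given mean,
    standard deviations and correlation lies within `radius` of `point`."""
    one_minus_rho2 = 1.0 - rho**2
    norm = 1.0 / (2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus_rho2))
    dr = radius / n_r
    dth = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr  # midpoint of the radial cell
        for j in range(n_theta):
            th = (j + 0.5) * dth
            dx = point[0] + r * math.cos(th) - mean[0]
            dy = point[1] + r * math.sin(th) - mean[1]
            q = ((dx / sigma_x) ** 2
                 - 2.0 * rho * dx * dy / (sigma_x * sigma_y)
                 + (dy / sigma_y) ** 2)
            density = norm * math.exp(-0.5 * q / one_minus_rho2)
            total += density * r * dr * dth  # polar area element
    return total


# Circular Gaussian centred on the point of interest: closed form is
# 1 - exp(-radius^2 / (2 sigma^2)), which gives a check on the integration.
p_centered = prob_within_radius((0.0, 0.0), 1.0, 1.0, 0.0, (0.0, 0.0), 1.0)
```

For a facility away from the most likely stroke location, `point` and `mean` differ and the closed form no longer applies, which is where numerical integration earns its keep.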
Poisson statistics of PageRank probabilities of Twitter and Wikipedia networks

NASA Astrophysics Data System (ADS)

Frahm, Klaus M.; Shepelyansky, Dima L.

2014-04-01

We use the methods of quantum chaos and Random Matrix Theory for analysis of statistical fluctuations of PageRank probabilities in directed networks. In this approach the effective energy levels are given by a logarithm of PageRank probability at a given node. After the standard energy level unfolding procedure we establish that the nearest spacing distribution of PageRank probabilities is described by the Poisson law typical for integrable quantum systems. Our studies are done for the Twitter network and three networks of Wikipedia editions in English, French and German. We argue that due to absence of level repulsion the PageRank order of nearby nodes can be easily interchanged. The obtained Poisson law implies that the nearby PageRank probabilities fluctuate as random independent variables.

Effective classification of the prevalence of Schistosoma mansoni.

PubMed

Mitchell, Shira A; Pagano, Marcello

2012-12-01

To present an effective classification method based on the prevalence of Schistosoma mansoni in the community. We created decision rules (defined by cut-offs for number of positive slides), which account for imperfect sensitivity, both with a simple adjustment of fixed sensitivity and with a more complex adjustment of changing sensitivity with prevalence. To reduce screening costs while maintaining accuracy, we propose a pooled classification method. To estimate sensitivity, we use the De Vlas model for worm and egg distributions. We compare the proposed method with the standard method to investigate differences in efficiency, measured by number of slides read, and accuracy, measured by probability of correct classification. Modelling varying sensitivity lowers the lower cut-off more significantly than the upper cut-off, correctly classifying regions as moderate rather than lower, thus receiving life-saving treatment. The classification method goes directly to classification on the basis of positive pools, avoiding having to know sensitivity to estimate prevalence. For model parameter values describing worm and egg distributions among children, the pooled method with 25 slides achieves an expected 89.9% probability of correct classification, whereas the standard method with 50 slides achieves 88.7%. Among children, it is more efficient and more accurate to use the pooled method for classification of S. mansoni prevalence than the current standard method. © 2012 Blackwell Publishing Ltd.
Joint distribution of temperature and precipitation in the Mediterranean, using the Copula method

NASA Astrophysics Data System (ADS)

Lazoglou, Georgia; Anagnostopoulou, Christina

2018-03-01

This study analyses the temperature and precipitation dependence among stations in the Mediterranean. The first station group is located in the eastern Mediterranean (EM) and includes two stations, Athens and Thessaloniki, while the western (WM) one includes Malaga and Barcelona. The data were organized in two time periods, the hot-dry period and the cold-wet one, each composed of 5 months. The analysis is based on a new statistical technique in climatology: the Copula method. Firstly, the calculation of the Kendall tau correlation index showed that temperatures among stations are dependent during both time periods, whereas precipitation presents dependency only between the stations located in EM or WM and only during the cold-wet period. Accordingly, the marginal distributions were calculated for each studied station, as they are further used by the copula method. Finally, several copula families, both Archimedean and Elliptical, were tested in order to choose the most appropriate one to model the relation of the studied data sets. Consequently, this study succeeds in modelling the dependence of the main climate parameters (temperature and precipitation) with the Copula method. The Frank copula was identified as the best family to describe the joint distribution of temperature, for the majority of station groups. For precipitation, the best copula families are BB1 and Survival Gumbel. Using the probability distribution diagrams, the probability of a combination of temperature and precipitation values between stations is estimated.
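Two ingredients named above, the Kendall tau index and the Frank copula, have compact definitions that can be sketched directly. The code is illustrative only: the temperature values are invented, the parameter `theta = 5.0` is arbitrary rather than fitted, and marginal fitting and copula selection are not shown.

```python
import math


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


def frank_copula_cdf(u, v, theta):
    """Frank copula C(u, v) for theta != 0, with u and v in [0, 1]."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


# Invented monthly mean temperatures (deg C) at two stations.
station_a = [27.1, 29.4, 30.2, 28.8, 25.0]
station_b = [25.3, 27.9, 28.1, 27.5, 23.2]
tau = kendall_tau(station_a, station_b)

# Joint probability that both (uniform-transformed) values fall below 0.5.
joint = frank_copula_cdf(0.5, 0.5, theta=5.0)
```

With fitted marginals, `u` and `v` would be the marginal CDF values of the two stations' observations, and `joint` the probability of both falling below the chosen levels.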
Conditional maximum-entropy method for selecting prior distributions in Bayesian statistics

NASA Astrophysics Data System (ADS)

Abe, Sumiyoshi

2014-11-01

The conditional maximum-entropy method (abbreviated here as C-MaxEnt) is formulated for selecting prior probability distributions in Bayesian statistics for parameter estimation. This method is inspired by a statistical-mechanical approach to systems governed by dynamics with largely separated time scales and is based on three key concepts: conjugate pairs of variables, dimensionless integration measures with coarse-graining factors and partial maximization of the joint entropy. The method enables one to calculate a prior purely from a likelihood in a simple way. It is shown, in particular, how it not only yields Jeffreys's rules but also reveals new structures hidden behind them.
Optimal nonlinear filtering using the finite-volume method

NASA Astrophysics Data System (ADS)

Fox, Colin; Morrison, Malcolm E. K.; Norton, Richard A.; Molteno, Timothy C. A.

2018-01-01

Optimal sequential inference, or filtering, for the state of a deterministic dynamical system requires simulation of the Frobenius-Perron operator, which can be formulated as the solution of a continuity equation. For low-dimensional, smooth systems, the finite-volume numerical method provides a solution that conserves probability and gives estimates that converge to the optimal continuous-time values, while a Courant-Friedrichs-Lewy-type condition assures that intermediate discretized solutions remain positive density functions. This method is demonstrated in an example of nonlinear filtering for the state of a simple pendulum, with comparison to results using the unscented Kalman filter, and for a case where rank-deficient observations lead to multimodal probability distributions.
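The two properties highlighted above, conservation of probability and positivity under a CFL-type condition, can be seen in the simplest finite-volume scheme. The sketch below is a one-dimensional, constant-velocity, periodic toy problem with first-order upwind fluxes; it is an assumption-laden illustration, not the scheme used for the pendulum example.

```python
def upwind_step(cell_probs, courant):
    """One first-order upwind finite-volume step on a periodic 1-D grid.

    cell_probs holds the probability in each cell; courant is
    velocity * dt / dx for a positive velocity and must lie in [0, 1].
    """
    if not 0.0 <= courant <= 1.0:
        raise ValueError("CFL condition violated: need 0 <= courant <= 1")
    n = len(cell_probs)
    # Each cell keeps a fraction (1 - courant) and receives a fraction
    # courant from its upwind neighbour (index -1 wraps around).
    return [(1.0 - courant) * cell_probs[i] + courant * cell_probs[i - 1]
            for i in range(n)]


probs = [0.0] * 20
probs[5] = 1.0  # all probability starts in one cell
for _ in range(50):
    probs = upwind_step(probs, courant=0.5)
```

Because every update is a convex combination of neighbouring cells, the total stays at 1 and no cell goes negative as long as the Courant number stays in [0, 1].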
Performance of toxicity probability interval based designs in contrast to the continual reassessment method

PubMed Central

Horton, Bethany Jablonski; Wages, Nolan A.; Conaway, Mark R.

2016-01-01

Toxicity probability interval designs have received increasing attention as a dose-finding method in recent years. In this study, we compared the two-stage, likelihood-based continual reassessment method (CRM), modified toxicity probability interval (mTPI), and the Bayesian optimal interval design (BOIN) in order to evaluate each method's performance in dose selection for Phase I trials. We use several summary measures to compare the performance of these methods, including percentage of correct selection (PCS) of the true maximum tolerable dose (MTD), allocation of patients to doses at and around the true MTD, and an accuracy index. This index is an efficiency measure that describes the entire distribution of MTD selection and patient allocation by taking into account the distance between the true probability of toxicity at each dose level and the target toxicity rate. The simulation study considered a broad range of toxicity curves and various sample sizes. When considering PCS, we found that CRM outperformed the two competing methods in most scenarios, followed by BOIN, then mTPI. We observed a similar trend when considering the accuracy index for dose allocation, where CRM most often outperformed both the mTPI and BOIN. These trends were more pronounced with increasing number of dose levels. PMID:27435150
Target Tracking Using SePDAF under Ambiguous Angles for Distributed Array Radar

PubMed Central

Long, Teng; Zhang, Honggang; Zeng, Tao; Chen, Xinliang; Liu, Quanhua; Zheng, Le

2016-01-01

Distributed array radar can improve radar detection capability and measurement accuracy. However, it suffers from cyclic ambiguity in its angle estimates, according to the spatial Nyquist sampling theorem, since the large sparse array is undersampled. Consequently, the state estimation accuracy and track validity probability degrade when the ambiguous angles are directly used for target tracking. This paper proposes a second probability data association filter (SePDAF)-based tracking method for distributed array radar. Firstly, the target motion model and radar measurement model are built. Secondly, the fusion result of each radar's estimation is fed to the extended Kalman filter (EKF) to complete the first filtering. Thirdly, taking this result as prior knowledge and associating it with the array-processed ambiguous angles, the SePDAF is applied to accomplish the second filtering, thereby achieving a highly accurate and stable trajectory with relatively low computational complexity. Moreover, the azimuth filtering accuracy is improved dramatically and the position filtering accuracy also improves. Finally, simulations illustrate the effectiveness of the proposed method. PMID:27618058
On the use of Bayesian Monte-Carlo in evaluation of nuclear data

NASA Astrophysics Data System (ADS)

De Saint Jean, Cyrille; Archier, Pascal; Privas, Edwin; Noguere, Gilles

2017-09-01

As model parameters, necessary ingredients of theoretical models, are not always predicted by theory, a formal mathematical framework associated with the evaluation work is needed to obtain the best set of parameters (resonance parameters, optical models, fission barrier, average width, multigroup cross sections) with Bayesian statistical inference by comparing theory to experiment. The formal rule related to this methodology is to estimate the posterior probability density function of a set of parameters by solving an equation of the following type: pdf(posterior) ∝ pdf(prior) × likelihood function. A fitting procedure can be seen as an estimation of the posterior probability density of a set of parameters (referred to as the vector x) given prior information on these parameters and a likelihood, which gives the probability density function of observing a data set given x. To solve this problem, two major paths can be taken: add approximations and hypotheses and obtain an equation to be solved numerically (minimum of a cost function or generalized least squares method, referred to as GLS), or use Monte Carlo sampling of all prior distributions and estimate the final posterior distribution. Monte Carlo methods are a natural solution for Bayesian inference problems. They avoid approximations (present in traditional adjustment procedures based on chi-square minimization) and offer alternatives in the choice of probability density distribution for priors and likelihoods. This paper proposes the use of what we call Bayesian Monte Carlo (referred to as BMC in the rest of the manuscript) over the whole energy range, from the thermal and resonance ranges to the continuum, for all nuclear reaction models at these energies. Algorithms based on Monte Carlo sampling and Markov chains will be presented. 
The objectives of BMC are to propose a reference calculation for validating the GLS calculations and approximations, to test probability density distributions effects and to provide the framework of finding global minimum if several local minimums exist. Application to resolved resonance, unresolved resonance and continuum evaluation as well as multigroup cross section data assimilation will be presented.
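The rule pdf(posterior) ∝ pdf(prior) × likelihood stated in this abstract can be illustrated with a minimal sketch: draw samples from the prior and weight each one by its likelihood. The scalar parameter, the Gaussian prior, the Gaussian measurement model, and all function names below are assumptions chosen for illustration; they are not taken from the paper, whose models and algorithms are far richer.

```python
import math
import random


def bayesian_monte_carlo(data, prior_mean, prior_sd, noise_sd,
                         n_samples=20000, seed=0):
    """Estimate the posterior mean and sd of a scalar parameter x.

    Samples are drawn from a Gaussian prior and weighted by the Gaussian
    likelihood of the data (both distributions are illustrative assumptions).
    """
    rng = random.Random(seed)
    samples = [rng.gauss(prior_mean, prior_sd) for _ in range(n_samples)]
    # Log-likelihood of the whole data set for each prior sample.
    log_w = [-0.5 * sum((y - x) ** 2 for y in data) / noise_sd ** 2
             for x in samples]
    shift = max(log_w)  # subtract the maximum to avoid underflow in exp()
    weights = [math.exp(lw - shift) for lw in log_w]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, samples)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, samples)) / total
    return mean, math.sqrt(var)


def conjugate_posterior(data, prior_mean, prior_sd, noise_sd):
    """Closed-form posterior for the same Gaussian toy problem (reference)."""
    precision = 1.0 / prior_sd ** 2 + len(data) / noise_sd ** 2
    mean = (prior_mean / prior_sd ** 2 + sum(data) / noise_sd ** 2) / precision
    return mean, math.sqrt(1.0 / precision)
```

In this toy setting the weighted-sample estimate can be checked against the closed-form result, which mirrors the abstract's stated aim of using sampling as a reference for approximate calculations.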
Site-to-Source Finite Fault Distance Probability Distribution in Probabilistic Seismic Hazard and the Relationship Between Minimum Distances

NASA Astrophysics Data System (ADS)

Ortega, R.; Gutierrez, E.; Carciumaru, D. D.; Huesca-Perez, E.

2017-12-01

We present a method to compute the conditional and unconditional probability density function (PDF) of the finite fault distance distribution (FFDD). Two cases are described: lines and areas. The case of lines has a simple analytical solution while, in the case of areas, the geometrical probability of a fault based on the strike, dip, and fault segment vertices is obtained using the projection of spheres in a piecewise rectangular surface. The cumulative distribution is computed by measuring the projection of a sphere of radius r in an effective area using an algorithm that estimates the area of a circle within a rectangle. In addition, we introduce the finite fault distance metrics. This distance is the distance where the maximum stress release occurs within the fault plane and generates a peak ground motion. Later, we can apply the appropriate ground motion prediction equations (GMPE) for PSHA. The conditional probability of distance given magnitude is also presented using different scaling laws. A simple model of constant distribution of the centroid at the geometrical mean is discussed; in this model, hazard is reduced at the edges because the effective size is reduced. There is currently a trend toward using extended source distances in PSHA; however, it is not possible to separate the fault geometry from the GMPE. With this new approach, it is possible to add fault rupture models separating geometrical and propagation effects.
Serial Spike Time Correlations Affect Probability Distribution of Joint Spike Events

PubMed Central

Shahi, Mina; van Vreeswijk, Carl; Pipa, Gordon

2016-01-01

Detecting the existence of temporally coordinated spiking activity, and its role in information processing in the cortex, has remained a major challenge for neuroscience research. Different methods and approaches have been suggested to test whether the observed synchronized events are significantly different from those expected by chance. To analyze the simultaneous spike trains for precise spike correlation, these methods typically model the spike trains as a Poisson process implying that the generation of each spike is independent of all the other spikes. However, studies have shown that neural spike trains exhibit dependence among spike sequences, such as the absolute and relative refractory periods which govern the spike probability of the oncoming action potential based on the time of the last spike, or the bursting behavior, which is characterized by short epochs of rapid action potentials, followed by longer episodes of silence. Here we investigate non-renewal processes with the inter-spike interval distribution model that incorporates spike-history dependence of individual neurons. For that, we use the Monte Carlo method to estimate the full shape of the coincidence count distribution and to generate false positives for coincidence detection. The results show that compared to the distributions based on homogeneous Poisson processes, and also non-Poisson processes, the width of the distribution of joint spike events changes. Non-renewal processes can lead to both heavy tailed or narrow coincidence distribution. We conclude that small differences in the exact autostructure of the point process can cause large differences in the width of a coincidence distribution. Therefore, manipulations of the autostructure for the estimation of significance of joint spike events seem to be inadequate. PMID:28066225
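As a minimal sketch of the Monte Carlo idea in this abstract, the code below estimates the coincidence count distribution for the simplest baseline only: two independent homogeneous Poisson spike trains, approximated by independent Bernoulli draws in small time bins. The binning scheme, the parameter names, and the function name are assumptions made for illustration; the non-renewal processes studied in the paper are not modelled here.

```python
import random


def coincidence_counts(rate_hz, duration_s, bin_s, n_trials, seed=0):
    """Monte Carlo samples of the number of coincident spikes.

    Each train is a binned approximation of a homogeneous Poisson process:
    every bin holds a spike with probability rate_hz * bin_s (assumed << 1).
    A coincidence is a bin in which both trains spike.
    """
    rng = random.Random(seed)
    p_spike = rate_hz * bin_s
    n_bins = int(round(duration_s / bin_s))
    counts = []
    for _ in range(n_trials):
        train_a = [rng.random() < p_spike for _ in range(n_bins)]
        train_b = [rng.random() < p_spike for _ in range(n_bins)]
        counts.append(sum(a and b for a, b in zip(train_a, train_b)))
    return counts
```

The spread of the returned counts gives the chance-level reference distribution; under this independence assumption its mean should be close to n_bins × p_spike².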
Towards a theoretical determination of the geographical probability distribution of meteoroid impacts on Earth

NASA Astrophysics Data System (ADS)

Zuluaga, Jorge I.; Sucerquia, Mario

2018-06-01

Tunguska and Chelyabinsk impact events occurred inside a geographical area of only 3.4 per cent of the Earth's surface. Although two events hardly constitute a statistically significant demonstration of a geographical pattern of impacts, their spatial coincidence is at least tantalizing. To understand if this concurrence reflects an underlying geographical and/or temporal pattern, we must aim at predicting the spatio-temporal distribution of meteoroid impacts on Earth. For this purpose we designed, implemented, and tested a novel numerical technique, 'Gravitational Ray Tracing' (GRT), designed to compute the relative impact probability (RIP) on the surface of any planet. GRT is inspired by the so-called ray-casting techniques used to render realistic images of complex 3D scenes. In this paper we describe the method and the results of testing it at the time of large impact events. Our findings suggest a non-trivial pattern of impact probabilities at any given time on the Earth. Locations at 60-90° from the apex are more prone to impacts, especially at midnight. Counterintuitively, sites close to apex direction have the lowest RIP, while in the antapex RIP are slightly larger than average. We present here preliminary maps of RIP at the time of Tunguska and Chelyabinsk events and found no evidence of a spatial or temporal pattern, suggesting that their coincidence was fortuitous. We apply the GRT method to compute theoretical RIP at the location and time of 394 large fireballs. Although the predicted spatio-temporal impact distribution matches marginally the observed events, we successfully predict their impact speed distribution.
Numerical methods in Markov chain modeling

NASA Technical Reports Server (NTRS)

Philippe, Bernard; Saad, Youcef; Stewart, William J.

1989-01-01

Several methods for computing stationary probability distributions of Markov chains are described and compared. The main linear algebra problem consists of computing an eigenvector of a sparse, usually nonsymmetric, matrix associated with a known eigenvalue. It can also be cast as a problem of solving a homogeneous singular linear system. Several methods based on combinations of Krylov subspace techniques are presented. The performance of these methods on some realistic problems is compared.
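The eigenvector problem described here (find π with πP = π for the known eigenvalue 1) can be sketched with plain power iteration on a small dense matrix. This is a deliberately simpler technique than the Krylov subspace methods the report compares, and the function name and the list-of-rows matrix layout are illustrative assumptions.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10000):
    """Approximate the stationary distribution of a row-stochastic matrix P.

    Repeatedly applies pi <- pi P starting from the uniform distribution.
    Convergence is assumed (e.g. an irreducible, aperiodic chain).
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new_pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new_pi, pi)) < tol:
            return new_pi
        pi = new_pi
    return pi
```

For the two-state chain with rows (0.9, 0.1) and (0.5, 0.5), the balance condition 0.1·π₀ = 0.5·π₁ gives π = (5/6, 1/6), which the iteration reproduces.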
A comparison of Probability Of Detection (POD) data determined using different statistical methods

NASA Astrophysics Data System (ADS)

Fahr, A.; Forsyth, D.; Bullock, M.

1993-12-01

Different statistical methods have been suggested for determining probability of detection (POD) data for nondestructive inspection (NDI) techniques. A comparative assessment of various methods of determining POD was conducted using results of three NDI methods obtained by inspecting actual aircraft engine compressor disks which contained service induced cracks. The study found that the POD and 95 percent confidence curves as a function of crack size as well as the 90/95 percent crack length vary depending on the statistical method used and the type of data. The distribution function as well as the parameter estimation procedure used for determining POD and the confidence bound must be included when referencing information such as the 90/95 percent crack length. The POD curves and confidence bounds determined using the range interval method are very dependent on information that is not from the inspection data. The maximum likelihood estimators (MLE) method does not require such information and the POD results are more reasonable. The log-logistic function appears to model POD of hit/miss data relatively well and is easy to implement. The log-normal distribution using MLE provides more realistic POD results and is the preferred method. Although it is more complicated and slower to calculate, it can be implemented on a common spreadsheet program.
Carotid artery intima-media thickness measurement in children with normal and increased body mass index: a comparison of three techniques.

PubMed

El Jalbout, Ramy; Cloutier, Guy; Cardinal, Marie-Hélène Roy; Henderson, Mélanie; Lapierre, Chantale; Soulez, Gilles; Dubois, Josée

2018-05-09

Common carotid artery intima-media thickness is a marker of subclinical atherosclerosis. In children, increased intima-media thickness is associated with obesity and the risk of cardiovascular events in adulthood. To compare intima-media thickness measurements using B-mode ultrasound, radiofrequency (RF) echo tracking, and RF speckle probability distribution in children with normal and increased body mass index (BMI). We prospectively measured intima-media thickness in 120 children randomly selected from two groups of a longitudinal cohort: normal BMI and increased BMI, defined by BMI ≥85th percentile for age and gender. We followed Mannheim recommendations. We used M'Ath-Std for automated B-mode imaging, M-line processing of RF signal amplitude for RF echo tracking, and RF signal segmentation and averaging using probability distributions defining image speckle. Statistical analysis included Wilcoxon and Mann-Whitney tests, and Pearson correlation coefficient and intra-class correlation coefficient (ICC). Children were 10-13 years old (mean: 11.7 years); 61% were boys. The mean age was 11.4 years (range: 10.0-13.1 years) for the normal BMI group and 12.0 years (range: 10.1-13.5 years) for the increased BMI group. The normal BMI group included 58% boys and the increased BMI group 63% boys. RF echo tracking method was successful in 79 children as opposed to 114 for the B-mode method and all 120 for the probability distribution method. Techniques were weakly correlated: ICC=0.34 (95% confidence interval [CI]: 0.27-0.39). Intima-media thickness was significantly higher in the increased BMI than normal BMI group using the RF techniques and borderline for the B-mode technique. Mean differences between weight groups were: B-mode, 0.02 mm (95% CI: 0.00 to 0.04), P=0.05; RF echo tracking, 0.03 mm (95% CI: 0.01 to 0.05), P=0.01; and RF speckle probability distribution, 0.03 mm (95% CI: 0.01 to 0.05), P=0.002. 
Though techniques are not interchangeable, all showed increased intima-media thickness in children with increased BMI. RF echo tracking method had the lowest success rate at calculating intima-media thickness. For patient follow-up and cohort comparisons, the same technique should be used throughout.
The utility of Bayesian predictive probabilities for interim monitoring of clinical trials

PubMed Central

Connor, Jason T.; Ayers, Gregory D; Alvarez, JoAnn

2014-01-01

Background: Bayesian predictive probabilities can be used for interim monitoring of clinical trials to estimate the probability of observing a statistically significant treatment effect if the trial were to continue to its predefined maximum sample size. Purpose: We explore settings in which Bayesian predictive probabilities are advantageous for interim monitoring compared to Bayesian posterior probabilities, p-values, conditional power, or group sequential methods. Results: For interim analyses that address prediction hypotheses, such as futility monitoring and efficacy monitoring with lagged outcomes, only predictive probabilities properly account for the amount of data remaining to be observed in a clinical trial and have the flexibility to incorporate additional information via auxiliary variables. Limitations: Computational burdens limit the feasibility of predictive probabilities in many clinical trial settings. The specification of prior distributions brings additional challenges for regulatory approval. Conclusions: The use of Bayesian predictive probabilities enables the choice of logical interim stopping rules that closely align with the clinical decision making process. PMID:24872363
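A minimal sketch of a predictive probability, under assumptions not taken from the paper: a single-arm trial with a binary endpoint, a Beta prior on the response rate, and trial "success" defined as reaching at least a hypothetical critical number of responders (r_needed) at the maximum sample size. The remaining responses then follow a beta-binomial distribution, and the predictive probability is its upper tail.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(y, m, alpha, beta):
    """P(y responders among m future patients) under a Beta(alpha, beta) posterior."""
    log_p = (math.log(math.comb(m, y))
             + log_beta(y + alpha, m - y + beta)
             - log_beta(alpha, beta))
    return math.exp(log_p)


def predictive_probability(x, n, n_max, r_needed, prior_a=1.0, prior_b=1.0):
    """Probability of ending with at least r_needed responders out of n_max,
    given x responders among the first n patients (illustrative criterion)."""
    m = n_max - n                          # patients still to be observed
    alpha, beta = prior_a + x, prior_b + n - x
    y_min = max(0, r_needed - x)           # future responders still required
    return sum(beta_binomial_pmf(y, m, alpha, beta)
               for y in range(y_min, m + 1))
```

A futility rule could, for example, stop the trial when this value falls below a small cut-off; the cut-off itself would be a design choice, not something fixed by the method.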
Probabilistic modelling of overflow, surcharge and flooding in urban drainage using the first-order reliability method and parameterization of local rain series.

PubMed

Thorndahl, S; Willems, P

2008-01-01

Failure of urban drainage systems may occur due to surcharge or flooding at specific manholes in the system, or due to overflows from combined sewer systems to receiving waters. To quantify the probability or return period of failure, standard approaches make use of the simulation of design storms or long historical rainfall series in a hydrodynamic model of the urban drainage system. In this paper, an alternative probabilistic method is investigated: the first-order reliability method (FORM). To apply this method, a long rainfall time series was divided into rainstorms (rain events), and each rainstorm was conceptualized as a synthetic rainfall hyetograph with a Gaussian shape, parameterized by rainstorm depth, duration and peak intensity. Probability distributions were calibrated for these three parameters and used as the basis of the failure probability estimation, together with a hydrodynamic simulation model to determine the failure conditions for each set of parameters. The method takes into account the uncertainties involved in the rainstorm parameterization. Comparison is made between the failure probability results of the FORM method, the standard method using long-term simulations and alternative methods based on random sampling (Monte Carlo direct sampling and importance sampling). It is concluded that, without crucial influence on the modelling accuracy, FORM is very applicable as an alternative to traditional long-term simulations of urban drainage systems.
Finite element probabilistic risk assessment of transmission line insulation flashovers caused by lightning strokes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bacvarov, D.C.

1981-01-01

A new method for probabilistic risk assessment of transmission line insulation flashovers caused by lightning strokes is presented. The approach of applying the finite element method to probabilistic risk assessment is demonstrated to be very powerful, for two reasons. First, the finite element method is inherently suitable for analysis of three dimensional spaces where the parameters, such as three variate probability densities of the lightning currents, are non-uniformly distributed. Second, the finite element method permits non-uniform discretization of the three dimensional probability spaces, thus yielding high accuracy in critical regions, such as the area of the low probability events, while at the same time maintaining coarse discretization in the non-critical areas to keep the number of grid points and the size of the problem to a manageable low level. The finite element probabilistic risk assessment method presented here is based on a new multidimensional search algorithm. It utilizes an efficient iterative technique for finite element interpolation of the transmission line insulation flashover criteria computed with an electro-magnetic transients program. Compared to other available methods, the new finite element probabilistic risk assessment method is significantly more accurate and approximately two orders of magnitude computationally more efficient. The method is especially suited for accurate assessment of rare, very low probability events.
A review of contemporary methods for the presentation of scientific uncertainty.

PubMed

Makinson, K A; Hamby, D M; Edwards, J A

2012-12-01

Graphic methods for displaying uncertainty are often the most concise and informative way to communicate abstract concepts. Presentation methods currently in use for the display and interpretation of scientific uncertainty are reviewed. Numerous subjective and objective uncertainty display methods are presented, including qualitative assessments, node and arrow diagrams, standard statistical methods, box-and-whisker plots, robustness and opportunity functions, contribution indexes, probability density functions, cumulative distribution functions, and graphical likelihood functions.
Photon Counting Data Analysis: Application of the Maximum Likelihood and Related Methods for the Determination of Lifetimes in Mixtures of Rose Bengal and Rhodamine B

DOE PAGES

Santra, Kalyan; Smith, Emily A.; Petrich, Jacob W.; ...

2016-12-12

It is often convenient to know the minimum amount of data needed in order to obtain a result of desired accuracy and precision. It is a necessity in the case of subdiffraction-limited microscopies, such as stimulated emission depletion (STED) microscopy, owing to the limited sample volumes and the extreme sensitivity of the samples to photobleaching and photodamage. We present a detailed comparison of probability-based techniques (the maximum likelihood method and methods based on the binomial and the Poisson distributions) with residual minimization-based techniques for retrieving the fluorescence decay parameters for various two-fluorophore mixtures, as a function of the total number of photon counts, in time-correlated, single-photon counting experiments. The probability-based techniques proved to be the most robust (insensitive to initial values) in retrieving the target parameters and, in fact, performed equivalently to 2-3 significant figures. This is to be expected, as we demonstrate that the three methods are fundamentally related. Furthermore, methods based on the Poisson and binomial distributions have the desirable feature of providing a bin-by-bin analysis of a single fluorescence decay trace, which thus permits statistics to be acquired using only the one trace for not only the mean and median values of the fluorescence decay parameters but also for the associated standard deviations. Lastly, these probability-based methods lend themselves well to the analysis of the sparse data sets that are encountered in subdiffraction-limited microscopies.
An improved probabilistic approach for linking progenitor and descendant galaxy populations using comoving number density

NASA Astrophysics Data System (ADS)

Wellons, Sarah; Torrey, Paul

2017-06-01

Galaxy populations at different cosmic epochs are often linked by cumulative comoving number density in observational studies. Many theoretical works, however, have shown that the cumulative number densities of tracked galaxy populations not only evolve in bulk, but also spread out over time. We present a method for linking progenitor and descendant galaxy populations which takes both of these effects into account. We define probability distribution functions that capture the evolution and dispersion of galaxy populations in number density space, and use these functions to assign galaxies at redshift zf probabilities of being progenitors/descendants of a galaxy population at another redshift z0. These probabilities are used as weights for calculating distributions of physical progenitor/descendant properties such as stellar mass, star formation rate or velocity dispersion. We demonstrate that this probabilistic method provides more accurate predictions for the evolution of physical properties than the assumption of either a constant number density or an evolving number density in a bin of fixed width by comparing predictions against galaxy populations directly tracked through a cosmological simulation. We find that the constant number density method performs least well at recovering galaxy properties, the evolving method density slightly better and the probabilistic method best of all. The improvement is present for predictions of stellar mass as well as inferred quantities such as star formation rate and velocity dispersion. We demonstrate that this method can also be applied robustly and easily to observational data, and provide a code package for doing so.

Using the weighted area under the net benefit curve for decision curve analysis.

PubMed

Talluri, Rajesh; Shete, Sanjay

2016-07-18

Risk prediction models have been proposed for various diseases and are being improved as new predictors are identified. A major challenge is to determine whether the newly discovered predictors improve risk prediction. Decision curve analysis has been proposed as an alternative to the area under the curve and net reclassification index to evaluate the performance of prediction models in clinical scenarios. The decision curve computed using the net benefit can evaluate the predictive performance of risk models at a given or range of threshold probabilities. However, when the decision curves for 2 competing models cross in the range of interest, it is difficult to identify the best model as there is no readily available summary measure for evaluating the predictive performance. The key deterrent for using simple measures such as the area under the net benefit curve is the assumption that the threshold probabilities are uniformly distributed among patients. We propose a novel measure for performing decision curve analysis. The approach estimates the distribution of threshold probabilities without the need of additional data. Using the estimated distribution of threshold probabilities, the weighted area under the net benefit curve serves as the summary measure to compare risk prediction models in a range of interest. We compared 3 different approaches, the standard method, the area under the net benefit curve, and the weighted area under the net benefit curve. Type 1 error and power comparisons demonstrate that the weighted area under the net benefit curve has higher power compared to the other methods. Several simulation studies are presented to demonstrate the improvement in model comparison using the weighted area under the net benefit curve compared to the standard method. 
The proposed measure improves decision curve analysis by using the weighted area under the curve and thereby improves the power of the decision curve analysis to compare risk prediction models in a clinical scenario.
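The quantities in this abstract can be sketched as follows. The net benefit at threshold probability p_t uses the standard decision curve definition, TP/n − (FP/n) · p_t/(1 − p_t). The weighted summary below is only a discretized illustration: the threshold grid and the weights are supplied by the caller as hypothetical inputs, whereas the paper estimates the threshold distribution from the data by a procedure not reproduced here.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def weighted_net_benefit(y_true, risk, thresholds, weights):
    """Weighted average of net benefit over a grid of thresholds.

    The weights stand in for an assumed distribution of threshold
    probabilities among patients; equal weights recover a plain average.
    """
    total = sum(weights)
    return sum(w * net_benefit(y_true, risk, t)
               for t, w in zip(thresholds, weights)) / total
```

Comparing this summary for two risk models over the same grid and weights gives a single number per model even when their decision curves cross.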
Universal noise and Efimov physics

NASA Astrophysics Data System (ADS)

Nicholson, Amy N.

2016-03-01

Probability distributions for correlation functions of particles interacting via random-valued fields are discussed as a novel tool for determining the spectrum of a theory. In particular, this method is used to determine the energies of universal N-body clusters tied to Efimov trimers, for even N, by investigating the distribution of a correlation function of two particles at unitarity. Using numerical evidence that this distribution is log-normal, an analytical prediction for the N-dependence of the N-body binding energies is made.
Using Patterns of Summed Scores in Paper-and-Pencil Tests and Computer-Adaptive Tests to Detect Misfitting Item Score Patterns

ERIC Educational Resources Information Center

Meijer, Rob R.

2004-01-01

Two new methods have been proposed to determine unexpected sum scores on sub-tests (testlets) both for paper-and-pencil tests and computer adaptive tests. A method based on a conservative bound using the hypergeometric distribution, denoted p, was compared with a method where the probability for each score combination was calculated using a…
Evaluation of a Class of Simple and Effective Uncertainty Methods for Sparse Samples of Random Variables and Functions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Romero, Vicente; Bonney, Matthew; Schroeder, Benjamin

When very few samples of a random quantity are available from a source distribution of unknown shape, it is usually not possible to accurately infer the exact distribution from which the data samples come. Under-estimation of important quantities such as response variance and failure probabilities can result. For many engineering purposes, including design and risk analysis, we attempt to avoid under-estimation with a strategy to conservatively estimate (bound) these types of quantities -- without being overly conservative -- when only a few samples of a random quantity are available from model predictions or replicate experiments. This report examines a class of related sparse-data uncertainty representation and inference approaches that are relatively simple, inexpensive, and effective. Tradeoffs between the methods' conservatism, reliability, and risk versus number of data samples (cost) are quantified with multi-attribute metrics used to assess method performance for conservative estimation of two representative quantities: central 95% of response; and 10^-4 probability of exceeding a response threshold in a tail of the distribution. Each method's performance is characterized with 10,000 random trials on a large number of diverse and challenging distributions. The best method and number of samples to use in a given circumstance depends on the uncertainty quantity to be estimated, the PDF character, and the desired reliability of bounding the true value. On the basis of this large database and study, a strategy is proposed for selecting the method and number of samples for attaining reasonable credibility levels in bounding these types of quantities when sparse samples of random variables or functions are available from experiments or simulations.
Statistical description of non-Gaussian samples in the F2 layer of the ionosphere during heliogeophysical disturbances

NASA Astrophysics Data System (ADS)

Sergeenko, N. P.

2017-11-01

An adequate statistical method should be developed in order to predict probabilistically the range of ionospheric parameters. This problem is solved in this paper. The time series of the critical frequency of the F2 layer, foF2(t), were subjected to statistical processing. For the obtained samples {δfoF2}, statistical distributions and invariants up to the fourth order are calculated. The analysis shows that the distributions differ from the Gaussian law during the disturbances. At levels of sufficiently small probability distributions, there are arbitrarily large deviations from the model of the normal process. Therefore, it is attempted to describe statistical samples {δfoF2} based on the Poisson model. For the studied samples, the exponential characteristic function is selected under the assumption that time series are a superposition of some deterministic and random processes. Using the Fourier transform, the characteristic function is transformed into a nonholomorphic excessive-asymmetric probability-density function. The statistical distributions of the samples {δfoF2} calculated for the disturbed periods are compared with the obtained model distribution function. According to Kolmogorov's criterion, the probabilities of the coincidence of a posteriori distributions with the theoretical ones are P ≈ 0.7-0.9. The conducted analysis makes it possible to draw a conclusion about the applicability of a model based on the Poisson random process for the statistical description and probabilistic variation estimates during heliogeophysical disturbances of the variations {δfoF2}.
Applications of physics to economics and finance: Money, income, wealth, and the stock market

NASA Astrophysics Data System (ADS)

Dragulescu, Adrian Antoniu

Several problems arising in Economics and Finance are analyzed using concepts and quantitative methods from Physics. The dissertation is organized as follows: In the first chapter it is argued that in a closed economic system, money is conserved. Thus, by analogy with energy, the equilibrium probability distribution of money must follow the exponential Boltzmann-Gibbs law characterized by an effective temperature equal to the average amount of money per economic agent. The emergence of Boltzmann-Gibbs distribution is demonstrated through computer simulations of economic models. A thermal machine which extracts a monetary profit can be constructed between two economic systems with different temperatures. The role of debt and models with broken time-reversal symmetry for which the Boltzmann-Gibbs law does not hold, are discussed. In the second chapter, using data from several sources, it is found that the distribution of income is described for the great majority of population by an exponential distribution, whereas the high-end tail follows a power law. From the individual income distribution, the probability distribution of income for families with two earners is derived and it is shown that it also agrees well with the data. Data on wealth is presented and it is found that the distribution of wealth has a structure similar to the distribution of income. The Lorenz curve and Gini coefficient were calculated and are shown to be in good agreement with both income and wealth data sets. In the third chapter, the stock-market fluctuations at different time scales are investigated. A model where stock-price dynamics is governed by a geometrical (multiplicative) Brownian motion with stochastic variance is proposed. The corresponding Fokker-Planck equation can be solved exactly. Integrating out the variance, an analytic formula for the time-dependent probability distribution of stock price changes (returns) is found. 
The formula is in excellent agreement with the Dow-Jones index for the time lags from 1 to 250 trading days. For time lags longer than the relaxation time of variance, the probability distribution can be expressed in a scaling form using a Bessel function. The Dow-Jones data follow the scaling function for seven orders of magnitude.
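The conservation argument in the first chapter can be illustrated with a small simulation. The following is a minimal sketch, not the dissertation's own code: it assumes one particular exchange rule (two randomly chosen agents pool their money and split it at a uniformly random fraction), and the function names and parameter values are hypothetical. Under that rule total money is conserved, and the stationary distribution is expected to be close to exponential, for which the fraction of agents holding less than the mean is about 1 - 1/e ≈ 0.63.

```python
import random


def simulate_money_exchange(n_agents=1000, initial_money=100.0,
                            n_exchanges=200_000, seed=0):
    """Random pairwise exchange in a closed economy (assumed toy rule)."""
    rng = random.Random(seed)
    money = [initial_money] * n_agents
    for _ in range(n_exchanges):
        # Pick two distinct agents and pool their money.
        i, j = rng.sample(range(n_agents), 2)
        pool = money[i] + money[j]
        # Split the pool at a uniformly random fraction; the total is conserved.
        money[i] = rng.random() * pool
        money[j] = pool - money[i]
    return money


def fraction_below_mean(money):
    """Share of agents holding less than the average amount of money."""
    mean = sum(money) / len(money)
    return sum(m < mean for m in money) / len(money)


if __name__ == "__main__":
    final = simulate_money_exchange()
    print("total money:", round(sum(final), 6))
    print("fraction below mean:", fraction_below_mean(final))
```

The average money per agent plays the role of the effective temperature here, so it stays fixed at `initial_money` throughout the run.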
A Mathematical Modelling Approach to One-Day Cricket Batting Orders

PubMed Central

Bukiet, Bruce; Ovens, Matthews

2006-01-01

While scoring strategies and player performance in cricket have been studied, there has been little published work about the influence of batting order with respect to One-Day cricket. We apply a mathematical modelling approach to compute efficiently the expected performance (runs distribution) of a cricket batting order in an innings. Among other applications, our method enables one to solve for the probability of one team beating another or to find the optimal batting order for a set of 11 players. The influence of defence and bowling ability can be taken into account in a straightforward manner. In this presentation, we outline how we develop our Markov Chain approach to studying the progress of runs for a batting order of non-identical players along the lines of work in baseball modelling by Bukiet et al., 1997. We describe the issues that arise in applying such methods to cricket, discuss ideas for addressing these difficulties and note limitations on modelling batting order for One-Day cricket. By performing our analysis on a selected subset of the possible batting orders, we apply the model to quantify the influence of batting order in a game of One-Day cricket using available real-world data for current players. Key Points: Batting order does affect the expected runs distribution in one-day cricket. One-day cricket has fewer data points than baseball, thus extreme values have greater effect on estimated probabilities. Dismissals are rare and their probabilities very small by comparison to baseball. The probability distribution for lower order batsmen is potentially skewed due to increased risk taking. Full enumeration of all possible line-ups is impractical using a single average computer. PMID:24357943
A mathematical modelling approach to one-day cricket batting orders.

PubMed

Bukiet, Bruce; Ovens, Matthews

2006-01-01

While scoring strategies and player performance in cricket have been studied, there has been little published work about the influence of batting order with respect to One-Day cricket. We apply a mathematical modelling approach to compute efficiently the expected performance (runs distribution) of a cricket batting order in an innings. Among other applications, our method enables one to solve for the probability of one team beating another or to find the optimal batting order for a set of 11 players. The influence of defence and bowling ability can be taken into account in a straightforward manner. In this presentation, we outline how we develop our Markov Chain approach to studying the progress of runs for a batting order of non-identical players along the lines of work in baseball modelling by Bukiet et al., 1997. We describe the issues that arise in applying such methods to cricket, discuss ideas for addressing these difficulties and note limitations on modelling batting order for One-Day cricket. By performing our analysis on a selected subset of the possible batting orders, we apply the model to quantify the influence of batting order in a game of One-Day cricket using available real-world data for current players. Key Points: Batting order does affect the expected runs distribution in one-day cricket. One-day cricket has fewer data points than baseball, thus extreme values have greater effect on estimated probabilities. Dismissals are rare and their probabilities very small by comparison to baseball. The probability distribution for lower order batsmen is potentially skewed due to increased risk taking. Full enumeration of all possible line-ups is impractical using a single average computer.
Exceedance probability map: a tool helping the definition of arsenic Natural Background Level (NBL) within the Drainage Basin to the Venice Lagoon (NE Italy)

NASA Astrophysics Data System (ADS)

Dalla Libera, Nico; Fabbri, Paolo; Mason, Leonardo; Piccinini, Leonardo; Pola, Marco

2017-04-01

Arsenic contamination affects shallow groundwater bodies worldwide. Current knowledge of the origin of arsenic in groundwater indicates that most dissolved arsenic occurs naturally through the dissolution of As-bearing minerals and ores. Several studies on the shallow aquifers of both the regional Venetian Plain (NE Italy) and the local Drainage Basin to the Venice Lagoon (DBVL) show locally high arsenic concentrations related to peculiar geochemical conditions that drive arsenic mobilization. The uncertainty in the spatial distribution of arsenic complicates both the evaluation of the processes involved in arsenic mobilization and stakeholders' decisions about environmental management. Considering the latter aspect, the present study treats the problem of defining the Natural Background Level (NBL) as the threshold discriminating natural contamination from anthropogenic pollution. The EU's Directive 2006/118/EC suggests procedures and criteria for setting water quality standards that guarantee a healthy status and reverse any contamination trends. In addition, the EU's BRIDGE project proposes criteria, based on the 90th percentile of the contaminant concentration dataset, to estimate the NBL. Nevertheless, these methods provide only a statistical NBL for the whole area without considering the spatial variation of the contaminant's concentration. We therefore reinforce the NBL concept using a geostatistical approach, which gives detailed information about the distribution of arsenic concentrations and reveals zones with concentrations that are high relative to the Italian drinking water standard (IDWS = 10 µg/liter). Once the spatial information about the arsenic distribution is obtained, the 90th percentile method can be applied to estimate a local NBL for every zone with arsenic higher than the IDWS. 
The indicator kriging method was considered because it estimates the spatial distribution of the exceedance probabilities with respect to pre-defined thresholds. This approach is widely reported in the literature for similar environmental problems. To test the validity of the procedure, we used the dataset from the "A.Li.Na" project (funded by the Regional Environmental Agency), which defined regional NBLs of As, Fe, Mn and NH4+ in the DBVL's groundwater. First, we defined two thresholds corresponding respectively to the IDWS and the median of the data above the IDWS. These values were chosen on the basis of the dataset's statistical structure and the quality criteria of the GWD 2006/118/EC. Subsequently, we evaluated the spatial distribution of the probability of exceeding the defined thresholds using indicator kriging. The results highlight different zones with high exceedance probability, ranging from 75% to 95%, with respect to both the IDWS and the median value. Considering the geological setting of the DBVL, these probability values correspond with the occurrence of both organic matter and reducing conditions. In conclusion, the spatial prediction of the exceedance probability could be useful for defining the areas in which to estimate the local NBLs, enhancing the procedure of NBL definition. In that way, the NBL estimation could be more realistic because it considers the spatial distribution of the studied contaminant, distinguishing areas with high natural concentrations from polluted ones.
A Method to Estimate the Probability that any Individual Cloud-to-Ground Lightning Stroke was Within any Radius of any Point

NASA Technical Reports Server (NTRS)

Huddleston, Lisa L.; Roeder, William P.; Merceret, Francis J.

2011-01-01

A new technique has been developed to estimate the probability that a nearby cloud-to-ground lightning stroke was within a specified radius of any point of interest. This process uses the bivariate Gaussian distribution of probability density provided by the current lightning location error ellipse for the most likely location of a lightning stroke and integrates it to determine the probability that the stroke is inside any specified radius of any location, even if that location is not centered on or even within the location error ellipse. This technique is adapted from a method of calculating the probability of debris collision with spacecraft. Such a technique is important in spaceport processing activities because it allows engineers to quantify the risk of induced current damage to critical electronics due to nearby lightning strokes. This technique was tested extensively and is now in use by space launch organizations at Kennedy Space Center and Cape Canaveral Air Force Station. Future applications could include forensic meteorology.
A Method to Estimate the Probability that Any Individual Cloud-to-Ground Lightning Stroke was Within Any Radius of Any Point

NASA Technical Reports Server (NTRS)

Huddleston, Lisa; Roeder, William P.; Merceret, Francis J.

2011-01-01

A new technique has been developed to estimate the probability that a nearby cloud-to-ground lightning stroke was within a specified radius of any point of interest. This process uses the bivariate Gaussian distribution of probability density provided by the current lightning location error ellipse for the most likely location of a lightning stroke and integrates it to determine the probability that the stroke is inside any specified radius of any location, even if that location is not centered on or even within the location error ellipse. This technique is adapted from a method of calculating the probability of debris collision with spacecraft. Such a technique is important in spaceport processing activities because it allows engineers to quantify the risk of induced current damage to critical electronics due to nearby lightning strokes. This technique was tested extensively and is now in use by space launch organizations at Kennedy Space Center and Cape Canaveral Air Force Station. Future applications could include forensic meteorology.
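The integration step described in these two records can be sketched numerically. The example below is an illustration only and is not the authors' operational code: it assumes the error ellipse is supplied as a mean location, two standard deviations and a correlation coefficient (hypothetical parameter names), and it integrates the bivariate Gaussian density over a disc with a simple midpoint rule in polar coordinates.

```python
import math


def prob_within_radius(mx, my, sx, sy, rho, px, py, radius,
                       n_r=200, n_theta=360):
    """Probability that a stroke lies within `radius` of the point (px, py).

    The stroke location is modelled as bivariate Gaussian with mean (mx, my),
    standard deviations sx and sy, and correlation rho (assumed inputs).
    """
    norm = 1.0 / (2.0 * math.pi * sx * sy * math.sqrt(1.0 - rho * rho))
    dr = radius / n_r
    dth = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr                       # midpoint of the radial cell
        for j in range(n_theta):
            th = (j + 0.5) * dth                 # midpoint of the angular cell
            zx = (px + r * math.cos(th) - mx) / sx
            zy = (py + r * math.sin(th) - my) / sy
            q = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (1.0 - rho * rho)
            total += norm * math.exp(-0.5 * q) * r * dr * dth  # r dr dtheta
    return total


if __name__ == "__main__":
    # Point of interest offset from the most likely stroke location.
    print(prob_within_radius(0.0, 0.0, 0.5, 0.3, 0.2, 0.4, 0.1, 1.0))
```

For a circular Gaussian centred on the point of interest the result can be checked against the closed form 1 - exp(-R²/(2σ²)); the disc does not need to be centred on, or even overlap, the error ellipse.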
Multi-scale occupancy estimation and modelling using multiple detection methods

USGS Publications Warehouse

Nichols, James D.; Bailey, Larissa L.; O'Connell, Allan F.; Talancy, Neil W.; Grant, Evan H. Campbell; Gilbert, Andrew T.; Annand, Elizabeth M.; Husband, Thomas P.; Hines, James E.

2008-01-01

Occupancy estimation and modelling based on detection–nondetection data provide an effective way of exploring change in a species’ distribution across time and space in cases where the species is not always detected with certainty. Today, many monitoring programmes target multiple species, or life stages within a species, requiring the use of multiple detection methods. When multiple methods or devices are used at the same sample sites, animals can be detected by more than one method. We develop occupancy models for multiple detection methods that permit simultaneous use of data from all methods for inference about method-specific detection probabilities. Moreover, the approach permits estimation of occupancy at two spatial scales: the larger scale corresponds to species’ use of a sample unit, whereas the smaller scale corresponds to presence of the species at the local sample station or site. We apply the models to data collected on two different vertebrate species: striped skunks Mephitis mephitis and red salamanders Pseudotriton ruber. For striped skunks, large-scale occupancy estimates were consistent between two sampling seasons. Small-scale occupancy probabilities were slightly lower in the late winter/spring when skunks tend to conserve energy, and movements are limited to males in search of females for breeding. There was strong evidence of method-specific detection probabilities for skunks. As anticipated, large- and small-scale occupancy areas completely overlapped for red salamanders. The analyses provided weak evidence of method-specific detection probabilities for this species. Synthesis and applications. Increasingly, many studies are utilizing multiple detection methods at sampling locations. The modelling approach presented here makes efficient use of detections from multiple methods to estimate occupancy probabilities at two spatial scales and to compare detection probabilities associated with different detection methods. 
The models can be viewed as another variation of Pollock's robust design and may be applicable to a wide variety of scenarios where species occur in an area but are not always near the sampled locations. The estimation approach is likely to be especially useful in multispecies conservation programmes by providing efficient estimates using multiple detection devices and by providing device-specific detection probability estimates for use in survey design.
Nonadditive entropies yield probability distributions with biases not warranted by the data.

PubMed

Pressé, Steve; Ghosh, Kingshuk; Lee, Julian; Dill, Ken A

2013-11-01

Different quantities that go by the name of entropy are used in variational principles to infer probability distributions from limited data. Shore and Johnson showed that maximizing the Boltzmann-Gibbs form of the entropy ensures that probability distributions inferred satisfy the multiplication rule of probability for independent events in the absence of data coupling such events. Other types of entropies that violate the Shore and Johnson axioms, including nonadditive entropies such as the Tsallis entropy, violate this basic consistency requirement. Here we use the axiomatic framework of Shore and Johnson to show how such nonadditive entropy functions generate biases in probability distributions that are not warranted by the underlying data.
Estimated Accuracy of Three Common Trajectory Statistical Methods

NASA Technical Reports Server (NTRS)

Kabashnikov, Vitaliy P.; Chaikovsky, Anatoli P.; Kucsera, Tom L.; Metelskaya, Natalia S.

2011-01-01

Three well-known trajectory statistical methods (TSMs), namely concentration field (CF), concentration weighted trajectory (CWT), and potential source contribution function (PSCF) methods were tested using known sources and artificially generated data sets to determine the ability of TSMs to reproduce spatial distribution of the sources. In the works by other authors, the accuracy of the trajectory statistical methods was estimated for particular species and at specified receptor locations. We have obtained a more general statistical estimation of the accuracy of source reconstruction and have found optimum conditions to reconstruct source distributions of atmospheric trace substances. Only virtual pollutants of the primary type were considered. In real world experiments, TSMs are intended for application to a priori unknown sources. Therefore, the accuracy of TSMs has to be tested with all possible spatial distributions of sources. An ensemble of geographical distributions of virtual sources was generated. Spearman's rank-order correlation coefficient between spatial distributions of the known virtual and the reconstructed sources was taken to be a quantitative measure of the accuracy. Statistical estimates of the mean correlation coefficient and a range of the most probable values of correlation coefficients were obtained. All the TSMs that were considered here showed similar close results. The maximum of the ratio of the mean correlation to the width of the correlation interval containing the most probable correlation values determines the optimum conditions for reconstruction. An optimal geographical domain roughly coincides with the area supplying most of the substance to the receptor. The optimal domain's size is dependent on the substance decay time. Under optimum reconstruction conditions, the mean correlation coefficients can reach 0.70-0.75. 
The boundaries of the interval with the most probable correlation values are 0.6-0.9 for the decay time of 240 h and 0.5-0.95 for the decay time of 12 h. The best results of source reconstruction can be expected for the trace substances with a decay time on the order of several days. Although the methods considered in this paper do not guarantee high accuracy, they are computationally simple and fast. Using the TSMs in optimum conditions and taking into account the range of uncertainties, one can obtain a first hint on potential source areas.
Characterizing the distribution of an endangered salmonid using environmental DNA analysis

USGS Publications Warehouse

Laramie, Matthew B.; Pilliod, David S.; Goldberg, Caren S.

2015-01-01

Determining species distributions accurately is crucial to developing conservation and management strategies for imperiled species, but a challenging task for small populations. We evaluated the efficacy of environmental DNA (eDNA) analysis for improving detection and thus potentially refining the known distribution of Chinook salmon (Oncorhynchus tshawytscha) in the Methow and Okanogan Subbasins of the Upper Columbia River, which span the border between Washington, USA and British Columbia, Canada. We developed an assay to target a 90 base pair sequence of Chinook DNA and used quantitative polymerase chain reaction (qPCR) to quantify the amount of Chinook eDNA in triplicate 1-L water samples collected at 48 stream locations in June and again in August 2012. The overall probability of detecting Chinook with our eDNA method in areas within the known distribution was 0.77 (±0.05 SE). Detection probability was lower in June (0.62, ±0.08 SE) during high flows and at the beginning of spring Chinook migration than during base flows in August (0.93, ±0.04 SE). In the Methow subbasin, mean eDNA concentration was higher in August compared to June, especially in smaller tributaries, probably resulting from the arrival of spring Chinook adults, reduced discharge, or both. Chinook eDNA concentrations did not appear to change in the Okanogan subbasin from June to August. Contrary to our expectations about downstream eDNA accumulation, Chinook eDNA did not decrease in concentration in upstream reaches (0–120 km). Further examination of factors influencing spatial distribution of eDNA in lotic systems may allow for greater inference of local population densities along stream networks or watersheds. These results demonstrate the potential effectiveness of eDNA detection methods for determining landscape-level distribution of anadromous salmonids in large river systems.
A distribution method for analysing the baseline of pulsatile endocrine signals as exemplified by 24-hour growth hormone profiles.

PubMed

Matthews, D R; Hindmarsh, P C; Pringle, P J; Brook, C G

1991-09-01

To develop a method for quantifying the distribution of concentrations present in hormone profiles, which would allow an observer-unbiased estimate of the time concentration attribute and to make an assessment of the baseline. The log-transformed concentrations (regardless of their temporal attribute) are sorted and allocated to class intervals. The number of observations in each interval is then determined and expressed as a percentage of the total number of samples drawn in the study period. The data may be displayed as a frequency distribution or as a cumulative distribution. Cumulative distributions may be plotted as sigmoidal ogives or can be transformed into discrete probabilities (linear probits), which are then linear, and amenable to regression analysis. Probability analysis gives estimates of the mean (the value below which 50% of the observed concentrations lie, which we term 'OC50'). 'Baseline' can be defined in terms of percentage occupancy: the 'Observed Concentration for 5%' (which we term 'OC5') is the threshold at or below which the hormone concentrations are measured 5% of the time. We report the application of this method to 24-hour growth hormone (GH) profiles from 63 children, 26 adults and one giant. We demonstrate that GH effects (growth or gigantism) in these groups are more related to the baseline OC5 concentration than peak concentration (OC5 +/- 95% confidence limits: adults 0.05 +/- 0.04, peak-height-velocity pubertal 0.39 +/- 0.22, giant 8.9 mU/l). Pulsatile hormone profiles can be analysed using this method in order to assess baseline and other concentration domains.
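The probit step in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' procedure: instead of class intervals it assigns each sorted log concentration the plotting position (i + 0.5)/n, converts that to a probit with the standard normal inverse CDF, and fits a straight line by ordinary least squares. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def probit_fit(concentrations):
    """Least-squares line of probit(cumulative fraction) on log concentration."""
    logs = sorted(math.log(c) for c in concentrations)
    n = len(logs)
    std_normal = NormalDist()
    # Plotting position (i + 0.5) / n is an assumed convention.
    probits = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    mean_x = sum(logs) / n
    mean_y = sum(probits) / n
    sxx = sum((x - mean_x) ** 2 for x in logs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(logs, probits))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def observed_concentration(percent, slope, intercept):
    """Concentration at or below which `percent` of the samples lie (OC5, OC50, ...)."""
    z = NormalDist().inv_cdf(percent / 100.0)
    return math.exp((z - intercept) / slope)


if __name__ == "__main__":
    profile = [0.1, 0.2, 0.2, 0.4, 0.9, 2.5, 7.0, 12.0, 3.1, 0.6, 0.3, 0.15]
    slope, intercept = probit_fit(profile)
    print("OC5 :", observed_concentration(5, slope, intercept))
    print("OC50:", observed_concentration(50, slope, intercept))
```

Reading OC5 off the fitted line rather than taking the raw 5th percentile is what makes the estimate usable even when few samples fall near the baseline.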
ProbOnto: ontology and knowledge base of probability distributions.

PubMed

Swat, Maciej J; Grenon, Pierre; Wimalaratne, Sarala

2016-09-01

Probability distributions play a central role in mathematical and statistical modelling. The encoding, annotation and exchange of such models could be greatly simplified by a resource providing a common reference for the definition of probability distributions. Although some resources exist, no suitably detailed and complex ontology exists, nor any database allowing programmatic access. ProbOnto is an ontology-based knowledge base of probability distributions, featuring more than 80 uni- and multivariate distributions with their defining functions, characteristics, relationships and re-parameterization formulas. It can be used for model annotation and facilitates the encoding of distribution-based models, related functions and quantities. Availability: http://probonto.org. Contact: mjswat@ebi.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
The discrete Laplace exponential family and estimation of Y-STR haplotype frequencies.

PubMed

Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels

2013-07-21

Estimating haplotype frequencies is important in e.g. forensic genetics, where the frequencies are needed to calculate the likelihood ratio for the evidential weight of a DNA profile found at a crime scene. Estimation is naturally based on a population model, motivating the investigation of the Fisher-Wright model of evolution for haploid lineage DNA markers. An exponential family (a class of probability distributions that is well understood in probability theory such that inference is easily made by using existing software) called the 'discrete Laplace distribution' is described. We illustrate how well the discrete Laplace distribution approximates a more complicated distribution that arises by investigating the well-known population genetic Fisher-Wright model of evolution by a single-step mutation process. It was shown how the discrete Laplace distribution can be used to estimate haplotype frequencies for haploid lineage DNA markers (such as Y-chromosomal short tandem repeats), which in turn can be used to assess the evidential weight of a DNA profile found at a crime scene. This was done by making inference in a mixture of multivariate, marginally independent, discrete Laplace distributions using the EM algorithm to estimate the probabilities of membership of a set of unobserved subpopulations. The discrete Laplace distribution can be used to estimate haplotype frequencies with lower prediction error than other existing estimators. Furthermore, the calculations could be performed on a normal computer. This method was implemented in the freely available open source software R that is supported on Linux, MacOS and MS Windows. Copyright © 2013 Elsevier Ltd. All rights reserved.
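For orientation, the discrete Laplace probability mass function has the closed form f(x; p, μ) = (1 - p)/(1 + p) · p^|x - μ| for integer x and 0 < p < 1. The sketch below evaluates it and, assuming marginally independent loci within a single subpopulation as the abstract describes, multiplies the per-locus values to get a haplotype probability. It does not reproduce the EM mixture fit or the authors' R implementation, and the function names and example values are hypothetical.

```python
def discrete_laplace_pmf(x, p, mu=0):
    """P(X = x) for a discrete Laplace distribution with dispersion p, centre mu."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return (1.0 - p) / (1.0 + p) * p ** abs(x - mu)


def haplotype_probability(haplotype, centres, dispersions):
    """Probability of a haplotype in one subpopulation, loci taken as independent."""
    prob = 1.0
    for allele, mu, p in zip(haplotype, centres, dispersions):
        prob *= discrete_laplace_pmf(allele, p, mu)
    return prob


if __name__ == "__main__":
    # Three hypothetical Y-STR loci: observed repeat counts, central haplotype,
    # and per-locus dispersion parameters.
    print(haplotype_probability([14, 13, 30], [14, 12, 29], [0.2, 0.3, 0.25]))
```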
NEWTPOIS- NEWTON POISSON DISTRIBUTION PROGRAM

NASA Technical Reports Server (NTRS)

Bowerman, P. N.

1994-01-01

The cumulative Poisson distribution program, NEWTPOIS, is one of two programs which make calculations involving cumulative Poisson distributions. Both programs, NEWTPOIS (NPO-17715) and CUMPOIS (NPO-17714), can be used independently of one another. NEWTPOIS determines percentiles for gamma distributions with integer shape parameters and calculates percentiles for chi-square distributions with even degrees of freedom. It can be used by statisticians and others concerned with probabilities of independent events occurring over specific units of time, area, or volume. NEWTPOIS determines the Poisson parameter (lambda), that is, the mean (or expected) number of events occurring in a given unit of time, area, or space. Given that the user already knows the cumulative probability for a specific number of occurrences (n), it is usually a simple matter of substitution into the Poisson distribution summation to arrive at lambda. However, direct calculation of the Poisson parameter becomes difficult for small positive values of n and unmanageable for large values. NEWTPOIS uses Newton's iteration method to extract lambda from the initial value condition of the Poisson distribution where n=0, taking successive estimations until some user specified error term (epsilon) is reached. The NEWTPOIS program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly on most C compilers. The program format is interactive, accepting epsilon, n, and the cumulative probability of the occurrence of n as inputs. It has been implemented under DOS 3.2 and has a memory requirement of 30K. NEWTPOIS was developed in 1988.
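The idea of recovering lambda by Newton's iteration can be sketched as below. This is not the NEWTPOIS source (which is written in C) and makes its own assumptions: the starting guess n + 1, the step-halving safeguard against non-positive iterates, and the function names are all hypothetical. It relies on the identity d/dλ P(X ≤ n; λ) = -e^(-λ) λ^n / n!.

```python
import math


def poisson_cdf(n, lam):
    """P(X <= n) for a Poisson variable with mean lam."""
    term = math.exp(-lam)
    total = term
    for k in range(1, n + 1):
        term *= lam / k
        total += term
    return total


def solve_poisson_lambda(n, cum_prob, epsilon=1e-10, max_iter=100):
    """Find lam such that P(X <= n; lam) equals cum_prob, by Newton's method."""
    if not 0.0 < cum_prob < 1.0:
        raise ValueError("cum_prob must lie strictly between 0 and 1")
    lam = n + 1.0                                  # assumed starting guess
    for _ in range(max_iter):
        residual = poisson_cdf(n, lam) - cum_prob
        derivative = -math.exp(-lam) * lam ** n / math.factorial(n)
        new_lam = lam - residual / derivative
        if new_lam <= 0.0:                         # keep the iterate positive
            new_lam = lam / 2.0
        if abs(new_lam - lam) < epsilon:
            return new_lam
        lam = new_lam
    raise RuntimeError("Newton iteration did not converge")


if __name__ == "__main__":
    # Inputs mirror the program's description: epsilon, n, cumulative probability.
    print(solve_poisson_lambda(n=3, cum_prob=0.75, epsilon=1e-8))
```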
Time difference of arrival estimation of microseismic signals based on alpha-stable distribution

NASA Astrophysics Data System (ADS)

Jia, Rui-Sheng; Gong, Yue; Peng, Yan-Jun; Sun, Hong-Mei; Zhang, Xing-Li; Lu, Xin-Ming

2018-05-01

Microseismic signals are generally considered to follow the Gaussian distribution. A comparison of the dynamic characteristics of sample variance and the symmetry of microseismic signals with the signals which follow the α-stable distribution reveals that the microseismic signals have obvious pulse characteristics and that the probability density curve of the microseismic signal is approximately symmetric. Thus, the hypothesis that microseismic signals follow the symmetric α-stable distribution is proposed. On the premise of this hypothesis, the characteristic exponent α of the microseismic signals is obtained by utilizing the fractional low-order statistics, and then a new method of time difference of arrival (TDOA) estimation of microseismic signals based on fractional low-order covariance (FLOC) is proposed. Upon applying this method to the TDOA estimation of Ricker wavelet simulation signals and real microseismic signals, experimental results show that the FLOC method, which is based on the assumption of the symmetric α-stable distribution, leads to enhanced spatial resolution of the TDOA estimation relative to the generalized cross correlation (GCC) method, which is based on the assumption of the Gaussian distribution.
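A fractional low-order covariance delay estimate can be sketched as follows. The definition used here is one common form and is an assumption, not taken from the paper: R(m) is the sample mean of x[n]^<a> · y[n+m]^<b>, where z^<a> = |z|^a · sign(z) and the exponents a and b are small (conventionally below α/2). The estimated delay is the lag that maximizes R(m). The function names and the exponent value 0.4 are hypothetical.

```python
import math
import random


def signed_power(z, a):
    """z^<a> = |z|^a * sign(z), which compresses impulsive outliers."""
    return math.copysign(abs(z) ** a, z)


def floc(x, y, lag, a=0.4, b=0.4):
    """Sample fractional low-order covariance of x[n] and y[n + lag]."""
    pairs = [(x[i], y[i + lag]) for i in range(len(x)) if 0 <= i + lag < len(y)]
    return sum(signed_power(u, a) * signed_power(v, b) for u, v in pairs) / len(pairs)


def estimate_delay(x, y, max_lag, a=0.4, b=0.4):
    """Lag (in samples) of y relative to x that maximizes the FLOC."""
    return max(range(-max_lag, max_lag + 1), key=lambda m: floc(x, y, m, a, b))


if __name__ == "__main__":
    rng = random.Random(0)
    source = [rng.gauss(0.0, 1.0) for _ in range(400)]
    delayed = [0.0] * 7 + source[:-7]      # second sensor sees the signal 7 samples later
    print(estimate_delay(source, delayed, max_lag=20))
```

With a = b = 1 the same code reduces to an ordinary cross-correlation peak search, which is the Gaussian-assumption baseline the abstract compares against.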

A Bayesian inversion for slip distribution of 1 Apr 2007 Mw8.1 Solomon Islands Earthquake

NASA Astrophysics Data System (ADS)

Chen, T.; Luo, H.

2013-12-01

On 1 Apr 2007 the megathrust Mw8.1 Solomon Islands earthquake occurred in the southwest Pacific along the New Britain subduction zone. A total of 102 vertical displacement measurements over the southeastern end of the rupture zone, from two field surveys after this event, provide a unique constraint for slip distribution inversion. In conventional inversion methods (such as bounded variable least squares), the smoothing parameter that determines the relative weight placed on fitting the data versus smoothing the slip distribution is often selected subjectively at the bend of the trade-off curve. Here a fully probabilistic inversion method (Fukuda, 2008) is applied to estimate the distributed slip and the smoothing parameter objectively. The joint posterior probability density function of distributed slip and the smoothing parameter is formulated under a Bayesian framework and sampled with a Markov chain Monte Carlo method. We estimate the spatial distribution of dip slip associated with the 1 Apr 2007 Solomon Islands earthquake with this method. Early results show a shallower dip angle than previous studies and highly variable dip slip both along-strike and down-dip.
Method for removing atomic-model bias in macromolecular crystallography

DOEpatents

Terwilliger, Thomas C [Santa Fe, NM]

2006-08-01

Structure factor bias in an electron density map for an unknown crystallographic structure is minimized by using information in a first electron density map to elicit expected structure factor information. Observed structure factor amplitudes are combined with a starting set of crystallographic phases to form a first set of structure factors. A first electron density map is then derived and features of the first electron density map are identified to obtain expected distributions of electron density. Crystallographic phase probability distributions are established for possible crystallographic phases of reflection k, and the process is repeated as k is indexed through all of the plurality of reflections. An updated electron density map is derived from the crystallographic phase probability distributions for each one of the reflections. The entire process is then iterated to obtain a final set of crystallographic phases with minimum bias from known electron density maps.
The bingo model of survivorship: 1. probabilistic aspects.

PubMed

Murphy, E A; Trojak, J E; Hou, W; Rohde, C A

1981-01-01

A "bingo" model is one in which the pattern of survival of a system is determined by whichever of several components, each with its own particular distribution for survival, fails first. The model is motivated by the study of lifespan in animals. A number of properties of such systems are discussed in general. They include the use of a special criterion of skewness that probably corresponds more closely than traditional measures to what the eye observes in casually inspecting data. This criterion is the ratio, r(h), of the probability density at a point an arbitrary distance, h, above the mode to that an equal distance below the mode. If this ratio is positive for all positive arguments, the distribution is considered positively asymmetrical and conversely. Details of the bingo model are worked out for several types of base distributions: the rectangular, the triangular, the logistic, and by numerical methods, the normal, lognormal, and gamma.
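Two quantities from this abstract are easy to make concrete. The sketch below assumes the components fail independently (an assumption not stated in the abstract), so the system survival function is the product of the component survival functions, and it computes the skewness criterion r(h) as the density at a distance h above the mode divided by the density at the same distance below it. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def system_survival(t, component_survivals):
    """P(system survives past t) when the first component failure ends it.

    Assumes independent components, so the survival probabilities multiply.
    """
    prob = 1.0
    for survival in component_survivals:
        prob *= survival(t)
    return prob


def skewness_ratio(density, mode, h):
    """r(h): density at mode + h divided by density at mode - h."""
    return density(mode + h) / density(mode - h)


if __name__ == "__main__":
    # Two exponential components with failure rates 1 and 2.
    components = [lambda t: math.exp(-1.0 * t), lambda t: math.exp(-2.0 * t)]
    print(system_survival(0.5, components))
    # A symmetric density gives r(h) = 1 for every h.
    print(skewness_ratio(NormalDist(10.0, 2.0).pdf, 10.0, 1.5))
```

For the two exponential components the product is again exponential with rate 3, which gives a quick sanity check on the first function.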
PRODIGEN: visualizing the probability landscape of stochastic gene regulatory networks in state and time space.

PubMed

Ma, Chihua; Luciani, Timothy; Terebus, Anna; Liang, Jie; Marai, G Elisabeta

2017-02-15

Visualizing the complex probability landscape of stochastic gene regulatory networks can further biologists' understanding of phenotypic behavior associated with specific genes. We present PRODIGEN (PRObability DIstribution of GEne Networks), a web-based visual analysis tool for the systematic exploration of probability distributions over simulation time and state space in such networks. PRODIGEN was designed in collaboration with bioinformaticians who research stochastic gene networks. The analysis tool combines in a novel way existing, expanded, and new visual encodings to capture the time-varying characteristics of probability distributions: spaghetti plots over one dimensional projection, heatmaps of distributions over 2D projections, enhanced with overlaid time curves to display temporal changes, and novel individual glyphs of state information corresponding to particular peaks. We demonstrate the effectiveness of the tool through two case studies on the computed probabilistic landscape of a gene regulatory network and of a toggle-switch network. Domain expert feedback indicates that our visual approach can help biologists: 1) visualize probabilities of stable states, 2) explore the temporal probability distributions, and 3) discover small peaks in the probability landscape that have potential relation to specific diseases.
Aerosol-type retrieval and uncertainty quantification from OMI data

NASA Astrophysics Data System (ADS)

Kauppi, Anu; Kolmonen, Pekka; Laine, Marko; Tamminen, Johanna

2017-11-01

We discuss uncertainty quantification for aerosol-type selection in satellite-based atmospheric aerosol retrieval. The retrieval procedure uses precalculated aerosol microphysical models stored in look-up tables (LUTs) and top-of-atmosphere (TOA) spectral reflectance measurements to solve the aerosol characteristics. The forward model approximations cause systematic differences between the modelled and observed reflectance. Acknowledging this model discrepancy as a source of uncertainty allows us to produce more realistic uncertainty estimates and assists the selection of the most appropriate LUTs for each individual retrieval. This paper focuses on the aerosol microphysical model selection and characterisation of uncertainty in the retrieved aerosol type and aerosol optical depth (AOD). The concept of model evidence is used as a tool for model comparison. The method is based on Bayesian inference approach, in which all uncertainties are described as a posterior probability distribution. When there is no single best-matching aerosol microphysical model, we use a statistical technique based on Bayesian model averaging to combine AOD posterior probability densities of the best-fitting models to obtain an averaged AOD estimate. We also determine the shared evidence of the best-matching models of a certain main aerosol type in order to quantify how plausible it is that it represents the underlying atmospheric aerosol conditions. The developed method is applied to Ozone Monitoring Instrument (OMI) measurements using a multiwavelength approach for retrieving the aerosol type and AOD estimate with uncertainty quantification for cloud-free over-land pixels. Several larger pixel set areas were studied in order to investigate the robustness of the developed method. We evaluated the retrieved AOD by comparison with ground-based measurements at example sites.
We found that the uncertainty of AOD expressed by posterior probability distribution reflects the difficulty in model selection. The posterior probability distribution can provide a comprehensive characterisation of the uncertainty in this kind of problem for aerosol-type selection. As a result, the proposed method can account for the model error and also include the model selection uncertainty in the total uncertainty budget.
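A minimal sketch of Bayesian model averaging as outlined in this abstract, assuming equal prior probabilities for the candidate models so that the weights are proportional to the model evidences. The function names, the log-evidence values, and the tabulated densities are hypothetical and are not taken from the paper.

```python
import math


def evidence_weights(log_evidences):
    """Normalize model evidences to weights, assuming equal prior model probabilities."""
    top = max(log_evidences)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(le - top) for le in log_evidences]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def averaged_density(weights, densities):
    """Weighted pointwise mixture of per-model posterior densities on a shared grid."""
    n_points = len(densities[0])
    return [sum(w * d[i] for w, d in zip(weights, densities))
            for i in range(n_points)]


# Hypothetical log-evidences for three candidate aerosol models (made-up numbers).
log_ev = [-10.0, -10.0, -12.0]
weights = evidence_weights(log_ev)

# Hypothetical AOD posterior probabilities tabulated on the same three-point grid.
dens = [[0.2, 0.6, 0.2], [0.1, 0.5, 0.4], [0.3, 0.4, 0.3]]
mixture = averaged_density(weights, dens)
```

Models with equal evidence receive equal weight, and the averaged result remains a normalized distribution because the weights sum to one.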
Unsupervised, low latency anomaly detection of algorithmically generated domain names by generative probabilistic modeling.

PubMed

Raghuram, Jayaram; Miller, David J; Kesidis, George

2014-07-01

We propose a method for detecting anomalous domain names, with focus on algorithmically generated domain names which are frequently associated with malicious activities such as fast flux service networks, particularly for bot networks (or botnets), malware, and phishing. Our method is based on learning a (null hypothesis) probability model based on a large set of domain names that have been white listed by some reliable authority. Since these names are mostly assigned by humans, they are pronounceable, and tend to have a distribution of characters, words, word lengths, and number of words that are typical of some language (mostly English), and often consist of words drawn from a known lexicon. On the other hand, in the present day scenario, algorithmically generated domain names typically have distributions that are quite different from that of human-created domain names. We propose a fully generative model for the probability distribution of benign (white listed) domain names which can be used in an anomaly detection setting for identifying putative algorithmically generated domain names. Unlike other methods, our approach can make detections without considering any additional (latency producing) information sources, often used to detect fast flux activity. Experiments on a publicly available, large data set of domain names associated with fast flux service networks show encouraging results, relative to several baseline methods, with higher detection rates and low false positive rates.
Unsupervised, low latency anomaly detection of algorithmically generated domain names by generative probabilistic modeling

PubMed Central

Raghuram, Jayaram; Miller, David J.; Kesidis, George

2014-01-01

We propose a method for detecting anomalous domain names, with focus on algorithmically generated domain names which are frequently associated with malicious activities such as fast flux service networks, particularly for bot networks (or botnets), malware, and phishing. Our method is based on learning a (null hypothesis) probability model based on a large set of domain names that have been white listed by some reliable authority. Since these names are mostly assigned by humans, they are pronounceable, and tend to have a distribution of characters, words, word lengths, and number of words that are typical of some language (mostly English), and often consist of words drawn from a known lexicon. On the other hand, in the present day scenario, algorithmically generated domain names typically have distributions that are quite different from that of human-created domain names. We propose a fully generative model for the probability distribution of benign (white listed) domain names which can be used in an anomaly detection setting for identifying putative algorithmically generated domain names. Unlike other methods, our approach can make detections without considering any additional (latency producing) information sources, often used to detect fast flux activity. Experiments on a publicly available, large data set of domain names associated with fast flux service networks show encouraging results, relative to several baseline methods, with higher detection rates and low false positive rates. PMID:25685511
Distributed Synchronization in Networks of Agent Systems With Nonlinearities and Random Switchings.

PubMed

Tang, Yang; Gao, Huijun; Zou, Wei; Kurths, Jürgen

2013-02-01

In this paper, the distributed synchronization problem of networks of agent systems with controllers and nonlinearities subject to Bernoulli switchings is investigated. Controllers and adaptive updating laws injected in each vertex of networks depend on the state information of its neighborhood. Three sets of Bernoulli stochastic variables are introduced to describe the occurrence probabilities of distributed adaptive controllers, updating laws and nonlinearities, respectively. By the Lyapunov functions method, we show that the distributed synchronization of networks composed of agent systems with multiple randomly occurring nonlinearities, multiple randomly occurring controllers, and multiple randomly occurring updating laws can be achieved in mean square under certain criteria. The conditions derived in this paper can be solved by semi-definite programming. Moreover, by mathematical analysis, we find that the coupling strength, the probabilities of the Bernoulli stochastic variables, and the form of nonlinearities have great impacts on the convergence speed and the terminal control strength. The synchronization criteria and the observed phenomena are demonstrated by several numerical simulation examples. In addition, the advantage of distributed adaptive controllers over conventional adaptive controllers is illustrated.
Birth/birth-death processes and their computable transition probabilities with biological applications.

PubMed

Ho, Lam Si Tung; Xu, Jason; Crawford, Forrest W; Minin, Vladimir N; Suchard, Marc A

2018-03-01

Birth-death processes track the size of a univariate population, but many biological systems involve interaction between populations, necessitating models for two or more populations simultaneously. A lack of efficient methods for evaluating finite-time transition probabilities of bivariate processes, however, has restricted statistical inference in these models. Researchers rely on computationally expensive methods such as matrix exponentiation or Monte Carlo approximation, restricting likelihood-based inference to small systems, or indirect methods such as approximate Bayesian computation. In this paper, we introduce the birth/birth-death process, a tractable bivariate extension of the birth-death process, where rates are allowed to be nonlinear. We develop an efficient algorithm to calculate its transition probabilities using a continued fraction representation of their Laplace transforms. Next, we identify several exemplary models arising in molecular epidemiology, macro-parasite evolution, and infectious disease modeling that fall within this class, and demonstrate advantages of our proposed method over existing approaches to inference in these models. Notably, the ubiquitous stochastic susceptible-infectious-removed (SIR) model falls within this class, and we emphasize that computable transition probabilities newly enable direct inference of parameters in the SIR model. We also propose a very fast method for approximating the transition probabilities under the SIR model via a novel branching process simplification, and compare it to the continued fraction representation method with application to the 17th century plague in Eyam. Although the two methods produce similar maximum a posteriori estimates, the branching process approximation fails to capture the correlation structure in the joint posterior distribution.
Estimating site occupancy and detection probability parameters for meso- and large mammals in a coastal ecosystem

USGS Publications Warehouse

O'Connell, Allan F.; Talancy, Neil W.; Bailey, Larissa L.; Sauer, John R.; Cook, Robert; Gilbert, Andrew T.

2006-01-01

Large-scale, multispecies monitoring programs are widely used to assess changes in wildlife populations but they often assume constant detectability when documenting species occurrence. This assumption is rarely met in practice because animal populations vary across time and space. As a result, detectability of a species can be influenced by a number of physical, biological, or anthropogenic factors (e.g., weather, seasonality, topography, biological rhythms, sampling methods). To evaluate some of these influences, we estimated site occupancy rates using species-specific detection probabilities for meso- and large terrestrial mammal species on Cape Cod, Massachusetts, USA. We used model selection to assess the influence of different sampling methods and major environmental factors on our ability to detect individual species. Remote cameras detected the most species (9), followed by cubby boxes (7) and hair traps (4) over a 13-month period. Estimated site occupancy rates were similar among sampling methods for most species when detection probabilities exceeded 0.15, but we question estimates obtained from methods with detection probabilities between 0.05 and 0.15, and we consider methods with lower probabilities unacceptable for occupancy estimation and inference. Estimated detection probabilities can be used to accommodate variation in sampling methods, which allows for comparison of monitoring programs using different protocols. Vegetation and seasonality produced species-specific differences in detectability and occupancy, but differences were not consistent within or among species, which suggests that our results should be considered in the context of local habitat features and life history traits for the target species. We believe that site occupancy is a useful state variable and suggest that monitoring programs for mammals using occupancy data consider detectability prior to making inferences about species distributions or population change.
Electron emission produced by photointeractions in a slab target

NASA Technical Reports Server (NTRS)

Thinger, B. E.; Dayton, J. A., Jr.

1973-01-01

The current density and energy spectrum of escaping electrons generated in a uniform plane slab target which is being irradiated by the gamma flux field of a nuclear reactor are calculated by using experimental gamma energy transfer coefficients, electron range and energy relations, and escape probability computations. The probability of escape and the average path length of escaping electrons are derived for an isotropic distribution of monoenergetic photons. The method of estimating the flux and energy distribution of electrons emerging from the surface is outlined, and a sample calculation is made for a 0.33-cm-thick tungsten target located next to the core of a nuclear reactor. The results are to be used as a guide in electron beam synthesis of reactor experiments.
Safety assessment of a shallow foundation using the random finite element method

NASA Astrophysics Data System (ADS)

Zaskórski, Łukasz; Puła, Wojciech

2015-04-01

The complex structure of soil and its random character make soil modeling a cumbersome task. Heterogeneity of soil has to be considered even within a homogeneous layer. Estimating the shear strength parameters of soil for a geotechnical analysis is therefore problematic. The applicable standard (Eurocode 7) presents no explicit method for evaluating characteristic values of soil parameters; it gives only general guidelines on how these values should be estimated. Hence many approaches to assessing characteristic values of soil parameters are presented in the literature and can be applied in practice. In this paper, the reliability assessment of a shallow strip footing was conducted using a reliability index β. Several approaches to estimating characteristic values of soil properties were compared by evaluating the values of the reliability index β that each of them achieves. The method of Orr and Breysse, Duncan's method, Schneider's method, Schneider's method accounting for the influence of fluctuation scales, and the method included in Eurocode 7 were examined. Design values of the bearing capacity based on these approaches were related to the stochastic bearing capacity estimated by the random finite element method (RFEM). Design values of the bearing capacity were computed for various widths and depths of a foundation in conjunction with the design approaches (DA) defined in Eurocode. RFEM was presented by Griffiths and Fenton (1993). It combines the deterministic finite element method, random field theory, and Monte Carlo simulations. Random field theory makes it possible to consider the random character of soil parameters within a homogeneous layer of soil. For this purpose a soil property is treated as a separate random variable in every element of the finite element mesh, with a proper correlation structure between points of a given area.
RFEM was applied to estimate which theoretical probability distribution fits the empirical probability distribution of bearing capacity, based on 3000 realizations. The assessed probability distribution was applied to compute design values of the bearing capacity and the related reliability indices β. The analyses were carried out for a cohesive soil. Hence the friction angle and the cohesion were defined as random parameters and characterized by two-dimensional random fields. The friction angle was described by a bounded distribution, as it varies within a limited range, while a lognormal distribution was applied to the cohesion. Other properties (Young's modulus, Poisson's ratio, and unit weight) were assumed to be deterministic because they have a negligible influence on the stochastic bearing capacity. Griffiths D. V., & Fenton G. A. (1993). Seepage beneath water retaining structures founded on spatially random soil. Géotechnique, 43(6), 577-587.
Computing under-ice discharge: A proof-of-concept using hydroacoustics and the Probability Concept

NASA Astrophysics Data System (ADS)

Fulton, John W.; Henneberg, Mark F.; Mills, Taylor J.; Kohn, Michael S.; Epstein, Brian; Hittle, Elizabeth A.; Damschen, William C.; Laveau, Christopher D.; Lambrecht, Jason M.; Farmer, William H.

2018-07-01

Under-ice discharge is estimated using open-water reference hydrographs; however, the ratings for ice-affected sites are generally qualified as poor. The U.S. Geological Survey (USGS), in collaboration with the Colorado Water Conservation Board, conducted a proof-of-concept to develop an alternative method for computing under-ice discharge using hydroacoustics and the Probability Concept. The study site was located south of Minturn, Colorado (CO), USA, and was selected because of (1) its proximity to the existing USGS streamgage 09064600 Eagle River near Minturn, CO, and (2) its ease-of-access to verify discharge using a variety of conventional methods. From late September 2014 to early March 2015, hydraulic conditions varied from open water to under ice. These temporal changes led to variations in water depth and velocity. Hydroacoustics (tethered and uplooking acoustic Doppler current profilers and acoustic Doppler velocimeters) were deployed to measure the vertical-velocity profile at a singularly important vertical of the channel-cross section. Because the velocity profile was non-standard and cannot be characterized using a Power Law or Log Law, velocity data were analyzed using the Probability Concept, which is a probabilistic formulation of the velocity distribution. The Probability Concept-derived discharge was compared to conventional methods including stage-discharge and index-velocity ratings and concurrent field measurements; each is complicated by the dynamics of ice formation, pressure influences on stage measurements, and variations in cross-sectional area due to ice formation. No particular discharge method was assigned as truth. Rather one statistical metric (Kolmogorov-Smirnov; KS), agreement plots, and concurrent measurements provided a measure of comparability between various methods. 
Regardless of the method employed, comparisons between each method revealed encouraging results depending on the flow conditions and the absence or presence of ice cover. For example, during lower discharges dominated by under-ice and transition (intermittent open-water and under-ice) conditions, the KS metric suggests there is not sufficient information to reject the null hypothesis and implies that the Probability Concept and index-velocity rating represent similar distributions. During high-flow, open-water conditions, the comparisons are less definitive; therefore, it is important that the appropriate analytical method and instrumentation be selected. Six conventional discharge measurements were collected concurrently with Probability Concept-derived discharges with percent differences (%) of -9.0%, -21%, -8.6%, 17.8%, 3.6%, and -2.3%. This proof-of-concept demonstrates that riverine discharges can be computed using the Probability Concept for a range of hydraulic extremes (variations in discharge, open-water and under-ice conditions) immediately after the siting phase is complete, which typically requires one day. Computing real-time discharges is particularly important at sites, where (1) new streamgages are planned, (2) river hydraulics are complex, and (3) shifts in the stage-discharge rating are needed to correct the streamflow record. Use of the Probability Concept does not preclude the need to maintain a stage-area relation. Both the Probability Concept and index-velocity rating offer water-resource managers and decision makers alternatives for computing real-time discharge for open-water and under-ice conditions.
Computing under-ice discharge: A proof-of-concept using hydroacoustics and the Probability Concept

USGS Publications Warehouse

Fulton, John W.; Henneberg, Mark F.; Mills, Taylor J.; Kohn, Michael S.; Epstein, Brian; Hittle, Elizabeth A.; Damschen, William C.; Laveau, Christopher D.; Lambrecht, Jason M.; Farmer, William H.

2018-01-01

Under-ice discharge is estimated using open-water reference hydrographs; however, the ratings for ice-affected sites are generally qualified as poor. The U.S. Geological Survey (USGS), in collaboration with the Colorado Water Conservation Board, conducted a proof-of-concept to develop an alternative method for computing under-ice discharge using hydroacoustics and the Probability Concept. The study site was located south of Minturn, Colorado (CO), USA, and was selected because of (1) its proximity to the existing USGS streamgage 09064600 Eagle River near Minturn, CO, and (2) its ease-of-access to verify discharge using a variety of conventional methods. From late September 2014 to early March 2015, hydraulic conditions varied from open water to under ice. These temporal changes led to variations in water depth and velocity. Hydroacoustics (tethered and uplooking acoustic Doppler current profilers and acoustic Doppler velocimeters) were deployed to measure the vertical-velocity profile at a singularly important vertical of the channel-cross section. Because the velocity profile was non-standard and cannot be characterized using a Power Law or Log Law, velocity data were analyzed using the Probability Concept, which is a probabilistic formulation of the velocity distribution. The Probability Concept-derived discharge was compared to conventional methods including stage-discharge and index-velocity ratings and concurrent field measurements; each is complicated by the dynamics of ice formation, pressure influences on stage measurements, and variations in cross-sectional area due to ice formation. No particular discharge method was assigned as truth. Rather one statistical metric (Kolmogorov-Smirnov; KS), agreement plots, and concurrent measurements provided a measure of comparability between various methods.
Regardless of the method employed, comparisons between each method revealed encouraging results depending on the flow conditions and the absence or presence of ice cover. For example, during lower discharges dominated by under-ice and transition (intermittent open-water and under-ice) conditions, the KS metric suggests there is not sufficient information to reject the null hypothesis and implies that the Probability Concept and index-velocity rating represent similar distributions. During high-flow, open-water conditions, the comparisons are less definitive; therefore, it is important that the appropriate analytical method and instrumentation be selected. Six conventional discharge measurements were collected concurrently with Probability Concept-derived discharges with percent differences (%) of −9.0%, −21%, −8.6%, 17.8%, 3.6%, and −2.3%. This proof-of-concept demonstrates that riverine discharges can be computed using the Probability Concept for a range of hydraulic extremes (variations in discharge, open-water and under-ice conditions) immediately after the siting phase is complete, which typically requires one day. Computing real-time discharges is particularly important at sites, where (1) new streamgages are planned, (2) river hydraulics are complex, and (3) shifts in the stage-discharge rating are needed to correct the streamflow record. Use of the Probability Concept does not preclude the need to maintain a stage-area relation. Both the Probability Concept and index-velocity rating offer water-resource managers and decision makers alternatives for computing real-time discharge for open-water and under-ice conditions.
Environmental DNA sampling is more sensitive than a traditional survey technique for detecting an aquatic invader.

PubMed

Smart, Adam S; Tingley, Reid; Weeks, Andrew R; van Rooyen, Anthony R; McCarthy, Michael A

2015-10-01

Effective management of alien species requires detecting populations in the early stages of invasion. Environmental DNA (eDNA) sampling can detect aquatic species at relatively low densities, but few studies have directly compared detection probabilities of eDNA sampling with those of traditional sampling methods. We compare the ability of a traditional sampling technique (bottle trapping) and eDNA to detect a recently established invader, the smooth newt Lissotriton vulgaris vulgaris, at seven field sites in Melbourne, Australia. Over a four-month period, per-trap detection probabilities ranged from 0.01 to 0.26 among sites where L. v. vulgaris was detected, whereas per-sample eDNA estimates were much higher (0.29-1.0). Detection probabilities of both methods varied temporally (across days and months), but temporal variation appeared to be uncorrelated between methods. Only estimates of spatial variation were strongly correlated across the two sampling techniques. Environmental variables (water depth, rainfall, ambient temperature) were not clearly correlated with detection probabilities estimated via trapping, whereas eDNA detection probabilities were negatively correlated with water depth, possibly reflecting higher eDNA concentrations at lower water levels. Our findings demonstrate that eDNA sampling can be an order of magnitude more sensitive than traditional methods, and illustrate that traditional- and eDNA-based surveys can provide independent information on species distributions when occupancy surveys are conducted over short timescales.
Incorporating Skew into RMS Surface Roughness Probability Distribution

NASA Technical Reports Server (NTRS)

Stahl, Mark T.; Stahl, H. Philip.

2013-01-01

The standard treatment of RMS surface roughness data is the application of a Gaussian probability distribution. This handling of surface roughness ignores the skew present in the surface and overestimates the most probable RMS of the surface, the mode. Using experimental data we confirm the Gaussian distribution overestimates the mode and application of an asymmetric distribution provides a better fit. Implementing the proposed asymmetric distribution into the optical manufacturing process would reduce the polishing time required to meet surface roughness specifications.
Software Supportability Risk Assessment in OT&E (Operational Test and Evaluation): Literature Review, Current Research Review, and Data Base Assemblage.

DTIC Science & Technology

1984-09-28

variables before simulation of model - Search for reality checks - Express uncertainty as a probability density distribution... probability that the software contains errors. This prior is updated as test failure data are accumulated. Only a p of 1 (software known to contain...discussed; both parametric and nonparametric versions are presented. It is shown by the author that the bootstrap underlies the jackknife method and
Intertime jump statistics of state-dependent Poisson processes.

PubMed

Daly, Edoardo; Porporato, Amilcare

2007-01-01

A method to obtain the probability distribution of the interarrival times of jump occurrences in systems driven by state-dependent Poisson noise is proposed. Such a method uses the survivor function obtained by a modified version of the master equation associated to the stochastic process under analysis. A model for the timing of human activities shows the capability of state-dependent Poisson noise to generate power-law distributions. The application of the method to a model for neuron dynamics and to a hydrological model accounting for land-atmosphere interaction elucidates the origin of characteristic recurrence intervals and possible persistence in state-dependent Poisson models.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Langan, Roisin T.; Archibald, Richard K.; Lamberti, Vincent

We have applied a new imputation-based method for analyzing incomplete data, called Monte Carlo Bayesian Database Generation (MCBDG), to the Spent Fuel Isotopic Composition (SFCOMPO) database. About 60% of the entries are absent for SFCOMPO. The method estimates missing values of a property from a probability distribution created from the existing data for the property, and then generates multiple instances of the completed database for training a machine learning algorithm. Uncertainty in the data is represented by an empirical or an assumed error distribution. The method makes few assumptions about the underlying data, and compares favorably against results obtained by replacing missing information with constant values.
Kolmogorov-Smirnov test for spatially correlated data

USGS Publications Warehouse

Olea, R.A.; Pawlowsky-Glahn, V.

2009-01-01

The Kolmogorov-Smirnov test is a convenient method for investigating whether two underlying univariate probability distributions can be regarded as indistinguishable from each other or whether an underlying probability distribution differs from a hypothesized distribution. Application of the test requires that the sample be unbiased and the outcomes be independent and identically distributed, conditions that are violated in several degrees by spatially continuous attributes, such as topographical elevation. A generalized form of the bootstrap method is used here for the purpose of modeling the distribution of the statistic D of the Kolmogorov-Smirnov test. The innovation is in the resampling, which in the traditional formulation of bootstrap is done by drawing from the empirical sample with replacement presuming independence. The generalization consists of preparing resamplings with the same spatial correlation as the empirical sample. This is accomplished by reading the value of unconditional stochastic realizations at the sampling locations, realizations that are generated by simulated annealing. The new approach was tested by two empirical samples taken from an exhaustive sample closely following a lognormal distribution. One sample was a regular, unbiased sample while the other one was a clustered, preferential sample that had to be preprocessed. Our results show that the p-value for the spatially correlated case is always larger than the p-value of the statistic in the absence of spatial correlation, which is in agreement with the fact that the information content of an uncorrelated sample is larger than the one for a spatially correlated sample of the same size. © Springer-Verlag 2008.
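For orientation, the statistic D mentioned in this abstract is the largest vertical gap between two empirical cumulative distribution functions. The sketch below computes the plain two-sample D only; it does not implement the spatially correlated bootstrap of the paper, and the function name is an assumption.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: maximum gap between the empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for value in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, value) / len(a)
        cdf_b = bisect.bisect_right(b, value) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


d_same = ks_statistic([1, 2, 3], [1, 2, 3])
d_shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
d_disjoint = ks_statistic([1, 2], [3, 4])
```

Identical samples give D = 0 and fully separated samples give D = 1. Converting D to a p-value under spatial correlation is the part that requires the resampling scheme described in the abstract.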

Latin Hypercube Sampling (LHS) UNIX Library/Standalone

DOE Office of Scientific and Technical Information (OSTI.GOV)

2004-05-13

The LHS UNIX Library/Standalone software provides the capability to draw random samples from over 30 distribution types. It performs the sampling by a stratified sampling method called Latin Hypercube Sampling (LHS). Multiple distributions can be sampled simultaneously, with user-specified correlations amongst the input distributions; LHS UNIX Library/Standalone provides a way to generate multi-variate samples. The LHS samples can be generated either as a callable library (e.g., from within the DAKOTA software framework) or as a standalone capability. LHS UNIX Library/Standalone uses the Latin Hypercube Sampling method (LHS) to generate samples. LHS is a constrained Monte Carlo sampling scheme. In LHS, the range of each variable is divided into non-overlapping intervals on the basis of equal probability. A sample is selected at random with respect to the probability density in each interval. If multiple variables are sampled simultaneously, then values obtained for each are paired in a random manner with the n values of the other variables. In some cases, the pairing is restricted to obtain specified correlations amongst the input variables. Many simulation codes have input parameters that are uncertain and can be specified by a distribution. To perform uncertainty analysis and sensitivity analysis, random values are drawn from the input parameter distributions, and the simulation is run with these values to obtain output values. If this is done repeatedly, with many input samples drawn, one can build up a distribution of the output as well as examine correlations between input and output variables.
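The sampling scheme described in this record can be sketched in a few lines. This is an illustrative stand-alone version, not the LHS library's interface: the function names are assumptions, the pairing across variables is purely random (no restricted pairing for target correlations), and the exponential distribution is only an example of mapping through an inverse CDF.

```python
import math
import random


def latin_hypercube(n_samples, n_vars, rng):
    """Points in the unit hypercube with one point per equal-probability interval per variable."""
    columns = []
    for _ in range(n_vars):
        # One uniform draw inside each of the n_samples equal-probability intervals.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)  # random pairing with the values of the other variables
        columns.append(column)
    return [tuple(column[i] for column in columns) for i in range(n_samples)]


def to_exponential(u, rate):
    """Inverse-CDF transform of a uniform value u in [0, 1) to an exponential variate."""
    return -math.log(1.0 - u) / rate


rng = random.Random(0)  # fixed seed so the sketch is reproducible
unit_points = latin_hypercube(10, 2, rng)
# Hypothetical inputs: first variable exponential with rate 2, second left uniform.
samples = [(to_exponential(u, 2.0), v) for u, v in unit_points]
```

Each variable's ten values fall in ten different equal-probability intervals, which is the stratification property that distinguishes LHS from simple random sampling.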
Statistical methods of fracture characterization using acoustic borehole televiewer log interpretation

NASA Astrophysics Data System (ADS)

Massiot, Cécile; Townend, John; Nicol, Andrew; McNamara, David D.

2017-08-01

Acoustic borehole televiewer (BHTV) logs provide measurements of fracture attributes (orientations, thickness, and spacing) at depth. Orientation, censoring, and truncation sampling biases similar to those described for one-dimensional outcrop scanlines, and other logging or drilling artifacts specific to BHTV logs, can affect the interpretation of fracture attributes from BHTV logs. K-means, fuzzy K-means, and agglomerative clustering methods provide transparent means of separating fracture groups on the basis of their orientation. Fracture spacing is calculated for each of these fracture sets. Maximum likelihood estimation using truncated distributions permits the fitting of several probability distributions to the fracture attribute data sets within truncation limits, which can then be extrapolated over the entire range where they naturally occur. Akaike Information Criterion (AIC) and Schwarz Bayesian Criterion (SBC) statistical information criteria rank the distributions by how well they fit the data. We demonstrate these attribute analysis methods with a data set derived from three BHTV logs acquired from the high-temperature Rotokawa geothermal field, New Zealand. Varying BHTV log quality reduces the number of input data points, but careful selection of the quality levels where fractures are deemed fully sampled increases the reliability of the analysis. Spacing data analysis comprising up to 300 data points and spanning three orders of magnitude can be approximated similarly well (similar AIC rankings) with several distributions. Several clustering configurations and probability distributions can often characterize the data at similar levels of statistical criteria. Thus, several scenarios should be considered when using BHTV log data to constrain numerical fracture models.
Fission and quasifission of composite systems with Z = 108-120: Transition from heavy-ion reactions involving S and Ca to Ti and Ni ions

NASA Astrophysics Data System (ADS)

Kozulin, E. M.; Knyazheva, G. N.; Novikov, K. V.; Itkis, I. M.; Itkis, M. G.; Dmitriev, S. N.; Oganessian, Yu. Ts.; Bogachev, A. A.; Kozulina, N. I.; Harca, I.; Trzaska, W. H.; Ghosh, T. K.

2016-11-01

Background: Suppression of compound nucleus formation in the reactions with heavy ions by a quasifission process in dependence on the reaction entrance channel. Purpose: Investigation of fission and quasifission processes in the reactions 36S, 48Ca, 48Ti, and 64Ni + 238U at energies around the Coulomb barrier. Methods: Mass-energy distributions of fissionlike fragments formed in the reaction 48Ti + 238U at energies of 247, 258, and 271 MeV have been measured using the double-arm time-of-flight spectrometer CORSET at the U400 cyclotron of the Flerov Laboratory of Nuclear Reactions and compared with mass-energy distributions for the reactions 36S, 48Ca, 64Ni + 238U. Results: The most probable fragment masses as well as total kinetic energies and their dispersions in dependence on the interaction energies have been investigated for asymmetric and symmetric fragments for the studied reactions. The fusion probabilities have been deduced from the analysis of mass-energy distributions. Conclusion: The estimated fusion probability for the reactions of S, Ca, Ti, and Ni ions with actinide nuclei shows that it depends exponentially on the mean fissility parameter of the system. For the reactions with actinide nuclei leading to the formation of superheavy elements the fusion probabilities are several orders of magnitude higher than in the case of cold fusion reactions.
A Gaussian Model-Based Probabilistic Approach for Pulse Transit Time Estimation.

PubMed

Jang, Dae-Geun; Park, Seung-Hun; Hahn, Minsoo

2016-01-01

In this paper, we propose a new probabilistic approach to pulse transit time (PTT) estimation using a Gaussian distribution model. It is motivated basically by the hypothesis that PTTs normalized by RR intervals follow the Gaussian distribution. To verify the hypothesis, we demonstrate the effects of arterial compliance on the normalized PTTs using the Moens-Korteweg equation. Furthermore, we observe a Gaussian distribution of the normalized PTTs on real data. In order to estimate the PTT using the hypothesis, we first assumed that R-waves in the electrocardiogram (ECG) can be correctly identified. The R-waves limit searching ranges to detect pulse peaks in the photoplethysmogram (PPG) and to synchronize the results with cardiac beats--i.e., the peaks of the PPG are extracted within the corresponding RR interval of the ECG as pulse peak candidates. Their probabilities of being the actual pulse peak are then calculated using a Gaussian probability function. The parameters of the Gaussian function are automatically updated when a new pulse peak is identified. This update makes the probability function adaptive to variations of cardiac cycles. Finally, the pulse peak is identified as the candidate with the highest probability. The proposed approach is tested on a database where ECG and PPG waveforms are collected simultaneously during the submaximal bicycle ergometer exercise test. The results are promising, suggesting that the method provides a simple but more accurate PTT estimation in real applications.
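The candidate-scoring and adaptive-update steps can be sketched as follows. This is only an illustration of the idea: the abstract does not give the update rule, so the exponentially weighted update, the forgetting factor `alpha`, the class name, and the initial parameters below are all assumptions.

```python
import math


def gaussian_pdf(x, mean, std):
    """Gaussian probability density at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


class PulsePeakSelector:
    """Pick the PPG peak candidate whose normalized PTT is most probable."""

    def __init__(self, mean, std, alpha=0.1):
        self.mean = mean    # mean of PTT / RR interval
        self.std = std      # standard deviation of PTT / RR interval
        self.alpha = alpha  # hypothetical forgetting factor for the update

    def select(self, r_time, rr_interval, candidate_times):
        # Normalize each candidate's transit time by the RR interval.
        normalized = [(t - r_time) / rr_interval for t in candidate_times]
        scores = [gaussian_pdf(x, self.mean, self.std) for x in normalized]
        best = max(range(len(scores)), key=scores.__getitem__)
        self._update(normalized[best])
        return candidate_times[best]

    def _update(self, x):
        # Exponentially weighted mean and variance (an assumed update rule).
        delta = x - self.mean
        self.mean += self.alpha * delta
        variance = (1.0 - self.alpha) * (self.std ** 2 + self.alpha * delta ** 2)
        self.std = math.sqrt(variance)


selector = PulsePeakSelector(mean=0.3, std=0.05)
chosen = selector.select(r_time=10.0, rr_interval=0.8,
                         candidate_times=[10.1, 10.25, 10.6])
print(chosen, selector.mean, selector.std)
```

Updating after every beat lets the probability function follow gradual changes in the cardiac cycle, at the cost of also following an occasional wrong pick; a smaller `alpha` trades adaptivity for stability.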
A method to estimate stellar ages from kinematical data

NASA Astrophysics Data System (ADS)

Almeida-Fernandes, F.; Rocha-Pinto, H. J.

2018-05-01

We present a method to build a probability density function (PDF) for the age of a star based on its peculiar velocities U, V, and W and its orbital eccentricity. The sample used in this work comes from the Geneva-Copenhagen Survey (GCS) that contains the spatial velocities, orbital eccentricities, and isochronal ages for about 14 000 stars. Using the GCS stars, we fitted the parameters that describe the relations between the distributions of kinematical properties and age. This parametrization allows us to obtain an age probability from the kinematical data. From this age PDF, we estimate an individual average age for the star using the most likely age and the expected age. We have obtained the age PDF for 9102 stars from the GCS and have shown that the distribution of individual ages derived from our method is in good agreement with the distribution of isochronal ages. We also observe a decline in the mean metallicity with our ages for stars younger than 7 Gyr, similar to the one observed for isochronal ages. This method can be useful for the estimation of rough stellar ages for those stars that fall in areas of the Hertzsprung-Russell diagram where isochrones are tightly crowded. As an example of this method, we estimate the age of Trappist-1, which is an M8V star, obtaining an age of t(UVW) = 12.50 (+0.29, -6.23) Gyr.
Vertical changes in the probability distribution of downward irradiance within the near-surface ocean under sunny conditions

NASA Astrophysics Data System (ADS)

Gernez, Pierre; Stramski, Dariusz; Darecki, Miroslaw

2011-07-01

Time series measurements of fluctuations in underwater downward irradiance, Ed, within the green spectral band (532 nm) show that the probability distribution of instantaneous irradiance varies greatly as a function of depth within the near-surface ocean under sunny conditions. Because of intense light flashes caused by surface wave focusing, the near-surface probability distributions are highly skewed to the right and are heavy tailed. The coefficients of skewness and excess kurtosis at depths smaller than 1 m can exceed 3 and 20, respectively. We tested several probability models, such as lognormal, Gumbel, Fréchet, log-logistic, and Pareto, which are potentially suited to describe the highly skewed heavy-tailed distributions. We found that the models cannot approximate with consistently good accuracy the high irradiance values within the right tail of the experimental distribution where the probability of these values is less than 10%. This portion of the distribution corresponds approximately to light flashes with Ed > 1.5 Ēd, where Ēd is the time-averaged downward irradiance. However, the remaining part of the probability distribution covering all irradiance values smaller than the 90th percentile can be described with a reasonable accuracy (i.e., within 20%) with a lognormal model for all 86 measurements from the top 10 m of the ocean included in this analysis. As the intensity of irradiance fluctuations decreases with depth, the probability distribution tends toward a function symmetrical around the mean like the normal distribution. For the examined data set, the skewness and excess kurtosis assumed values very close to zero at a depth of about 10 m.
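The two shape coefficients quoted above are standardized third and fourth central moments. A minimal sketch of how they might be computed from a time series follows; the lognormal series is synthetic stand-in data, not the measurements from the study, and the function name is an assumption.

```python
import random


def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    # Excess kurtosis subtracts 3 so that a normal distribution scores zero.
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


rng = random.Random(42)
# Synthetic right-skewed series standing in for near-surface irradiance.
series = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]
skew, kurt = skewness_and_excess_kurtosis(series)
print(skew, kurt)
```

Both coefficients approach zero for a distribution that is symmetric around the mean with normal-like tails, which is the behavior the abstract reports at a depth of about 10 m.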
An application of the Krylov-FSP-SSA method to parameter fitting with maximum likelihood

NASA Astrophysics Data System (ADS)

Dinh, Khanh N.; Sidje, Roger B.

2017-12-01

Monte Carlo methods such as the stochastic simulation algorithm (SSA) have traditionally been employed in gene regulation problems. However, there has been increasing interest in directly obtaining the probability distribution of the molecules involved by solving the chemical master equation (CME). This requires addressing the curse of dimensionality that is inherent in most gene regulation problems. The finite state projection (FSP) seeks to address the challenge and there have been variants that further reduce the size of the projection or that accelerate the resulting matrix exponential. The Krylov-FSP-SSA variant has proved numerically efficient by combining, on one hand, the SSA to adaptively drive the FSP, and on the other hand, adaptive Krylov techniques to evaluate the matrix exponential. Here we apply this Krylov-FSP-SSA to a mutual inhibitory gene network synthetically engineered in Saccharomyces cerevisiae, in which bimodality arises. We show numerically that the approach can efficiently approximate the transient probability distribution, and this has important implications for parameter fitting, where the CME has to be solved for many different parameter sets. The fitting scheme amounts to an optimization problem of finding the parameter set so that the transient probability distributions fit the observations with maximum likelihood. We compare five optimization schemes for this difficult problem, thereby providing further insights into this approach of parameter estimation that is often applied to models in systems biology where there is a need to calibrate free parameters. Work supported by NSF grant DMS-1320849.
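For orientation, the plain SSA that the abstract takes as its starting point can be sketched on a toy birth-death process. This is not the Krylov-FSP-SSA method and not the yeast network from the study; the reaction system, rate constants, and function name are assumptions chosen so that the result is easy to check.

```python
import random


def ssa_birth_death(k_prod, k_deg, x0, t_end, rng):
    """Gillespie SSA for production at rate k_prod and decay at rate k_deg * x.

    Returns the molecule count at time t_end.
    """
    t, x = 0.0, x0
    while True:
        a_prod = k_prod
        a_deg = k_deg * x
        a_total = a_prod + a_deg
        if a_total == 0.0:
            return x  # no reaction can fire any more
        # The waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            return x
        # Choose the reaction in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1


rng = random.Random(7)
samples = [ssa_birth_death(10.0, 1.0, 0, 10.0, rng) for _ in range(1000)]
print(sum(samples) / len(samples))  # close to k_prod / k_deg for large t_end
```

Estimating a transient distribution this way needs many trajectories per parameter set, which is the cost that motivates solving the CME directly when the same model must be evaluated repeatedly during fitting.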
Fluctuations of thermodynamic quantities calculated from the fundamental equation of thermodynamics

NASA Astrophysics Data System (ADS)

Yan, Zijun; Chen, Jincan

1992-02-01

On the basis of the probability distribution of the various values of the fluctuation and the fundamental equation of thermodynamics of any given system, a simple and useful method of calculating the fluctuations is presented. By using the method, the fluctuations of thermodynamic quantities can be directly determined from the fundamental equation of thermodynamics. Finally, some examples are given to illustrate the use of the method.
How might Model-based Probabilities Extracted from Imperfect Models Guide Rational Decisions: The Case for non-probabilistic odds

NASA Astrophysics Data System (ADS)

Smith, Leonard A.

2010-05-01

This contribution concerns "deep" or "second-order" uncertainty, such as the uncertainty in our probability forecasts themselves. It asks the question: "Is it rational to take (or offer) bets using model-based probabilities as if they were objective probabilities?" If not, what alternative approaches for determining odds, perhaps non-probabilistic odds, might prove useful in practice, given the fact we know our models are imperfect? We consider the case where the aim is to provide sustainable odds: not to produce a profit but merely to rationally expect to break even in the long run. In other words, to run a quantified risk of ruin that is relatively small. Thus the cooperative insurance schemes of coastal villages provide a more appropriate parallel than a casino. A "better" probability forecast would lead to lower premiums charged and less volatile fluctuations in the cash reserves of the village. Note that the Bayesian paradigm does not constrain one to interpret model distributions as subjective probabilities, unless one believes the model to be empirically adequate for the task at hand. In geophysics, this is rarely the case. When a probability forecast is interpreted as the objective probability of an event, the odds on that event can be easily computed as one divided by the probability of the event, and one need not favour taking either side of the wager. (Here we are using "odds-for" not "odds-to", the difference being whether or not the stake is returned; odds of one to one are equivalent to odds of two for one.) The critical question is how to compute sustainable odds based on information from imperfect models. We suggest that this breaks the symmetry between the odds-on an event and the odds-against it. While a probability distribution can always be translated into odds, interpreting the odds on a set of events might result in "implied-probabilities" that sum to more than one. And/or the set of odds may be incomplete, not covering all events.
We ask whether or not probabilities based on imperfect models can be expected to yield probabilistic odds which are sustainable. Evidence is provided that suggests this is not the case. Even with very good models (good in a root-mean-square sense), the risk of ruin of probabilistic odds is significantly higher than might be expected. Methods for constructing model-based non-probabilistic odds which are sustainable are discussed. The aim here is to be relevant to real world decision support, and so unrealistic assumptions of equal knowledge, equal compute power, or equal access to information are to be avoided. Finally, the use of non-probabilistic odds as a method for communicating deep uncertainty (uncertainty in a probability forecast itself) is discussed in the context of other methods, such as stating one's subjective probability that the models will prove inadequate in each particular instance (that is, the Probability of a "Big Surprise").
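The arithmetic behind the odds conventions used in this abstract is short enough to write out. The sketch below shows odds-for as the reciprocal of a probability, the conversion to odds-to, and how a set of quoted odds can imply probabilities that sum to more than one; the quoted odds are made-up numbers for illustration.

```python
def odds_for(probability):
    """Odds-for: total payout per unit staked, stake included."""
    return 1.0 / probability


def odds_for_to_odds_to(odds):
    """Odds-to exclude the returned stake, so two for one equals one to one."""
    return odds - 1.0


def implied_probabilities(odds_list):
    """Probabilities implied by a set of odds-for, one per event."""
    return [1.0 / o for o in odds_list]


print(odds_for(0.5))             # 2.0, i.e. "two for one"
print(odds_for_to_odds_to(2.0))  # 1.0, i.e. "one to one"

# Hypothetical odds quoted on three mutually exclusive, exhaustive events.
quoted = [1.8, 3.5, 4.0]
implied = implied_probabilities(quoted)
print(sum(implied))  # exceeds one, so these odds are non-probabilistic
```

The excess of the implied probabilities over one is the margin that separates the odds offered on an event from the odds offered against it.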
Bayesian ionospheric multi-instrument 3D tomography

NASA Astrophysics Data System (ADS)

Norberg, Johannes; Vierinen, Juha; Roininen, Lassi

2017-04-01

The tomographic reconstruction of ionospheric electron densities is an inverse problem that cannot be solved without relatively strong regularising additional information. Especially the vertical electron density profile is determined predominantly by the regularisation. Often utilised regularisations in ionospheric tomography include smoothness constraints and iterative methods with initial ionospheric models. Despite its crucial role, the regularisation is often hidden in the algorithm as a numerical procedure without physical understanding. The Bayesian methodology provides an interpretative approach for the problem, as the regularisation can be given in a physically meaningful and quantifiable prior probability distribution. The prior distribution can be based on ionospheric physics, other available ionospheric measurements and their statistics. Updating the prior with measurements results in the posterior distribution that carries all the available information combined. From the posterior distribution, the most probable state of the ionosphere can then be solved with the corresponding probability intervals. Altogether, the Bayesian methodology provides understanding on how strong the given regularisation is, what information is gained with the measurements and how reliable the final result is. In addition, the combination of different measurements and temporal development can be taken into account in a very intuitive way. However, a direct implementation of the Bayesian approach requires inversion of large covariance matrices resulting in computational infeasibility. In the presented method, Gaussian Markov random fields are used to form sparse matrix approximations for the covariances. The approach makes the problem computationally feasible while retaining the probabilistic and physical interpretation. Here, the Bayesian method with Gaussian Markov random fields is applied for ionospheric 3D tomography over Northern Europe.
Multi-instrument measurements are utilised from TomoScand receiver network for Low Earth orbit beacon satellite signals, GNSS receiver networks, as well as from EISCAT ionosondes and incoherent scatter radars. The performance is demonstrated in three-dimensional spatial domain with temporal development also taken into account.
Decision theory for computing variable and value ordering decisions for scheduling problems

NASA Technical Reports Server (NTRS)

Linden, Theodore A.

1993-01-01

Heuristics that guide search are critical when solving large planning and scheduling problems, but most variable and value ordering heuristics are sensitive to only one feature of the search state. One wants to combine evidence from all features of the search state into a subjective probability that a value choice is best, but there has been no solid semantics for merging evidence when it is conceived in these terms. Instead, variable and value ordering decisions should be viewed as problems in decision theory. This led to two key insights: (1) The fundamental concept that allows heuristic evidence to be merged is the net incremental utility that will be achieved by assigning a value to a variable. Probability distributions about net incremental utility can merge evidence from the utility function, binary constraints, resource constraints, and other problem features. The subjective probability that a value is the best choice is then derived from probability distributions about net incremental utility. (2) The methods used for rumor control in Bayesian Networks are the primary way to prevent cycling in the computation of probable net incremental utility. These insights lead to semantically justifiable ways to compute heuristic variable and value ordering decisions that merge evidence from all available features of the search state.
Itô-SDE MCMC method for Bayesian characterization of errors associated with data limitations in stochastic expansion methods for uncertainty quantification

NASA Astrophysics Data System (ADS)

Arnst, M.; Abello Álvarez, B.; Ponthot, J.-P.; Boman, R.

2017-11-01

This paper is concerned with the characterization and the propagation of errors associated with data limitations in polynomial-chaos-based stochastic methods for uncertainty quantification. Such an issue can arise in uncertainty quantification when only a limited amount of data is available. When the available information does not suffice to accurately determine the probability distributions that must be assigned to the uncertain variables, the Bayesian method for assigning these probability distributions becomes attractive because it allows the stochastic model to account explicitly for insufficiency of the available information. In previous work, such applications of the Bayesian method had already been implemented by using the Metropolis-Hastings and Gibbs Markov Chain Monte Carlo (MCMC) methods. In this paper, we present an alternative implementation, which uses an alternative MCMC method built around an Itô stochastic differential equation (SDE) that is ergodic for the Bayesian posterior. We draw together from the mathematics literature a number of formal properties of this Itô SDE that lend support to its use in the implementation of the Bayesian method, and we describe its discretization, including the choice of the free parameters, by using the implicit Euler method. We demonstrate the proposed methodology on a problem of uncertainty quantification in a complex nonlinear engineering application relevant to metal forming.
[Nonparametric method of estimating survival functions containing right-censored and interval-censored data].

PubMed

Xu, Yonghong; Gao, Xiaohuan; Wang, Zhengxi

2014-04-01

Missing data represent a general problem in many scientific fields, especially in medical survival analysis. Interpolation is one of the important methods for dealing with censored data. However, most interpolation methods replace the censored data with exact data, which distorts the real distribution of the censored data and reduces the probability of the real data falling into the interpolated data. To solve this problem, we propose a nonparametric method of estimating the survival function of right-censored and interval-censored data and compare its performance to the SC (self-consistent) algorithm. In contrast to average interpolation and nearest-neighbor interpolation, the proposed method replaces the right-censored data with interval-censored data, and greatly improves the probability of the real data falling into the imputation interval. It then uses empirical distribution theory to estimate the survival function of right-censored and interval-censored data. The results of numerical examples and a real breast cancer data set demonstrate that the proposed method has higher accuracy and better robustness for different proportions of censored data. This paper provides a good method for comparing the performance of clinical treatments by estimating patient survival data, and it provides some help for medical survival data analysis.
Contact tracing and antiviral prophylaxis in the early stages of a pandemic: the probability of a major outbreak.

PubMed

Ross, Joshua V; Black, Andrew J

2015-09-01

Antiviral prophylaxis forms a significant component of health management plans for many countries around the world. A number of studies have shown that the delays typically encountered in distributing these antivirals to households, following the first infectious case, can result in their efficacy being severely reduced. Here, we investigate the use of contact tracing as a method to reduce the delays and hence mitigate the reduction in efficacy of antivirals. We assess the usefulness of contact tracing in terms of the probability of a major outbreak. It is found, with parameter distributions appropriate to the 2009 H1N1 pandemic and distributions reflecting commonly experienced delays, that standard contact tracing renders an outbreak impossible approximately one in five times compared with approximately one in ten times in its absence. A contact-tracing efficiency of 50% would see further improvements with an outbreak being impossible approximately one in four times, and a reduction of the median probability of a major outbreak from 0.41 to below 0.27. © The authors 2014. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved.
Spatial distribution of traffic in a cellular mobile data network

NASA Astrophysics Data System (ADS)

Linnartz, J. P. M. G.

1987-02-01

The use of integral transforms of the probability density function for the received power to analyze the relation between the spatial distributions of offered and throughput packet traffic in a mobile radio network with Rayleigh fading channels and ALOHA multiple access was assessed. A method to obtain the spatial distribution of throughput traffic from a prescribed spatial distribution of offered traffic is presented. Incoherent and coherent addition of interference signals is considered. The channel behavior for heavy traffic loads is studied. In both the incoherent and coherent case, the spatial distribution of offered traffic required to ensure a prescribed spatially uniform throughput is synthesized numerically.
Integrated-Circuit Pseudorandom-Number Generator

NASA Technical Reports Server (NTRS)

Steelman, James E.; Beasley, Jeff; Aragon, Michael; Ramirez, Francisco; Summers, Kenneth L.; Knoebel, Arthur

1992-01-01

Integrated circuit produces 8-bit pseudorandom numbers from specified probability distribution, at rate of 10 MHz. Using Boolean logic, circuit implements pseudorandom-number-generating algorithm. Circuit includes eight 12-bit pseudorandom-number generators, outputs of which are uniformly distributed. 8-bit pseudorandom numbers satisfying specified nonuniform probability distribution are generated by processing uniformly distributed outputs of eight 12-bit pseudorandom-number generators through "pipeline" of D flip-flops, comparators, and memories implementing conditional probabilities on zeros and ones.
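One plausible software reading of the "conditional probabilities on zeros and ones" scheme is sketched below: each output bit is decided by comparing a uniform 12-bit number with a stored threshold that encodes the probability of a one given the bits already chosen. The table layout, most-significant-bit-first ordering, and function names are assumptions; the brief does not describe the circuit at this level of detail.

```python
import random

BITS = 8
RESOLUTION = 1 << 12  # uniform inputs are 12-bit numbers, 0..4095


def build_threshold_table(pmf):
    """Map (bits decided so far, prefix) to a threshold for P(next bit = 1 | prefix)."""
    table = {}
    for length in range(BITS):
        shift = BITS - length
        half = 1 << (shift - 1)
        for prefix in range(1 << length):
            lo = prefix << shift
            total = sum(pmf[lo:lo + (1 << shift)])
            ones = sum(pmf[lo + half:lo + (1 << shift)])
            p_one = ones / total if total > 0 else 0.0
            table[(length, prefix)] = round(p_one * RESOLUTION)
    return table


def draw(table, rng):
    """Generate one 8-bit value, most significant bit first."""
    prefix = 0
    for length in range(BITS):
        uniform = rng.randrange(RESOLUTION)  # stands in for one 12-bit generator
        bit = 1 if uniform < table[(length, prefix)] else 0  # comparator stage
        prefix = (prefix << 1) | bit
    return prefix


# Example target: a linearly increasing distribution over 0..255.
total_weight = sum(range(256))
ramp = [v / total_weight for v in range(256)]
ramp_table = build_threshold_table(ramp)
rng = random.Random(3)
values = [draw(ramp_table, rng) for _ in range(1000)]
print(sum(values) / len(values))
```

Rounding each conditional probability to 12 bits limits how finely the target distribution can be matched, which is presumably why the uniform generators are wider than the 8-bit output.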
A simulator for evaluating methods for the detection of lesion-deficit associations

NASA Technical Reports Server (NTRS)

Megalooikonomou, V.; Davatzikos, C.; Herskovits, E. H.

2000-01-01

Although much has been learned about the functional organization of the human brain through lesion-deficit analysis, the variety of statistical and image-processing methods developed for this purpose precludes a closed-form analysis of the statistical power of these systems. Therefore, we developed a lesion-deficit simulator (LDS), which generates artificial subjects, each of which consists of a set of functional deficits, and a brain image with lesions; the deficits and lesions conform to predefined distributions. We used probability distributions to model the number, sizes, and spatial distribution of lesions, to model the structure-function associations, and to model registration error. We used the LDS to evaluate, as examples, the effects of the complexities and strengths of lesion-deficit associations, and of registration error, on the power of lesion-deficit analysis. We measured the numbers of recovered associations from these simulated data, as a function of the number of subjects analyzed, the strengths and number of associations in the statistical model, the number of structures associated with a particular function, and the prior probabilities of structures being abnormal. The number of subjects required to recover the simulated lesion-deficit associations was found to have an inverse relationship to the strength of associations, and to the smallest probability in the structure-function model. The number of structures associated with a particular function (i.e., the complexity of associations) had a much greater effect on the performance of the analysis method than did the total number of associations. We also found that registration error of 5 mm or less reduces the number of associations discovered by approximately 13% compared to perfect registration. The LDS provides a flexible framework for evaluating many aspects of lesion-deficit analysis.
Implementation of the direct S(α,β) method in the KENO Monte Carlo code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hart, Shane W. D.; Maldonado, G. Ivan

The Monte Carlo code KENO contains thermal scattering data for a wide variety of thermal moderators. These data are processed from Evaluated Nuclear Data Files (ENDF) by AMPX and stored as double differential probability distribution functions. The method examined in this study uses S(α,β) probability distribution functions derived from the ENDF data files directly instead of being converted to double differential cross sections. This allows the size of the cross section data on the disk to be reduced by a substantial amount. KENO has also been updated to allow interpolation in temperature on these data so that problems can be run at any temperature. Results are shown for several simplified problems for a variety of moderators. In addition, benchmark models based on the KRITZ reactor in Sweden were run, and the results are compared with the previous versions of KENO without the direct S(α,β) method. Results from the direct S(α,β) method compare favorably with the original results obtained using the double differential cross sections. Finally, sampling the data increases the run-time of the Monte Carlo calculation, but memory usage is decreased substantially.
Implementation of the direct S(α,β) method in the KENO Monte Carlo code

DOE PAGES

Hart, Shane W. D.; Maldonado, G. Ivan

2016-11-25

The Monte Carlo code KENO contains thermal scattering data for a wide variety of thermal moderators. These data are processed from Evaluated Nuclear Data Files (ENDF) by AMPX and stored as double differential probability distribution functions. The method examined in this study uses S(α,β) probability distribution functions derived from the ENDF data files directly instead of being converted to double differential cross sections. This allows the size of the cross section data on the disk to be reduced by a substantial amount. KENO has also been updated to allow interpolation in temperature on these data so that problems can be run at any temperature. Results are shown for several simplified problems for a variety of moderators. In addition, benchmark models based on the KRITZ reactor in Sweden were run, and the results are compared with the previous versions of KENO without the direct S(α,β) method. Results from the direct S(α,β) method compare favorably with the original results obtained using the double differential cross sections. Finally, sampling the data increases the run-time of the Monte Carlo calculation, but memory usage is decreased substantially.
A novel method for energy harvesting simulation based on scenario generation

NASA Astrophysics Data System (ADS)

Wang, Zhe; Li, Taoshen; Xiao, Nan; Ye, Jin; Wu, Min

2018-06-01

Energy harvesting network (EHN) is a new form of computer network. It converts ambient energy into usable electric energy and supplies it as a primary or secondary power source to the communication devices. However, most EHN studies use an analytical probability distribution function to describe the energy harvesting process, which cannot accurately reflect the actual situation. We propose an EHN simulation method based on scenario generation in this paper. First, instead of setting a probability distribution in advance, it uses optimal scenario reduction technology to generate representative scenarios in a single period based on the historical data of the harvested energy. Second, it uses a homogeneous simulated annealing algorithm to generate optimal daily energy harvesting scenario sequences to get a more accurate simulation of the random characteristics of the energy harvesting network. Then, taking actual wind power data as an example, the accuracy and stability of the method are verified by comparison with the real data. Finally, we present an example of optimizing the network throughput, which indicates the feasibility and effectiveness of the proposed method in energy harvesting simulation.

Disentangling rotational velocity distribution of stars

NASA Astrophysics Data System (ADS)

Curé, Michel; Rial, Diego F.; Cassetti, Julia; Christen, Alejandra

2017-11-01

Rotational speed is an important physical parameter of stars: knowing the distribution of stellar rotational velocities is essential for understanding stellar evolution. However, rotational speed cannot be measured directly; the observable is instead the projected velocity v sin(i), the product of the rotational speed and the sine of the inclination angle, whose distribution is a convolution of the distribution of rotational speeds with that of the inclination angles. The problem itself can be described via a Fredholm integral of the first kind. A new method (Curé et al. 2014) to deconvolve this inverse problem and obtain the cumulative distribution function for stellar rotational velocities is based on the work of Chandrasekhar & Münch (1950). Another method to obtain the probability distribution function is the Tikhonov regularization method (Christen et al. 2016). The proposed methods can also be applied to the mass ratio distribution of extrasolar planets and brown dwarfs (in binary systems, Curé et al. 2015). For stars in a cluster, where all members are gravitationally bound, the standard assumption that rotational axes are uniformly distributed over the sphere is questionable. On the basis of the proposed techniques, a simple approach to model this anisotropy of rotational axes has been developed, with the possibility of simultaneously disentangling both the rotational speed distribution and the orientation of rotational axes.
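The forward problem that these deconvolution methods invert can be simulated directly. The sketch below projects true rotational speeds under the standard assumption of axes distributed uniformly over the sphere, in which case cos(i) is uniform on [0, 1] and the mean of sin(i) is π/4; the constant true speed and the function name are assumptions made to keep the check simple.

```python
import math
import random


def sample_projected_velocities(true_velocities, rng):
    """Project each rotational speed with an isotropically oriented axis."""
    projected = []
    for v in true_velocities:
        cos_i = rng.random()  # isotropic axes: cos(i) uniform on [0, 1)
        projected.append(v * math.sqrt(1.0 - cos_i * cos_i))
    return projected


rng = random.Random(0)
true_v = [100.0] * 20000  # hypothetical stars, all rotating at 100 km/s
v_sin_i = sample_projected_velocities(true_v, rng)
mean_ratio = sum(v_sin_i) / sum(true_v)
print(mean_ratio, math.pi / 4.0)
```

Replacing the uniform draw of cos(i) with a preferred-direction distribution is one way to mimic the anisotropic cluster case that the abstract raises.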
Bayesian probability of success for clinical trials using historical data

PubMed Central

Ibrahim, Joseph G.; Chen, Ming-Hui; Lakshminarayanan, Mani; Liu, Guanghan F.; Heyse, Joseph F.

2015-01-01

Developing sophisticated statistical methods for go/no-go decisions is crucial for clinical trials, as planning phase III or phase IV trials is costly and time consuming. In this paper, we develop a novel Bayesian methodology for determining the probability of success of a treatment regimen on the basis of the current data of a given trial. We introduce a new criterion for calculating the probability of success that allows for inclusion of covariates as well as allowing for historical data based on the treatment regimen, and patient characteristics. A new class of prior distributions and covariate distributions is developed to achieve this goal. The methodology is quite general and can be used with univariate or multivariate continuous or discrete data, and it generalizes Chuang-Stein’s work. This methodology will be invaluable for informing the scientist on the likelihood of success of the compound, while including the information of covariates for patient characteristics in the trial population for planning future pre-market or post-market trials. PMID:25339499
Modeling Women's Menstrual Cycles using PICI Gates in Bayesian Network.

PubMed

Zagorecki, Adam; Łupińska-Dubicka, Anna; Voortman, Mark; Druzdzel, Marek J

2016-03-01

A major difficulty in building Bayesian network (BN) models is the size of conditional probability tables, which grow exponentially in the number of parents. One way of dealing with this problem is through parametric conditional probability distributions that usually require only a number of parameters that is linear in the number of parents. In this paper, we introduce a new class of parametric models, the Probabilistic Independence of Causal Influences (PICI) models, that aim at lowering the number of parameters required to specify local probability distributions, but are still capable of efficiently modeling a variety of interactions. A subset of PICI models is decomposable and this leads to significantly faster inference as compared to models that cannot be decomposed. We present an application of the proposed method to learning dynamic BNs for modeling a woman's menstrual cycle. We show that PICI models are especially useful for parameter learning from small data sets and lead to higher parameter accuracy than when learning CPTs.
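
The parameter-count argument above can be sketched with the noisy-OR gate, a well-known independence-of-causal-influence model. Noisy-OR is used here as a stand-in, since the abstract does not define the PICI models themselves; the link probabilities are hypothetical.

```python
from itertools import product

def cpt_parameter_count(n_parents):
    """Rows of a full CPT for a binary child with n binary parents."""
    return 2 ** n_parents

def noisy_or_probability(parent_states, link_probs, leak=0.0):
    """P(child = 1 | parents) under a noisy-OR gate.

    Each active parent i independently fails to trigger the child with
    probability 1 - link_probs[i]; `leak` accounts for unmodeled causes.
    """
    p_all_fail = 1.0 - leak
    for active, p in zip(parent_states, link_probs):
        if active:
            p_all_fail *= 1.0 - p
    return 1.0 - p_all_fail

link_probs = [0.8, 0.5, 0.3]  # hypothetical link strengths, one per parent
# The full table is generated from only len(link_probs) parameters.
full_table = {
    states: noisy_or_probability(states, link_probs)
    for states in product([0, 1], repeat=len(link_probs))
}
print(cpt_parameter_count(3), "CPT rows from", len(link_probs), "parameters")
```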
Bayesian probability of success for clinical trials using historical data.

PubMed

Ibrahim, Joseph G; Chen, Ming-Hui; Lakshminarayanan, Mani; Liu, Guanghan F; Heyse, Joseph F

2015-01-30

Developing sophisticated statistical methods for go/no-go decisions is crucial for clinical trials, as planning phase III or phase IV trials is costly and time consuming. In this paper, we develop a novel Bayesian methodology for determining the probability of success of a treatment regimen on the basis of the current data of a given trial. We introduce a new criterion for calculating the probability of success that allows for inclusion of covariates as well as allowing for historical data based on the treatment regimen, and patient characteristics. A new class of prior distributions and covariate distributions is developed to achieve this goal. The methodology is quite general and can be used with univariate or multivariate continuous or discrete data, and it generalizes Chuang-Stein's work. This methodology will be invaluable for informing the scientist on the likelihood of success of the compound, while including the information of covariates for patient characteristics in the trial population for planning future pre-market or post-market trials. Copyright © 2014 John Wiley & Sons, Ltd.
Prediction of fatty acid-binding residues on protein surfaces with three-dimensional probability distributions of interacting atoms.

PubMed

Mahalingam, Rajasekaran; Peng, Hung-Pin; Yang, An-Suei

2014-08-01

Protein-fatty acid interaction is vital for many cellular processes, and understanding this interaction is important for functional annotation as well as drug discovery. In this work, we present a method for predicting the fatty acid (FA)-binding residues by using three-dimensional probability density distributions of interacting atoms of FAs on protein surfaces, which are derived from the known protein-FA complex structures. A machine learning algorithm was established to learn the characteristic patterns of the probability density maps specific to the FA-binding sites. The predictor was trained with five-fold cross validation on a non-redundant training set and then evaluated with an independent test set as well as on a dataset of holo-apo pairs. The results showed good accuracy in predicting the FA-binding residues. Further, the predictor developed in this study is implemented as an online server which is freely accessible at the following website, http://ismblab.genomics.sinica.edu.tw/. Copyright © 2014 Elsevier B.V. All rights reserved.
Bivariate normal, conditional and rectangular probabilities: A computer program with applications

NASA Technical Reports Server (NTRS)

Swaroop, R.; Brownlow, J. D.; Ashwworth, G. R.; Winter, W. R.

1980-01-01

Some results for the bivariate normal distribution analysis are presented. Computer programs for conditional normal probabilities, marginal probabilities, as well as joint probabilities for rectangular regions are given: routines for computing fractile points and distribution functions are also presented. Some examples from a closed circuit television experiment are included.
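
As a small illustration of the conditional normal probabilities mentioned above, the sketch below uses the standard result that X given Y = y in a bivariate normal is itself normal. It is not the report's program; the function name and the example parameters are assumptions.

```python
import math
from statistics import NormalDist

def conditional_normal(mu_x, mu_y, sigma_x, sigma_y, rho, y):
    """Distribution of X given Y = y for a bivariate normal (X, Y)."""
    mean = mu_x + rho * sigma_x / sigma_y * (y - mu_y)
    std = sigma_x * math.sqrt(1.0 - rho * rho)
    return NormalDist(mean, std)

# Hypothetical standard bivariate normal with correlation 0.6.
dist = conditional_normal(0.0, 0.0, 1.0, 1.0, 0.6, y=1.0)
# Conditional probability that X falls in the interval (0, 1) given Y = 1.
p_interval = dist.cdf(1.0) - dist.cdf(0.0)
print(dist.mean, dist.stdev, p_interval)
```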
Assessment of source probabilities for potential tsunamis affecting the U.S. Atlantic coast

USGS Publications Warehouse

Geist, E.L.; Parsons, T.

2009-01-01

Estimating the likelihood of tsunamis occurring along the U.S. Atlantic coast critically depends on knowledge of tsunami source probability. We review available information on both earthquake and landslide probabilities from potential sources that could generate local and transoceanic tsunamis. Estimating source probability includes defining both size and recurrence distributions for earthquakes and landslides. For the former distribution, source sizes are often distributed according to a truncated or tapered power-law relationship. For the latter distribution, sources are often assumed to occur in time according to a Poisson process, simplifying the way tsunami probabilities from individual sources can be aggregated. For the U.S. Atlantic coast, earthquake tsunami sources primarily occur at transoceanic distances along plate boundary faults. Probabilities for these sources are constrained from previous statistical studies of global seismicity for similar plate boundary types. In contrast, there is presently little information constraining landslide probabilities that may generate local tsunamis. Though there is significant uncertainty in tsunami source probabilities for the Atlantic, results from this study yield a comparative analysis of tsunami source recurrence rates that can form the basis for future probabilistic analyses.
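
The remark that a Poisson assumption simplifies aggregation can be made concrete: independent Poisson sources superpose into a Poisson process whose rate is the sum of the individual rates. The sketch below uses hypothetical annual rates, not values from the study.

```python
import math

def prob_at_least_one(rates_per_year, exposure_years):
    """P(at least one event) from independent Poisson sources."""
    total_rate = sum(rates_per_year)
    return 1.0 - math.exp(-total_rate * exposure_years)

# Hypothetical recurrence rates (events per year) for two sources.
source_rates = [0.001, 0.002]
p_combined = prob_at_least_one(source_rates, exposure_years=50)

# Equivalent form: one minus the product of per-source "no event" probabilities.
p_product = 1.0 - math.prod(
    1.0 - prob_at_least_one([r], 50) for r in source_rates
)
print(p_combined, p_product)
```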
DOE Office of Scientific and Technical Information (OSTI.GOV)

Friar, James Lewis; Goldman, Terrance J.; Pérez-Mercader, J.

In this paper, we apply the Law of Total Probability to the construction of scale-invariant probability distribution functions (pdf's), and require that probability measures be dimensionless and unitless under a continuous change of scales. If the scale-change distribution function is scale invariant then the constructed distribution will also be scale invariant. Repeated application of this construction on an arbitrary set of (normalizable) pdf's results again in scale-invariant distributions. The invariant function of this procedure is given uniquely by the reciprocal distribution, suggesting a kind of universality. Finally, we separately demonstrate that the reciprocal distribution results uniquely from requiring maximum entropy for size-class distributions with uniform bin sizes.
Positive phase space distributions and uncertainty relations

NASA Technical Reports Server (NTRS)

Kruger, Jan

1993-01-01

In contrast to a widespread belief, Wigner's theorem allows the construction of true joint probabilities in phase space for distributions describing the object system as well as for distributions depending on the measurement apparatus. The fundamental role of Heisenberg's uncertainty relations in Schroedinger form (including correlations) is pointed out for these two possible interpretations of joint probability distributions. Hence, in order that a multivariate normal probability distribution in phase space may correspond to a Wigner distribution of a pure or a mixed state, it is necessary and sufficient that Heisenberg's uncertainty relation in Schroedinger form should be satisfied.
Appropriateness of the probability approach with a nutrient status biomarker to assess population inadequacy: a study using vitamin D

PubMed Central

Carriquiry, Alicia L; Bailey, Regan L; Sempos, Christopher T; Yetley, Elizabeth A

2013-01-01

Background: There are questions about the appropriate method for the accurate estimation of the population prevalence of nutrient inadequacy on the basis of a biomarker of nutrient status (BNS). Objective: We determined the applicability of a statistical probability method to a BNS, specifically serum 25-hydroxyvitamin D [25(OH)D]. The ability to meet required statistical assumptions was the central focus. Design: Data on serum 25(OH)D concentrations in adults aged 19–70 y from the 2005–2006 NHANES were used (n = 3871). An Institute of Medicine report provided reference values. We analyzed key assumptions of symmetry, differences in variance, and the independence of distributions. We also corrected observed distributions for within-person variability (WPV). Estimates of vitamin D inadequacy were determined. Results: We showed that the BNS [serum 25(OH)D] met the criteria to use the method for the estimation of the prevalence of inadequacy. The difference between observations corrected compared with uncorrected for WPV was small for serum 25(OH)D but, nonetheless, showed enhanced accuracy because of correction. The method estimated a 19% prevalence of inadequacy in this sample, whereas misclassification inherent in the use of the more traditional 97.5th percentile high-end cutoff inflated the prevalence of inadequacy (36%). Conclusions: When the prevalence of nutrient inadequacy for a population is estimated by using serum 25(OH)D as an example of a BNS, a statistical probability method is appropriate and more accurate in comparison with a high-end cutoff. Contrary to a common misunderstanding, the method does not overlook segments of the population. The accuracy of population estimates of inadequacy is enhanced by the correction of observed measures for WPV. PMID:23097269
Modeling of magnitude distributions by the generalized truncated exponential distribution

NASA Astrophysics Data System (ADS)

Raschke, Mathias

2015-01-01

The probability distribution of the magnitude can be modeled by an exponential distribution according to the Gutenberg-Richter relation. Two alternatives are the truncated exponential distribution (TED) and the cutoff exponential distribution (CED). The TED is frequently used in seismic hazard analysis although it has a weak point: when two TEDs with equal parameters except the upper bound magnitude are mixed, then the resulting distribution is not a TED. Inversely, it is also not possible to split a TED of a seismic region into TEDs of subregions with equal parameters except the upper bound magnitude. This weakness is a principal problem as seismic regions are constructed scientific objects and not natural units. We overcome it by the generalization of the abovementioned exponential distributions: the generalized truncated exponential distribution (GTED). Therein, identical exponential distributions are mixed by the probability distribution of the correct cutoff points. This distribution model is flexible in the vicinity of the upper bound magnitude and is equal to the exponential distribution for smaller magnitudes. Additionally, the exponential distributions TED and CED are special cases of the GTED. We discuss the possible ways of estimating its parameters and introduce the normalized spacing for this purpose. Furthermore, we present methods for geographic aggregation and differentiation of the GTED and demonstrate the potential and universality of our simple approach by applying it to empirical data. The considerable improvement by the GTED in contrast to the TED is indicated by a large difference between the corresponding values of the Akaike information criterion.
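
The weak point of the TED described above can be checked numerically. The sketch below is an illustration under assumed parameter values (beta, magnitude bounds, equal mixing weights), not the paper's GTED: it mixes two TEDs that differ only in upper bound and shows that the mixture density drops abruptly at the smaller bound, which a single TED never does inside its support.

```python
import math

def ted_pdf(m, beta, m_min, m_max):
    """Density of an exponential distribution truncated to [m_min, m_max]."""
    if m < m_min or m > m_max:
        return 0.0
    norm = 1.0 - math.exp(-beta * (m_max - m_min))
    return beta * math.exp(-beta * (m - m_min)) / norm

def mixture_pdf(m, beta, m_min, m_max_a, m_max_b, weight_a=0.5):
    """Mixture of two TEDs that differ only in their upper bound."""
    return (weight_a * ted_pdf(m, beta, m_min, m_max_a)
            + (1.0 - weight_a) * ted_pdf(m, beta, m_min, m_max_b))

beta, m_min = 2.3, 4.0          # hypothetical values
m_max_a, m_max_b = 7.0, 8.0     # hypothetical upper bound magnitudes
step = 0.1
ted_ratio = math.exp(-beta * step)  # density ratio over `step` for any single TED

ratio_inside = (mixture_pdf(5.1, beta, m_min, m_max_a, m_max_b)
                / mixture_pdf(5.0, beta, m_min, m_max_a, m_max_b))
ratio_across = (mixture_pdf(7.05, beta, m_min, m_max_a, m_max_b)
                / mixture_pdf(6.95, beta, m_min, m_max_a, m_max_b))
print(ted_ratio, ratio_inside, ratio_across)
```

Below the smaller bound the mixture still decays like a TED, but across that bound the ratio is roughly half the TED value, so no single TED can reproduce the mixture.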
Ubiquity of Benford's law and emergence of the reciprocal distribution

DOE PAGES

Friar, James Lewis; Goldman, Terrance J.; Pérez-Mercader, J.

2016-04-07

In this paper, we apply the Law of Total Probability to the construction of scale-invariant probability distribution functions (pdf's), and require that probability measures be dimensionless and unitless under a continuous change of scales. If the scale-change distribution function is scale invariant then the constructed distribution will also be scale invariant. Repeated application of this construction on an arbitrary set of (normalizable) pdf's results again in scale-invariant distributions. The invariant function of this procedure is given uniquely by the reciprocal distribution, suggesting a kind of universality. Finally, we separately demonstrate that the reciprocal distribution results uniquely from requiring maximum entropy for size-class distributions with uniform bin sizes.
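
The link between the reciprocal distribution and Benford's law named in the title can be checked with a short sketch. It relies on the standard fact that a density proportional to 1/x on [1, 10) gives first-digit probabilities log10(1 + 1/d); the sample size and seed are arbitrary choices, and the code does not reproduce the paper's construction.

```python
import math
import random

def benford_probability(d):
    """Benford's law probability that the leading digit equals d."""
    return math.log10(1.0 + 1.0 / d)

def sample_reciprocal(rng, lo=1.0, hi=10.0):
    """Inverse-transform sample from p(x) proportional to 1/x on [lo, hi)."""
    u = rng.random()
    return lo * (hi / lo) ** u

rng = random.Random(1)
n = 100_000
counts = [0] * 10
for _ in range(n):
    counts[int(sample_reciprocal(rng))] += 1  # integer part is the first digit

empirical = [counts[d] / n for d in range(1, 10)]
for d in range(1, 10):
    print(d, round(empirical[d - 1], 4), round(benford_probability(d), 4))
```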
Electron density and electron temperature measurement in a bi-Maxwellian electron distribution using a derivative method of Langmuir probes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Choi, Ikjin; Chung, ChinWook; Youn Moon, Se

2013-08-15

In plasma diagnostics with a single Langmuir probe, the electron temperature T_e is usually obtained from the slope of the logarithm of the electron current or from the electron energy probability functions of the current (I)-voltage (V) curve. Recently, Chen [F. F. Chen, Phys. Plasmas 8, 3029 (2001)] suggested a derivative analysis method to obtain T_e by the ratio between the probe current and the derivative of the probe current at a plasma potential where the ion current becomes zero. Based on this method, electron temperatures and electron densities were measured and compared with those from the electron energy distribution function (EEDF) measurement in Maxwellian and bi-Maxwellian electron distribution conditions. In a bi-Maxwellian electron distribution, we found the electron temperature T_e obtained from the method is always lower than the effective temperatures T_eff derived from EEDFs. The theoretical analysis for this is presented.
Direct calculation of liquid-vapor phase equilibria from transition matrix Monte Carlo simulation

NASA Astrophysics Data System (ADS)

Errington, Jeffrey R.

2003-06-01

An approach for directly determining the liquid-vapor phase equilibrium of a model system at any temperature along the coexistence line is described. The method relies on transition matrix Monte Carlo ideas developed by Fitzgerald, Picard, and Silver [Europhys. Lett. 46, 282 (1999)]. During a Monte Carlo simulation attempted transitions between states along the Markov chain are monitored as opposed to tracking the number of times the chain visits a given state as is done in conventional simulations. Data collection is highly efficient and very precise results are obtained. The method is implemented in both the grand canonical and isothermal-isobaric ensemble. The main result from a simulation conducted at a given temperature is a density probability distribution for a range of densities that includes both liquid and vapor states. Vapor pressures and coexisting densities are calculated in a straightforward manner from the probability distribution. The approach is demonstrated with the Lennard-Jones fluid. Coexistence properties are directly calculated at temperatures spanning from the triple point to the critical point.
A Bayesian approach to microwave precipitation profile retrieval

NASA Technical Reports Server (NTRS)

Evans, K. Franklin; Turk, Joseph; Wong, Takmeng; Stephens, Graeme L.

1995-01-01

A multichannel passive microwave precipitation retrieval algorithm is developed. Bayes theorem is used to combine statistical information from numerical cloud models with forward radiative transfer modeling. A multivariate lognormal prior probability distribution contains the covariance information about hydrometeor distribution that resolves the nonuniqueness inherent in the inversion process. Hydrometeor profiles are retrieved by maximizing the posterior probability density for each vector of observations. The hydrometeor profile retrieval method is tested with data from the Advanced Microwave Precipitation Radiometer (10, 19, 37, and 85 GHz) of convection over ocean and land in Florida. The CP-2 multiparameter radar data are used to verify the retrieved profiles. The results show that the method can retrieve approximate hydrometeor profiles, with larger errors over land than water. There is considerably greater accuracy in the retrieval of integrated hydrometeor contents than of profiles. Many of the retrieval errors are traced to problems with the cloud model microphysical information, and future improvements to the algorithm are suggested.
The Approximate Bayesian Computation methods in the localization of the atmospheric contamination source

NASA Astrophysics Data System (ADS)

Kopka, P.; Wawrzynczak, A.; Borysiewicz, M.

2015-09-01

In many areas of application, a central problem is the solution of the inverse problem, especially estimation of the unknown model parameters to model the underlying dynamics of a physical system precisely. In this situation, Bayesian inference is a powerful tool to combine observed data with prior knowledge to obtain the probability distribution of the searched parameters. We have applied the modern methodology named Sequential Approximate Bayesian Computation (S-ABC) to the problem of tracing the atmospheric contaminant source. ABC is a technique commonly used in the Bayesian analysis of complex models and dynamic systems. Sequential methods can significantly increase the efficiency of ABC. In the presented algorithm, the input data are the on-line arriving concentrations of the released substance registered by a distributed sensor network from the OVER-LAND ATMOSPHERIC DISPERSION (OLAD) experiment. The algorithm outputs are the probability distributions of the contamination source parameters, i.e. its particular location, release rate, speed and direction of movement, start time and duration. The stochastic approach presented in this paper is completely general and can be used in other fields where the parameters of the model best fitted to the observable data should be found.
Earth Observing System Covariance Realism

NASA Technical Reports Server (NTRS)

Zaidi, Waqar H.; Hejduk, Matthew D.

2016-01-01

The purpose of covariance realism is to properly size a primary object's covariance in order to add validity to the calculation of the probability of collision. The covariance realism technique in this paper consists of three parts: collection/calculation of definitive state estimates through orbit determination, calculation of covariance realism test statistics at each covariance propagation point, and proper assessment of those test statistics. An empirical cumulative distribution function (ECDF) Goodness-of-Fit (GOF) method is employed to determine if a covariance is properly sized by comparing the empirical distribution of Mahalanobis distance calculations to the hypothesized parent 3-DoF chi-squared distribution. To realistically size a covariance for collision probability calculations, this study uses a state noise compensation algorithm that adds process noise to the definitive epoch covariance to account for uncertainty in the force model. Process noise is added until the GOF tests pass a group significance level threshold. The results of this study indicate that when outliers attributed to persistently high or extreme levels of solar activity are removed, the aforementioned covariance realism compensation method produces a tuned covariance with up to 80 to 90% of the covariance propagation timespan passing the GOF tests (against a 60% minimum passing threshold), a quite satisfactory and useful result.
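
The ECDF goodness-of-fit idea above can be sketched as follows. The example makes several assumptions that are not in the abstract: a diagonal covariance, simulated Gaussian position errors with hypothetical sigmas, and a Kolmogorov-Smirnov statistic as the ECDF test. It compares squared Mahalanobis distances against the closed-form 3-DoF chi-squared CDF.

```python
import math
import random

def chi2_3dof_cdf(x):
    """Closed-form CDF of the chi-squared distribution with 3 degrees of freedom."""
    if x <= 0.0:
        return 0.0
    return (math.erf(math.sqrt(x / 2.0))
            - math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

def mahalanobis_sq_diag(error, variances):
    """Squared Mahalanobis distance for a diagonal covariance (simplification)."""
    return sum(e * e / v for e, v in zip(error, variances))

def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF of samples and a reference CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

rng = random.Random(2)
true_sigmas = [10.0, 20.0, 5.0]  # hypothetical true error sigmas per axis

def simulate(claimed_scale, n=2000):
    """Distances when the claimed sigmas are claimed_scale times the true ones."""
    variances = [(claimed_scale * s) ** 2 for s in true_sigmas]
    return [
        mahalanobis_sq_diag([rng.gauss(0.0, s) for s in true_sigmas], variances)
        for _ in range(n)
    ]

d_realistic = ks_statistic(simulate(1.0), chi2_3dof_cdf)
d_undersized = ks_statistic(simulate(0.5), chi2_3dof_cdf)
print(d_realistic, d_undersized)
```

A covariance that is too small inflates the distances, so its statistic sits far from zero, while a correctly sized covariance yields a small statistic.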
Statistics of Optical Coherence Tomography Data From Human Retina

PubMed Central

de Juan, Joaquín; Ferrone, Claudia; Giannini, Daniela; Huang, David; Koch, Giorgio; Russo, Valentina; Tan, Ou; Bruni, Carlo

2010-01-01

Optical coherence tomography (OCT) has recently become one of the primary methods for noninvasive probing of the human retina. The pseudoimage formed by OCT (the so-called B-scan) varies probabilistically across pixels due to complexities in the measurement technique. Hence, sensitive automatic procedures of diagnosis using OCT may exploit statistical analysis of the spatial distribution of reflectance. In this paper, we perform a statistical study of retinal OCT data. We find that the stretched exponential probability density function can model well the distribution of intensities in OCT pseudoimages. Moreover, we show a small but significant correlation between neighboring pixels when measuring OCT intensities with pixels of about 5 µm. We then develop a simple joint probability model for the OCT data consistent with known retinal features. This model fits well the stretched exponential distribution of intensities and their spatial correlation. In normal retinas, fit parameters of this model are relatively constant along retinal layers, but vary across layers. However, in retinas with diabetic retinopathy, large spikes of parameter modulation interrupt the constancy within layers, exactly where pathologies are visible. We argue that these results give hope for improvement in statistical pathology-detection methods even when the disease is in its early stages. PMID:20304733
Statistical Orbit Determination using the Particle Filter for Incorporating Non-Gaussian Uncertainties

NASA Technical Reports Server (NTRS)

Mashiku, Alinda; Garrison, James L.; Carpenter, J. Russell

2012-01-01

The tracking of space objects requires frequent and accurate monitoring for collision avoidance. As even collision events with very low probability are important, accurate prediction of collisions requires the representation of the full probability density function (PDF) of the random orbit state. Through representing the full PDF of the orbit state for orbit maintenance and collision avoidance, we can take advantage of the statistical information present in the heavy-tailed distributions, more accurately representing the orbit states with low probability. The classical methods of orbit determination (i.e. the Kalman Filter and its derivatives) provide state estimates based on only the second moments of the state and measurement errors that are captured by assuming a Gaussian distribution. Although the measurement errors can be accurately assumed to have a Gaussian distribution, errors with a non-Gaussian distribution could arise during propagation between observations. Moreover, unmodeled dynamics in the orbit model could introduce non-Gaussian errors into the process noise. A Particle Filter (PF) is proposed as a nonlinear filtering technique that is capable of propagating and estimating a more complete representation of the state distribution as an accurate approximation of a full PDF. The PF uses Monte Carlo runs to generate particles that approximate the full PDF representation. The PF is applied in the estimation and propagation of a highly eccentric orbit and the results are compared to the Extended Kalman Filter and Splitting Gaussian Mixture algorithms to demonstrate its proficiency.
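
The particle-based representation of the PDF described above can be sketched with a generic bootstrap particle filter. The example is a toy scalar problem with hypothetical noise levels and a random-walk process model, not the orbit-determination filter of the paper.

```python
import math
import random

def gaussian_likelihood(measurement, state, sigma):
    """Unnormalized Gaussian likelihood of a measurement given a state."""
    return math.exp(-0.5 * ((measurement - state) / sigma) ** 2)

def particle_filter_step(particles, measurement, rng, process_sigma, meas_sigma):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # Predict: push every particle through the (random-walk) process model.
    predicted = [p + rng.gauss(0.0, process_sigma) for p in particles]
    # Weight: score each particle against the new measurement.
    weights = [gaussian_likelihood(measurement, p, meas_sigma) for p in predicted]
    if sum(weights) == 0.0:
        return predicted  # all weights underflowed; keep the predicted cloud
    # Resample: draw particles in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))

rng = random.Random(3)
truth = 5.0  # hypothetical fixed state to be estimated
particles = [rng.uniform(-10.0, 20.0) for _ in range(1000)]  # broad prior
for _ in range(50):
    measurement = truth + rng.gauss(0.0, 1.0)
    particles = particle_filter_step(particles, measurement, rng, 0.1, 1.0)

estimate = sum(particles) / len(particles)
print(estimate)
```

Because the particle cloud is the state estimate, any statistic of the distribution (tails, skewness, multiple modes) can be read from it directly, which is the property the abstract highlights.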
Convective Weather Forecast Quality Metrics for Air Traffic Management Decision-Making

NASA Technical Reports Server (NTRS)

Chatterji, Gano B.; Gyarfas, Brett; Chan, William N.; Meyn, Larry A.

2006-01-01

Since numerical weather prediction models are unable to accurately forecast the severity and the location of the storm cells several hours into the future when compared with observation data, there has been a growing interest in probabilistic description of convective weather. The classical approach for generating uncertainty bounds consists of integrating the state equations and covariance propagation equations forward in time. This step is readily recognized as the process update step of the Kalman Filter algorithm. The second well-known method, the Monte Carlo method, consists of generating output samples by driving the forecast algorithm with input samples selected from distributions. The statistical properties of the distributions of the output samples are then used for defining the uncertainty bounds of the output variables. This method is computationally expensive for a complex model compared to the covariance propagation method. The main advantage of the Monte Carlo method is that a complex non-linear model can be easily handled. Recently, a few different methods for probabilistic forecasting have appeared in the literature. A method for computing probability of convection in a region using forecast data is described in Ref. 5. Probability at a grid location is computed as the fraction of grid points, within a box of specified dimensions around the grid location, with forecast convection precipitation exceeding a specified threshold. The main limitation of this method is that the results are dependent on the chosen dimensions of the box. The examples presented in Ref. 5 show that this process is equivalent to low-pass filtering of the forecast data with a finite support spatial filter. References 6 and 7 describe the technique for computing percentage coverage within a 92 x 92 square-kilometer box and assigning the value to the center 4 x 4 square-kilometer box. This technique is the same as that described in Ref. 5. 
Characterizing the forecast, following the process described in Refs. 5 through 7, in terms of percentage coverage or confidence level is notionally sound compared to characterizing in terms of probabilities because the probability of the forecast being correct can only be determined using actual observations. References 5 through 7 only use the forecast data and not the observations. The method for computing the probability of detection, false alarm ratio and several forecast quality metrics (Skill Scores) using both the forecast and observation data is given in Ref. 2. This paper extends the statistical verification method in Ref. 2 to determine co-occurrence probabilities. The method consists of computing the probability that a severe weather cell (grid location) is detected in the observation data in the neighborhood of the severe weather cell in the forecast data. Probabilities of occurrence at the grid location and in its neighborhood with higher severity, and with lower severity in the observation data compared to that in the forecast data are examined. The method proposed in Refs. 5 through 7 is used for computing the probability that a certain number of cells in the neighborhood of severe weather cells in the forecast data are seen as severe weather cells in the observation data. Finally, the probability of existence of gaps in the observation data in the neighborhood of severe weather cells in forecast data is computed. Gaps are defined as openings between severe weather cells through which an aircraft can safely fly to its intended destination. The rest of the paper is organized as follows. Section II summarizes the statistical verification method described in Ref. 2. The extension of this method for computing the co-occurrence probabilities is discussed in Section III. Numerical examples using NCWF forecast data and NCWD observation data are presented in Section III to elucidate the characteristics of the co-occurrence probabilities. 
This section also discusses the procedure for computing the probabilities that the severity of convection in the observation data will be higher or lower in the neighborhood of grid locations compared to that indicated at the grid locations in the forecast data. The probability of coverage of neighborhood grid cells is also described via examples in this section. Section IV discusses the gap detection algorithm and presents a numerical example to illustrate the method. The locations of the detected gaps in the observation data are used along with the locations of convective weather cells in the forecast data to determine the probability of existence of gaps in the neighborhood of these cells. Finally, the paper is concluded in Section V.
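
The box-fraction probability attributed to Ref. 5 in the abstract above can be sketched in a few lines. The grid values, the threshold, and the choice to clip the box at the grid edges are assumptions made for illustration.

```python
def neighborhood_probability(grid, row, col, half_width, threshold):
    """Fraction of cells in a square box around (row, col) exceeding threshold.

    The box is clipped at the grid edges (an assumption of this sketch).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    hits = 0
    total = 0
    for r in range(max(0, row - half_width), min(n_rows, row + half_width + 1)):
        for c in range(max(0, col - half_width), min(n_cols, col + half_width + 1)):
            total += 1
            if grid[r][c] > threshold:
                hits += 1
    return hits / total

# Hypothetical forecast intensities on a 3 x 3 grid.
forecast = [
    [0, 5, 0],
    [5, 5, 0],
    [0, 0, 0],
]
p_center = neighborhood_probability(forecast, 1, 1, half_width=1, threshold=3)
p_corner = neighborhood_probability(forecast, 0, 0, half_width=1, threshold=3)
print(p_center, p_corner)
```

Changing `half_width` changes the result for the same grid location, which is the dependence on box dimensions that the abstract identifies as the main limitation.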

Measurement of Plutonium-240 Angular Momentum Dependent Fission Probabilities Using the Alpha-Alpha' Reaction

NASA Astrophysics Data System (ADS)

Koglin, Johnathon

Accurate nuclear reaction data from a few keV to tens of MeV and across the table of nuclides is essential to a number of applications of nuclear physics, including national security, nuclear forensics, nuclear astrophysics, and nuclear energy. Precise determination of (n, f) and neutron capture cross sections for reactions in high-flux environments is particularly important for a proper understanding of nuclear reactor performance and stellar nucleosynthesis. In these extreme environments reactions on short-lived and otherwise difficult-to-produce isotopes play a significant role in system evolution and provide insights into the types of nuclear processes taking place; a detailed understanding of these processes is necessary to properly determine cross sections far from stability. Indirect methods are often attempted to measure cross sections on isotopes that are difficult to separate in a laboratory setting. Using the surrogate approach, the same compound nucleus from the reaction of interest is created through a "surrogate" reaction on a different isotope and the resulting decay is measured. This result is combined with appropriate reaction theory for compound nucleus population, from which the desired cross sections can be inferred. This method has shown promise, but the theoretical framework often lacks necessary experimental data to constrain models. In this work, dual arrays of silicon telescope particle identification detectors and photovoltaic (solar) cell fission fragment detectors have been used to measure the fission probability of the 240Pu(alpha, alpha'f) reaction - a surrogate for the 239Pu(n, f) - and fission of 35.9(2) MeV at eleven scattering angles from 40° to 140° in 10° intervals and at nuclear excitation energies up to 16 MeV. Within experimental uncertainty, the maximum fission probability was observed at the neutron separation energy for each alpha scattering angle. 
Fission probabilities were separated into five 500 keV bins from 5.5 MeV to 8.0 MeV and one bin from 4.5 MeV to 5.5 MeV. Across energy bins the fission probability increases approximately linearly with increasing alpha' scattering angle. At 90° the fission probability increases from 0.069(6) in the lowest energy bin to 0.59(2) in the highest. Likewise, within a single energy bin the fission probability increases with alpha' scattering angle. Within the 6.5 MeV to 7.0 MeV energy bin, the fission probability increased from 0.41(1) at 60° to 0.81(10) at 140°. Fission fragment angular distributions were also measured integrated over each energy bin. These distributions were fit to theoretical distributions based on combinations of transitional nuclear vibrational and rotational excitations at the saddle point. Contributions from specific K vibrational states were extracted and combined with fission probability measurements to determine the relative fission probability of each state as a function of nuclear excitation energy. Within a given excitation energy bin, it is found that contributions from K states greater than the minimum K = 0 state tend to increase with increasing alpha' scattering angle. This is attributed to an increase in the transferred angular momentum associated with larger scattering angles. The 90° alpha' scattering angle produced the highest quality results. The relative contributions of K states do not show a discernible trend across the energy spectrum. The energy-binned results confirm existing measurements that place a K = 2 state in the first energy bin with the opening of K = 1 and K = 4 states at energies above 5.5 MeV. This experiment represents the first of its kind in which fission probabilities and angular distributions are simultaneously measured at a large number of scattering angles. 
The acquired fission probability, angular distribution, and K state contribution provide a diverse dataset against which microscopic fission models can be constrained and further the understanding of the properties of the 240Pu fission.
Some New Approaches to Multivariate Probability Distributions.

DTIC Science & Technology

1986-12-01

Krishnaiah (1977). The following example may serve as an illustration of this point. EXAMPLE 2. (Fréchet’s bivariate continuous distribution...the error in the theorem of "" Prakasa Rao (1974) and to Dr. P.R. Krishnaiah for his valuable comments on the initial draft, his monumental patience and...M. and Proschan, F. (1984). Nonparametric Concepts and Methods in Reliability, Handbook of Statistics, 4, 613-655, (eds. P.R. Krishnaiah and P.K
Local linear estimation of concordance probability with application to covariate effects models on association for bivariate failure-time data.

PubMed

Ding, Aidong Adam; Hsieh, Jin-Jian; Wang, Weijing

2015-01-01

Bivariate survival analysis has wide applications. In the presence of covariates, most literature focuses on studying their effects on the marginal distributions. However covariates can also affect the association between the two variables. In this article we consider the latter issue by proposing a nonstandard local linear estimator for the concordance probability as a function of covariates. Under the Clayton copula, the conditional concordance probability has a simple one-to-one correspondence with the copula parameter for different data structures including those subject to independent or dependent censoring and dependent truncation. The proposed method can be used to study how covariates affect the Clayton association parameter without specifying marginal regression models. Asymptotic properties of the proposed estimators are derived and their finite-sample performances are examined via simulations. Finally, for illustration, we apply the proposed method to analyze a bone marrow transplant data set.
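The one-to-one correspondence mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common Clayton parameterization with parameter theta > 0, for which Kendall's tau equals theta / (theta + 2) and the concordance probability for a pair without ties is (1 + tau) / 2. The function names are hypothetical, and the paper may use a different parameterization of the association parameter.

```python
def clayton_concordance(theta):
    """Concordance probability under a Clayton copula with parameter theta > 0.

    Uses tau = theta / (theta + 2) and P(concordant) = (1 + tau) / 2,
    which simplifies to (theta + 1) / (theta + 2).
    """
    if theta <= 0:
        raise ValueError("theta must be positive in this parameterization")
    return (theta + 1.0) / (theta + 2.0)


def clayton_theta_from_concordance(p):
    """Invert the mapping: recover theta from a concordance probability in (0.5, 1)."""
    if not 0.5 < p < 1.0:
        raise ValueError("concordance probability must lie strictly between 0.5 and 1")
    return (2.0 * p - 1.0) / (1.0 - p)
```

Because the mapping is monotone, an estimate of the concordance probability at a given covariate value translates directly into an estimate of the association parameter at that value.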
Employing Sensitivity Derivatives for Robust Optimization under Uncertainty in CFD

NASA Technical Reports Server (NTRS)

Newman, Perry A.; Putko, Michele M.; Taylor, Arthur C., III

2004-01-01

A robust optimization is demonstrated on a two-dimensional inviscid airfoil problem in subsonic flow. Given uncertainties in statistically independent, random, normally distributed flow parameters (input variables), an approximate first-order statistical moment method is employed to represent the Computational Fluid Dynamics (CFD) code outputs as expected values with variances. These output quantities are used to form the objective function and constraints. The constraints are cast in probabilistic terms; that is, the probability that a constraint is satisfied is greater than or equal to some desired target probability. Gradient-based robust optimization of this stochastic problem is accomplished through use of both first and second-order sensitivity derivatives. For each robust optimization, the effects of increasing both the input standard deviations and the target probability of constraint satisfaction are demonstrated. This method provides a means for incorporating uncertainty when considering small deviations from input mean values.
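A first-order moment approximation of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the sensitivity derivatives are replaced by central finite differences, the output is treated as normal when evaluating the probabilistic constraint, and all function names are hypothetical.

```python
import math


def first_order_moments(f, means, sigmas, rel_step=1e-6):
    """Approximate the mean and variance of f(x) for independent normal inputs.

    First-order approximation: E[f] ~ f(mu) and
    Var[f] ~ sum_i (df/dx_i * sigma_i)^2, with derivatives evaluated at mu.
    """
    mean_f = f(means)
    var_f = 0.0
    for i, (mu, sigma) in enumerate(zip(means, sigmas)):
        step = rel_step * max(1.0, abs(mu))
        x_plus, x_minus = list(means), list(means)
        x_plus[i] = mu + step
        x_minus[i] = mu - step
        # Central difference stands in for a code-supplied sensitivity derivative.
        dfdx = (f(x_plus) - f(x_minus)) / (2.0 * step)
        var_f += (dfdx * sigma) ** 2
    return mean_f, var_f


def prob_constraint_satisfied(mean_g, var_g):
    """P(g <= 0) when the constraint output g is treated as normally distributed."""
    z = -mean_g / math.sqrt(var_g)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

A probabilistic constraint then reads `prob_constraint_satisfied(mean_g, var_g) >= target`, where `target` is the desired probability of satisfaction.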
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chang, Shih-Jung

Dynamic strength of the High Flux Isotope Reactor (HFIR) vessel to resist hypothetical accidents is analyzed by using the method of fracture mechanics. Vessel critical stresses are estimated by applying dynamic pressure pulses of a range of magnitudes and pulse-durations. The pulses versus time functions are assumed to be step functions. The probability of vessel fracture is then calculated by assuming a distribution of possible surface cracks of different crack depths. The probability distribution function for the crack depths is based on the form that is recommended by the Marshall report. The toughness of the vessel steel used in the analysis is based on the projected and embrittled value after 10 effective full power years from 1986. From the study made by Cheverton, Merkle and Nanstad, the weakest point on the vessel for fracture evaluation is known to be located within the region surrounding the tangential beam tube HB3. The increase in the probability of fracture is obtained as an extension of the result from that report for the regular operating condition to include conditions of higher dynamic pressures due to accident loadings. The increase in the probability of vessel fracture is plotted for a range of hoop stresses to indicate the vessel strength against hypothetical accident conditions.
The application of structural reliability techniques to plume impingement loading of the Space Station Freedom Photovoltaic Array

NASA Technical Reports Server (NTRS)

Yunis, Isam S.; Carney, Kelly S.

1993-01-01

A new aerospace application of structural reliability techniques is presented, where the applied forces depend on many probabilistic variables. This application is the plume impingement loading of the Space Station Freedom Photovoltaic Arrays. When the space shuttle berths with Space Station Freedom it must brake and maneuver towards the berthing point using its primary jets. The jet exhaust, or plume, may cause high loads on the photovoltaic arrays. The many parameters governing this problem are highly uncertain and random. An approach, using techniques from structural reliability, as opposed to the accepted deterministic methods, is presented which assesses the probability of failure of the array mast due to plume impingement loading. A Monte Carlo simulation of the berthing approach is used to determine the probability distribution of the loading. A probability distribution is also determined for the strength of the array. Structural reliability techniques are then used to assess the array mast design. These techniques are found to be superior to the standard deterministic dynamic transient analysis, for this class of problem. The results show that the probability of failure of the current array mast design, during its 15 year life, is minute.
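The load-versus-strength comparison described above can be sketched with a plain Monte Carlo estimate of the failure probability P(load > strength). The sampler arguments and the function name are hypothetical; the actual study derives the load distribution from a berthing simulation rather than from closed-form distributions.

```python
import random


def failure_probability(sample_load, sample_strength, n=100_000, seed=0):
    """Estimate P(load > strength) by independent Monte Carlo sampling.

    sample_load and sample_strength are callables that take a random.Random
    instance and return one draw from the respective distribution.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(n):
        if sample_load(rng) > sample_strength(rng):
            failures += 1
    return failures / n
```

For example, `failure_probability(lambda r: r.lognormvariate(0.0, 0.5), lambda r: r.gauss(5.0, 0.5))` uses illustrative distributions only. When the true failure probability is very small, as reported in the abstract, plain sampling needs a very large `n`, which is one reason structural reliability methods are used alongside simulation.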
The EM Method in a Probabilistic Wavelet-Based MRI Denoising

PubMed Central

2015-01-01

Human body heat emission and other external causes can interfere with magnetic resonance image acquisition and produce noise. In this kind of image, the noise, when no signal is present, is Rayleigh distributed and its wavelet coefficients can be approximately modeled by a Gaussian distribution. Noiseless magnetic resonance images can be modeled by a Laplacian distribution in the wavelet domain. This paper proposes a new magnetic resonance image denoising method based on these models. This method performs shrinkage of wavelet coefficients based on the conditional probability of being noise or detail. The parameters involved in this filtering approach are calculated by means of the expectation maximization (EM) method, which avoids the need to use an estimator of noise variance. The efficiency of the proposed filter is studied and compared with other important filtering techniques, such as Nowak's, Donoho-Johnstone's, Awate-Whitaker's, and nonlocal means filters, in different 2D and 3D images. PMID:26089959
The EM Method in a Probabilistic Wavelet-Based MRI Denoising.

PubMed

Martin-Fernandez, Marcos; Villullas, Sergio

2015-01-01

Human body heat emission and other external causes can interfere with magnetic resonance image acquisition and produce noise. In this kind of image, the noise, when no signal is present, is Rayleigh distributed and its wavelet coefficients can be approximately modeled by a Gaussian distribution. Noiseless magnetic resonance images can be modeled by a Laplacian distribution in the wavelet domain. This paper proposes a new magnetic resonance image denoising method based on these models. This method performs shrinkage of wavelet coefficients based on the conditional probability of being noise or detail. The parameters involved in this filtering approach are calculated by means of the expectation maximization (EM) method, which avoids the need to use an estimator of noise variance. The efficiency of the proposed filter is studied and compared with other important filtering techniques, such as Nowak's, Donoho-Johnstone's, Awate-Whitaker's, and nonlocal means filters, in different 2D and 3D images.
Monte Carlo Bayesian inference on a statistical model of sub-gridcolumn moisture variability using high-resolution cloud observations. Part 1: Method.

PubMed

Norris, Peter M; da Silva, Arlindo M

2016-07-01

A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC.
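The non-gradient character of the Metropolis approach mentioned in the abstract can be seen in a minimal random-walk sampler. This is a generic one-dimensional sketch with hypothetical names, not the sub-gridcolumn moisture model or the multiple-try variant discussed in the article.

```python
import math
import random


def metropolis(log_post, x0, step, n_samples, seed=0):
    """Random-walk Metropolis sampler for a 1-D unnormalized log posterior.

    log_post(x0) must be finite. Proposals are symmetric Gaussian jumps,
    so only posterior ratios are needed and no gradient is evaluated.
    """
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    samples = []
    for _ in range(n_samples):
        y = x + rng.gauss(0.0, step)
        lp_y = log_post(y)
        # Accept with probability min(1, post(y) / post(x)).
        if lp_y >= lp or rng.random() < math.exp(lp_y - lp):
            x, lp = y, lp_y
        samples.append(x)  # a rejected proposal repeats the current state
    return samples
```

Because a proposal is a finite jump, the chain can move from a state with no cloud to one with non-zero cloud probability in a single step, which an infinitesimal linearized perturbation cannot do.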
Monte Carlo Bayesian Inference on a Statistical Model of Sub-Gridcolumn Moisture Variability Using High-Resolution Cloud Observations. Part 1: Method

NASA Technical Reports Server (NTRS)

Norris, Peter M.; Da Silva, Arlindo M.

2016-01-01

A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC.
Monte Carlo Bayesian inference on a statistical model of sub-gridcolumn moisture variability using high-resolution cloud observations. Part 1: Method

PubMed Central

Norris, Peter M.; da Silva, Arlindo M.

2018-01-01

A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC. PMID:29618847
Stochastic seismic inversion based on an improved local gradual deformation method

NASA Astrophysics Data System (ADS)

Yang, Xiuwei; Zhu, Peimin

2017-12-01

A new stochastic seismic inversion method based on the local gradual deformation method is proposed, which can incorporate seismic data, well data, geology and their spatial correlations into the inversion process. Geological information, such as sedimentary facies and structures, could provide significant a priori information to constrain an inversion and arrive at reasonable solutions. The local a priori conditional cumulative distributions at each node of model to be inverted are first established by indicator cokriging, which integrates well data as hard data and geological information as soft data. Probability field simulation is used to simulate different realizations consistent with the spatial correlations and local conditional cumulative distributions. The corresponding probability field is generated by the fast Fourier transform moving average method. Then, optimization is performed to match the seismic data via an improved local gradual deformation method. Two improved strategies are proposed to be suitable for seismic inversion. The first strategy is that we select and update local areas of bad fitting between synthetic seismic data and real seismic data. The second one is that we divide each seismic trace into several parts and obtain the optimal parameters for each part individually. The applications to a synthetic example and a real case study demonstrate that our approach can effectively find fine-scale acoustic impedance models and provide uncertainty estimations.
Conflict Probability Estimation for Free Flight

NASA Technical Reports Server (NTRS)

Paielli, Russell A.; Erzberger, Heinz

1996-01-01

The safety and efficiency of free flight will benefit from automated conflict prediction and resolution advisories. Conflict prediction is based on trajectory prediction and is less certain the farther in advance the prediction, however. An estimate is therefore needed of the probability that a conflict will occur, given a pair of predicted trajectories and their levels of uncertainty. A method is developed in this paper to estimate that conflict probability. The trajectory prediction errors are modeled as normally distributed, and the two error covariances for an aircraft pair are combined into a single equivalent covariance of the relative position. A coordinate transformation is then used to derive an analytical solution. Numerical examples and Monte Carlo validation are presented.
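The covariance-combination step can be illustrated with a simplified Monte Carlo sketch. It assumes the two aircraft's prediction errors are independent (so the relative-position covariance is the sum of the two covariances), works in two horizontal dimensions at a single time instant, and counts a conflict when the separation falls below a given radius. The function names are hypothetical, and the paper's analytical solution over the whole encounter is not reproduced here.

```python
import math
import random


def cholesky_2x2(cov):
    """Lower-triangular Cholesky factor of a symmetric positive-definite 2x2 matrix."""
    a, b, d = cov[0][0], cov[0][1], cov[1][1]
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    return l11, l21, l22


def conflict_probability(rel_mean, cov1, cov2, radius, n=200_000, seed=0):
    """Monte Carlo estimate of P(|relative position| < radius).

    rel_mean is the predicted relative position (x, y); cov1 and cov2 are the
    2x2 position-error covariances of the two aircraft, assumed independent.
    """
    cov = [[cov1[i][j] + cov2[i][j] for j in range(2)] for i in range(2)]
    l11, l21, l22 = cholesky_2x2(cov)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        dx = rel_mean[0] + l11 * z1
        dy = rel_mean[1] + l21 * z1 + l22 * z2
        if dx * dx + dy * dy < radius * radius:
            hits += 1
    return hits / n
```

With zero mean separation and a combined identity covariance, the exact value is 1 - exp(-radius^2 / 2), which gives a convenient check on the sampler.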
Security Analysis of Measurement-Device-Independent Quantum Key Distribution in Collective-Rotation Noisy Environment

NASA Astrophysics Data System (ADS)

Li, Na; Zhang, Yu; Wen, Shuang; Li, Lei-lei; Li, Jian

2018-01-01

Noise is a problem that communication channels cannot avoid. It is, thus, beneficial to analyze the security of MDI-QKD in a noisy environment. An analysis model for collective-rotation noise is introduced, and information theory methods are used to analyze the security of the protocol. The maximum amount of information that Eve can eavesdrop is 50%, and the eavesdropping can always be detected if the noise level ɛ ≤ 0.68. Therefore, the MDI-QKD protocol is secure as a quantum key distribution protocol. The maximum probability that the relay outputs successful results is 16% when eavesdropping is present. Moreover, the probability that the relay outputs successful results when eavesdropping is present is higher than in the situation without eavesdropping. The paper validates that the MDI-QKD protocol has better robustness.
Connection between two statistical approaches for the modelling of particle velocity and concentration distributions in turbulent flow: The mesoscopic Eulerian formalism and the two-point probability density function method

NASA Astrophysics Data System (ADS)

Simonin, Olivier; Zaichik, Leonid I.; Alipchenkov, Vladimir M.; Février, Pierre

2006-12-01

The objective of the paper is to elucidate a connection between two approaches that have been separately proposed for modelling the statistical spatial properties of inertial particles in turbulent fluid flows. One of the approaches proposed recently by Février, Simonin, and Squires [J. Fluid Mech. 533, 1 (2005)] is based on the partitioning of particle turbulent velocity field into spatially correlated (mesoscopic Eulerian) and random-uncorrelated (quasi-Brownian) components. The other approach stems from a kinetic equation for the two-point probability density function of the velocity distributions of two particles [Zaichik and Alipchenkov, Phys. Fluids 15, 1776 (2003)]. Comparisons between these approaches are performed for isotropic homogeneous turbulence and demonstrate encouraging agreement.
Statistic inversion of multi-zone transition probability models for aquifer characterization in alluvial fans

DOE PAGES

Zhu, Lin; Dai, Zhenxue; Gong, Huili; ...

2015-06-12

Understanding the heterogeneity arising from the complex architecture of sedimentary sequences in alluvial fans is challenging. This study develops a statistical inverse framework in a multi-zone transition probability approach for characterizing the heterogeneity in alluvial fans. An analytical solution of the transition probability matrix is used to define the statistical relationships among different hydrofacies and their mean lengths, integral scales, and volumetric proportions. A statistical inversion is conducted to identify the multi-zone transition probability models and estimate the optimal statistical parameters using the modified Gauss–Newton–Levenberg–Marquardt method. The Jacobian matrix is computed by the sensitivity equation method, which results in an accurate inverse solution with quantification of parameter uncertainty. We use the Chaobai River alluvial fan in the Beijing Plain, China, as an example for elucidating the methodology of alluvial fan characterization. The alluvial fan is divided into three sediment zones. In each zone, the explicit mathematical formulations of the transition probability models are constructed with optimized different integral scales and volumetric proportions. The hydrofacies distributions in the three zones are simulated sequentially by the multi-zone transition probability-based indicator simulations. Finally, the result of this study provides the heterogeneous structure of the alluvial fan for further study of flow and transport simulations.
A framework for sensitivity analysis of decision trees.

PubMed

Kamiński, Bogumił; Jakubczyk, Michał; Szufel, Przemysław

2018-01-01

In the paper, we consider sequential decision problems with uncertainty, represented as decision trees. Sensitivity analysis is always a crucial element of decision making and in decision trees it often focuses on probabilities. In the stochastic model considered, the user often has only limited information about the true values of probabilities. We develop a framework for performing sensitivity analysis of optimal strategies accounting for this distributional uncertainty. We design this robust optimization approach in an intuitive and not overly technical way, to make it simple to apply in daily managerial practice. The proposed framework allows for (1) analysis of the stability of the expected-value-maximizing strategy and (2) identification of strategies which are robust with respect to pessimistic/optimistic/mode-favoring perturbations of probabilities. We verify the properties of our approach in two cases: (a) probabilities in a tree are the primitives of the model and can be modified independently; (b) probabilities in a tree reflect some underlying, structural probabilities, and are interrelated. We provide a free software tool implementing the methods described.
Comparison of transform coding methods with an optimal predictor for the data compression of digital elevation models

NASA Technical Reports Server (NTRS)

Lewis, Michael

1994-01-01

Statistical encoding techniques enable the reduction of the number of bits required to encode a set of symbols, and are derived from their probabilities. Huffman encoding is an example of statistical encoding that has been used for error-free data compression. The degree of compression given by Huffman encoding in this application can be improved by the use of prediction methods. These replace the set of elevations by a set of corrections that have a more advantageous probability distribution. In particular, the method of Lagrange Multipliers for minimization of the mean square error has been applied to local geometrical predictors. Using this technique, an 8-point predictor achieved about a 7 percent improvement over an existing simple triangular predictor.
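Why prediction helps a statistical coder can be shown with a short sketch that compares the Shannon entropy of raw elevations with that of prediction residuals. Entropy is a lower bound on the average Huffman code length per symbol. The predictor here is a plain previous-sample predictor chosen for brevity, not the Lagrange-multiplier-optimized local geometrical predictor of the paper, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """Shannon entropy in bits per symbol of the empirical symbol distribution."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def previous_sample_residuals(profile):
    """Replace each elevation by its difference from the preceding elevation.

    The first value is kept as-is so the profile can be reconstructed exactly.
    """
    return [profile[0]] + [b - a for a, b in zip(profile, profile[1:])]
```

On smooth terrain the residuals cluster around a few small values, so their distribution is far more peaked than that of the raw elevations and fewer bits per symbol are needed.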
Use of the negative binomial-truncated Poisson distribution in thunderstorm prediction

NASA Technical Reports Server (NTRS)

Cohen, A. C.

1971-01-01

A probability model is presented for the distribution of thunderstorms over a small area given that thunderstorm events (1 or more thunderstorms) are occurring over a larger area. The model incorporates the negative binomial and truncated Poisson distributions. Probability tables for Cape Kennedy for spring, summer, and fall months and seasons are presented. The computer program used to compute these probabilities is appended.
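The conditioning on "one or more thunderstorms" corresponds to truncating the zero class. A minimal sketch of the zero-truncated Poisson probability mass function is given below; it covers only this one ingredient, not the combined negative binomial-truncated Poisson model, and the function name is hypothetical.

```python
import math


def truncated_poisson_pmf(k, lam):
    """P(X = k | X >= 1) for X ~ Poisson(lam), i.e. the zero-truncated Poisson pmf."""
    if k < 1:
        return 0.0
    # 1 - exp(-lam) is the probability of at least one event; expm1 keeps it
    # accurate for small lam.
    p_at_least_one = -math.expm1(-lam)
    return math.exp(-lam) * lam ** k / (math.factorial(k) * p_at_least_one)
```

The mean of this distribution is lam / (1 - exp(-lam)), which is always larger than lam because the zero outcomes have been excluded.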
Behavioral Analysis of Visitors to a Medical Institution’s Website Using Markov Chain Monte Carlo Methods

PubMed Central

Tani, Yuji

2016-01-01

Background: Consistent with the “attention, interest, desire, memory, action” (AIDMA) model of consumer behavior, patients collect information about available medical institutions using the Internet to select information for their particular needs. Studies of consumer behavior may be found in areas other than medical institution websites. Such research uses Web access logs for visitor search behavior. At this time, research applying the patient searching behavior model to medical institution website visitors is lacking. Objective: We have developed a hospital website search behavior model using a Bayesian approach to clarify the behavior of medical institution website visitors and determine the probability of their visits, classified by search keyword. Methods: We used the website data access log of a clinic of internal medicine and gastroenterology in the Sapporo suburbs, collecting data from January 1 through June 31, 2011. The contents of the 6 website pages included the following: home, news, content introduction for medical examinations, mammography screening, holiday person-on-duty information, and other. The search keywords we identified as best expressing website visitor needs were listed as the top 4 headings from the access log: clinic name, clinic name + regional name, clinic name + medical examination, and mammography screening. Using the search keywords as the explanatory variable, we built a binomial probit model that allows inspection of the contents of each response variable. Using this model, we determined a beta value and generated a posterior distribution. We performed the simulation using Markov Chain Monte Carlo methods with a noninformative prior distribution for this model and determined the visit probability classified by keyword for each category. Results: In the case of the keyword “clinic name,” the visit probability to the website, repeated visit to the website, and contents page for medical examination was positive. In the case of the keyword “clinic name and regional name,” the probability for a repeated visit to the website and the mammography screening page was negative. In the case of the keyword “clinic name + medical examination,” the visit probability to the website was positive, and the visit probability to the information page was negative. When visitors referred to the keywords “mammography screening,” the visit probability to the mammography screening page was positive (95% highest posterior density interval = 3.38-26.66). Conclusions: Further analysis for not only the clinic website but also various other medical institution websites is necessary to build a general inspection model for medical institution websites; we want to consider this in future research. Additionally, we hope to use the results obtained in this study as a prior distribution for future work to conduct higher-precision analysis. PMID:27457537

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhu, Lin; Dai, Zhenxue; Gong, Huili

Understanding the heterogeneity arising from the complex architecture of sedimentary sequences in alluvial fans is challenging. This study develops a statistical inverse framework in a multi-zone transition probability approach for characterizing the heterogeneity in alluvial fans. An analytical solution of the transition probability matrix is used to define the statistical relationships among different hydrofacies and their mean lengths, integral scales, and volumetric proportions. A statistical inversion is conducted to identify the multi-zone transition probability models and estimate the optimal statistical parameters using the modified Gauss–Newton–Levenberg–Marquardt method. The Jacobian matrix is computed by the sensitivity equation method, which results in an accurate inverse solution with quantification of parameter uncertainty. We use the Chaobai River alluvial fan in the Beijing Plain, China, as an example for elucidating the methodology of alluvial fan characterization. The alluvial fan is divided into three sediment zones. In each zone, the explicit mathematical formulations of the transition probability models are constructed with optimized different integral scales and volumetric proportions. The hydrofacies distributions in the three zones are simulated sequentially by the multi-zone transition probability-based indicator simulations. Finally, the result of this study provides the heterogeneous structure of the alluvial fan for further study of flow and transport simulations.
Evolution of thermal stress and failure probability during reduction and re-oxidation of solid oxide fuel cell

NASA Astrophysics Data System (ADS)

Wang, Yu; Jiang, Wenchun; Luo, Yun; Zhang, Yucai; Tu, Shan-Tung

2017-12-01

The reduction and re-oxidation of anode have significant effects on the integrity of the solid oxide fuel cell (SOFC) sealed by the glass-ceramic (GC). The mechanical failure is mainly controlled by the stress distribution. Therefore, a three dimensional model of SOFC is established to investigate the stress evolution during the reduction and re-oxidation by finite element method (FEM) in this paper, and the failure probability is calculated using the Weibull method. The results demonstrate that the reduction of anode can decrease the thermal stresses and reduce the failure probability due to the volumetric contraction and porosity increasing. The re-oxidation can result in a remarkable increase of the thermal stresses, and the failure probabilities of anode, cathode, electrolyte and GC all increase to 1, which is mainly due to the large linear strain rather than the porosity decreasing. The cathode and electrolyte fail as soon as the linear strains are about 0.03% and 0.07%. Therefore, the re-oxidation should be controlled to ensure the integrity, and a lower re-oxidation temperature can decrease the stress and failure probability.
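A Weibull failure probability of the kind referred to above can be sketched from element-wise stresses such as those produced by a finite element model. The sketch assumes the common two-parameter, volume-scaled form P_f = 1 - exp(-sum_i (V_i / V0) (sigma_i / sigma0)^m), with compressive stresses contributing nothing. The function name, the reference volume, and this exact form are assumptions, since the abstract does not state which variant was used.

```python
import math


def weibull_failure_probability(stresses, volumes, sigma0, m, v0=1.0):
    """Failure probability of a brittle component from element stresses.

    stresses: tensile stress per element (negative values are treated as zero)
    volumes:  element volumes, in the same units as the reference volume v0
    sigma0:   characteristic strength; m: Weibull modulus
    """
    risk = sum(
        (v / v0) * (max(s, 0.0) / sigma0) ** m
        for s, v in zip(stresses, volumes)
    )
    # 1 - exp(-risk), computed with expm1 for accuracy when risk is small.
    return -math.expm1(-risk)
```

A large Weibull modulus makes the result very sensitive to the peak stress, which is consistent with failure probabilities rising quickly toward 1 once re-oxidation strains push stresses past the characteristic strength.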
A new statistical method for characterizing the atmospheres of extrasolar planets

NASA Astrophysics Data System (ADS)

Henderson, Cassandra S.; Skemer, Andrew J.; Morley, Caroline V.; Fortney, Jonathan J.

2017-10-01

By detecting light from extrasolar planets, we can measure their compositions and bulk physical properties. The technologies used to make these measurements are still in their infancy, and a lack of self-consistency suggests that previous observations have underestimated their systemic errors. We demonstrate a statistical method, newly applied to exoplanet characterization, which uses a Bayesian formalism to account for underestimated errorbars. We use this method to compare photometry of a substellar companion, GJ 758b, with custom atmospheric models. Our method produces a probability distribution of atmospheric model parameters including temperature, gravity, cloud model (fsed) and chemical abundance for GJ 758b. This distribution is less sensitive to highly variant data and appropriately reflects a greater uncertainty on parameter fits.
Maximum Entropy Approach in Dynamic Contrast-Enhanced Magnetic Resonance Imaging.

PubMed

Farsani, Zahra Amini; Schmid, Volker J

2017-01-01

In the estimation of physiological kinetic parameters from Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE-MRI) data, the determination of the arterial input function (AIF) plays a key role. This paper proposes a Bayesian method to estimate the physiological parameters of DCE-MRI along with the AIF in situations where no measurement of the AIF is available. In the proposed algorithm, the maximum entropy method (MEM) is combined with the maximum a posteriori approach (MAP). To this end, MEM is used to specify a prior probability distribution of the unknown AIF. The ability of this method to estimate the AIF is validated using the Kullback-Leibler divergence. Subsequently, the kinetic parameters can be estimated with MAP. The proposed algorithm is evaluated with a data set from a breast cancer MRI study. The application shows that the AIF can reliably be determined from the DCE-MRI data using MEM. Kinetic parameters can be estimated subsequently. The maximum entropy method is a powerful tool for reconstructing images from many types of data. This method is useful for generating the probability distribution based on given information. The proposed method gives an alternative way to assess the input function from the existing data. The proposed method allows a good fit of the data and therefore a better estimation of the kinetic parameters. In the end, this allows for a more reliable use of DCE-MRI.
Dosimetry for nonuniform activity distributions: a method for the calculation of 3D absorbed-dose distribution without the use of voxel S-values, point kernels, or Monte Carlo simulations.

PubMed

Traino, A C; Marcatili, S; Avigo, C; Sollini, M; Erba, P A; Mariani, G

2013-04-01

Nonuniform activity within the target lesions and the critical organs constitutes an important limitation for dosimetric estimates in patients treated with tumor-seeking radiopharmaceuticals. The tumor control probability and the normal tissue complication probability are affected by the distribution of the radionuclide in the treated organ/tissue. In this paper, a straightforward method for calculating the absorbed dose at the voxel level is described. This new method takes into account a nonuniform activity distribution in the target/organ. The new method is based on the macroscopic S-values (i.e., the S-values calculated for the various organs, as defined in the MIRD approach), on the definition of the number of voxels, and on the raw-count 3D array, corrected for attenuation, scatter, and collimator resolution, in the lesion/organ considered. Starting from these parameters, the only mathematical operation required is to multiply the 3D array by a scalar value, thus avoiding all the complex operations involving the 3D arrays. A comparison with the MIRD approach, fully described in the MIRD Pamphlet No. 17, using S-values at the voxel level, showed a good agreement between the two methods for (131)I and for (90)Y. Voxel dosimetry is becoming more and more important when performing therapy with tumor-seeking radiopharmaceuticals. The method presented here does not require calculating the S-values at the voxel level, and thus bypasses the mathematical problems linked to the convolution of 3D arrays and to the voxel size. In the paper, the results obtained with this new simplified method as well as the possibility of using it for other radionuclides commonly employed in therapy are discussed. The possibility of using the correct density value of the tissue/organs involved is also discussed.
Nuclear Forensics Analysis with Missing and Uncertain Data

DOE PAGES

Langan, Roisin T.; Archibald, Richard K.; Lamberti, Vincent

2015-10-05

We have applied a new imputation-based method for analyzing incomplete data, called Monte Carlo Bayesian Database Generation (MCBDG), to the Spent Fuel Isotopic Composition (SFCOMPO) database. About 60% of the entries are absent for SFCOMPO. The method estimates missing values of a property from a probability distribution created from the existing data for the property, and then generates multiple instances of the completed database for training a machine learning algorithm. Uncertainty in the data is represented by an empirical or an assumed error distribution. The method makes few assumptions about the underlying data, and compares favorably against results obtained by replacing missing information with constant values.
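The general idea of drawing missing values from a distribution built on the observed data, and producing several completed copies of the database, can be sketched as follows. This is a simplified illustration, not the MCBDG algorithm itself: it assumes a numeric table with NaN marking missing entries and samples each missing entry from the empirical distribution of the observed values in the same column. All names are hypothetical.

```python
import numpy as np

def generate_completed_tables(table, n_instances=5, seed=0):
    """Return several completed copies of `table` (2D array, NaN = missing).

    Assumption: each missing entry is drawn from the empirical distribution
    of the observed values in the same column.
    """
    rng = np.random.default_rng(seed)
    table = np.asarray(table, dtype=float)
    completed = []
    for _ in range(n_instances):
        filled = table.copy()
        for j in range(table.shape[1]):
            col = table[:, j]
            missing = np.isnan(col)
            observed = col[~missing]
            # Columns with no observed values are left as NaN.
            if missing.any() and observed.size > 0:
                filled[missing, j] = rng.choice(observed, size=int(missing.sum()))
        completed.append(filled)
    return completed
```

Each completed copy could then be used as one training set, so that the spread across copies reflects the uncertainty introduced by the missing entries.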
A novel method for the evaluation of uncertainty in dose-volume histogram computation.

PubMed

Henríquez, Francisco Cutanda; Castrillón, Silvia Vargas

2008-03-15

Dose-volume histograms (DVHs) are a useful tool in state-of-the-art radiotherapy treatment planning, and it is essential to recognize their limitations. Even after a specific dose-calculation model is optimized, dose distributions computed by using treatment-planning systems are affected by several sources of uncertainty, such as algorithm limitations, measurement uncertainty in the data used to model the beam, and residual differences between measured and computed dose. This report presents a novel method to take them into account. To take into account the effect of associated uncertainties, a probabilistic approach using a new kind of histogram, a dose-expected volume histogram, is introduced. The expected value of the volume in the region of interest receiving an absorbed dose equal to or greater than a certain value is found by using the probability distribution of the dose at each point. A rectangular probability distribution is assumed for this point dose, and a formulation that accounts for uncertainties associated with point dose is presented for practical computations. This method is applied to a set of DVHs for different regions of interest, including 6 brain patients, 8 lung patients, 8 pelvis patients, and 6 prostate patients planned for intensity-modulated radiation therapy. Results show a greater effect on planning target volume coverage than in organs at risk. In cases of steep DVH gradients, such as planning target volumes, this new method shows the largest differences with the corresponding DVH; thus, the effect of the uncertainty is larger.
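The expected-volume calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each point dose is taken to be uniformly (rectangularly) distributed on [d - u, d + u], all voxels have the same volume, and the names `doses`, `half_widths` and `voxel_volume` are hypothetical.

```python
import numpy as np

def dose_expected_volume(doses, half_widths, voxel_volume, threshold):
    """Expected volume receiving a dose of at least `threshold`.

    Each voxel dose is modeled as uniform on [d - u, d + u], so the
    probability of meeting the threshold is (d + u - threshold) / (2u),
    clipped to the range [0, 1].
    """
    d = np.asarray(doses, dtype=float)
    u = np.asarray(half_widths, dtype=float)
    if np.any(u <= 0):
        raise ValueError("half-widths must be positive")
    prob = np.clip((d + u - threshold) / (2.0 * u), 0.0, 1.0)
    return voxel_volume * prob.sum()
```

Evaluating this function over a grid of thresholds gives one curve of the dose-expected volume histogram; where the ordinary DVH is steep, many voxels sit near the threshold and the two curves differ most.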
Regional analysis of annual maximum rainfall using TL-moments method

NASA Astrophysics Data System (ADS)

Shabri, Ani Bin; Daud, Zalina Mohd; Ariff, Noratiqah Mohd

2011-06-01

Information related to distributions of rainfall amounts is of great importance for the design of water-related structures. One of the concerns of hydrologists and engineers is the probability distribution for modeling of regional data. In this study, a novel approach to regional frequency analysis using L-moments is revisited. Subsequently, an alternative regional frequency analysis using the TL-moments method is employed. The results from both methods were then compared. The analysis was based on daily annual maximum rainfall data from 40 stations in Selangor, Malaysia. TL-moments for the generalized extreme value (GEV) and generalized logistic (GLO) distributions were derived and used to develop the regional frequency analysis procedure. The TL-moment ratio diagram and Z-test were employed in determining the best-fit distribution. Comparison between the two approaches showed that the L-moments and TL-moments produced equivalent results. GLO and GEV distributions were identified as the most suitable distributions for representing the statistical properties of extreme rainfall in Selangor. Monte Carlo simulation was used for performance evaluation, and it showed that the method of TL-moments was more efficient for lower quantile estimation compared with the L-moments.
Methods of Information Geometry to model complex shapes

NASA Astrophysics Data System (ADS)

De Sanctis, A.; Gattone, S. A.

2016-09-01

In this paper, a new statistical method to model patterns emerging in complex systems is proposed. A framework for shape analysis of 2-dimensional landmark data is introduced, in which each landmark is represented by a bivariate Gaussian distribution. From Information Geometry we know that the Fisher-Rao metric endows the statistical manifold of parameters of a family of probability distributions with a Riemannian metric. This approach thus allows the intermediate steps in the evolution between observed shapes to be reconstructed by computing the geodesic, with respect to the Fisher-Rao metric, between the corresponding distributions. Furthermore, the geodesic path can be used for shape predictions. As an application, we study the evolution of the rat skull shape. A future application in ophthalmology is introduced.
Alternative Derivations of the Statistical Mechanical Distribution Laws

PubMed Central

Wall, Frederick T.

1971-01-01

A new approach is presented for the derivation of statistical mechanical distribution laws. The derivations are accomplished by minimizing the Helmholtz free energy under constant temperature and volume, instead of maximizing the entropy under constant energy and volume. An alternative method involves stipulating equality of chemical potential, or equality of activity, for particles in different energy levels. This approach leads to a general statement of distribution laws applicable to all systems for which thermodynamic probabilities can be written. The methods also avoid use of the calculus of variations, Lagrangian multipliers, and Stirling's approximation for the factorial. The results are applied specifically to Boltzmann, Fermi-Dirac, and Bose-Einstein statistics. The special significance of chemical potential and activity is discussed for microscopic systems. PMID:16578712
Alternative derivations of the statistical mechanical distribution laws.

PubMed

Wall, F T

1971-08-01

A new approach is presented for the derivation of statistical mechanical distribution laws. The derivations are accomplished by minimizing the Helmholtz free energy under constant temperature and volume, instead of maximizing the entropy under constant energy and volume. An alternative method involves stipulating equality of chemical potential, or equality of activity, for particles in different energy levels. This approach leads to a general statement of distribution laws applicable to all systems for which thermodynamic probabilities can be written. The methods also avoid use of the calculus of variations, Lagrangian multipliers, and Stirling's approximation for the factorial. The results are applied specifically to Boltzmann, Fermi-Dirac, and Bose-Einstein statistics. The special significance of chemical potential and activity is discussed for microscopic systems.
Comparison of Deterministic and Probabilistic Radial Distribution Systems Load Flow

NASA Astrophysics Data System (ADS)

Gupta, Atma Ram; Kumar, Ashwani

2017-12-01

Distribution networks today face the challenge of meeting increased load demands from the industrial, commercial and residential sectors. The load pattern is highly dependent on consumer behavior and temporal factors such as season of the year, day of the week or time of the day. In deterministic radial distribution load flow studies the load is taken as constant, but in practice the load varies continually with a high degree of uncertainty, so there is a need to model a probable realistic load. Monte Carlo simulation is used to model the probable realistic load by generating random values of active and reactive power load from the mean and standard deviation of the load and solving a deterministic radial load flow with these values. The probabilistic solution is reconstructed from the deterministic data obtained for each simulation. The main contributions of the work are: (1) finding the impact of probable realistic ZIP load modeling on balanced radial distribution load flow; (2) finding the impact of probable realistic ZIP load modeling on unbalanced radial distribution load flow; and (3) comparing the voltage profile and losses with probable realistic ZIP load modeling for balanced and unbalanced radial distribution load flow.
Probability Distribution of Turbulent Kinetic Energy Dissipation Rate in Ocean: Observations and Approximations

NASA Astrophysics Data System (ADS)

Lozovatsky, I.; Fernando, H. J. S.; Planella-Morato, J.; Liu, Zhiyu; Lee, J.-H.; Jinadasa, S. U. P.

2017-10-01

The probability distribution of turbulent kinetic energy dissipation rate in stratified ocean usually deviates from the classic lognormal distribution that has been formulated for and often observed in unstratified homogeneous layers of atmospheric and oceanic turbulence. Our measurements of vertical profiles of micro-scale shear, collected in the East China Sea, northern Bay of Bengal, to the south and east of Sri Lanka, and in the Gulf Stream region, show that the probability distributions of the dissipation rate ε̃_r in the pycnoclines (r ~ 1.4 m is the averaging scale) can be successfully modeled by the Burr (type XII) probability distribution. In weakly stratified boundary layers, lognormal distribution of ε̃_r is preferable, although the Burr is an acceptable alternative. The skewness Sk_ε and the kurtosis K_ε of the dissipation rate appear to be well correlated in a wide range of Sk_ε and K_ε variability.
Overlapping clusters for distributed computation.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mirrokni, Vahab; Andersen, Reid; Gleich, David F.

2010-11-01

Scalable, distributed algorithms must address communication problems. We investigate overlapping clusters, or vertex partitions that intersect, for graph computations. This setup stores more of the graph than required but then affords the ease of implementation of vertex partitioned algorithms. Our hope is that this technique allows us to reduce communication in a computation on a distributed graph. The motivation above draws on recent work in communication avoiding algorithms. Mohiyuddin et al. (SC09) design a matrix-powers kernel that gives rise to an overlapping partition. Fritzsche et al. (CSC2009) develop an overlapping clustering for a Schwarz method. Both techniques extend an initial partitioning with overlap. Our procedure generates overlap directly. Indeed, Schwarz methods are commonly used to capitalize on overlap. Elsewhere, overlapping communities (Ahn et al., Nature 2009; Mishra et al., WAW2007) are now a popular model of structure in social networks. These have long been studied in statistics (Cole and Wishart, CompJ 1970). We present two types of results: (i) an estimated swapping probability ρ∞; and (ii) the communication volume of a parallel PageRank solution (link-following α = 0.85) using an additive Schwarz method. The volume ratio is the amount of extra storage for the overlap (2 means we store the graph twice). Below, as the ratio increases, the swapping probability and PageRank communication volume decrease.
Bayesian approach to inverse statistical mechanics.

PubMed

Habeck, Michael

2014-05-01

Inverse statistical mechanics aims to determine particle interactions from ensemble properties. This article looks at this inverse problem from a Bayesian perspective and discusses several statistical estimators to solve it. In addition, a sequential Monte Carlo algorithm is proposed that draws the interaction parameters from their posterior probability distribution. The posterior probability involves an intractable partition function that is estimated along with the interactions. The method is illustrated for inverse problems of varying complexity, including the estimation of a temperature, the inverse Ising problem, maximum entropy fitting, and the reconstruction of molecular interaction potentials.
Bayesian approach to inverse statistical mechanics

NASA Astrophysics Data System (ADS)

Habeck, Michael

2014-05-01

Inverse statistical mechanics aims to determine particle interactions from ensemble properties. This article looks at this inverse problem from a Bayesian perspective and discusses several statistical estimators to solve it. In addition, a sequential Monte Carlo algorithm is proposed that draws the interaction parameters from their posterior probability distribution. The posterior probability involves an intractable partition function that is estimated along with the interactions. The method is illustrated for inverse problems of varying complexity, including the estimation of a temperature, the inverse Ising problem, maximum entropy fitting, and the reconstruction of molecular interaction potentials.
New spatial upscaling methods for multi-point measurements: From normal to p-normal

NASA Astrophysics Data System (ADS)

Liu, Feng; Li, Xin

2017-12-01

Careful attention must be given to determining whether the geophysical variables of interest are normally distributed, since the assumption of a normal distribution may not accurately reflect the probability distribution of some variables. As a generalization of the normal distribution, the p-normal distribution and its corresponding maximum likelihood estimation (the least power estimation, LPE) were introduced in upscaling methods for multi-point measurements. Six methods, including three normal-based methods, i.e., arithmetic average, least square estimation, block kriging, and three p-normal-based methods, i.e., LPE, geostatistics LPE and inverse distance weighted LPE are compared in two types of experiments: a synthetic experiment to evaluate the performance of the upscaling methods in terms of accuracy, stability and robustness, and a real-world experiment to produce real-world upscaling estimates using soil moisture data obtained from multi-scale observations. The results show that the p-normal-based methods produced lower mean absolute errors and outperformed the other techniques due to their universality and robustness. We conclude that introducing appropriate statistical parameters into an upscaling strategy can substantially improve the estimation, especially if the raw measurements are disorganized; however, further investigation is required to determine which parameter is the most effective among variance, spatial correlation information and parameter p.
Evaluation of the Three Parameter Weibull Distribution Function for Predicting Fracture Probability in Composite Materials

DTIC Science & Technology

1978-03-01

for the risk of rupture for a unidirectionally laminated composite subjected to pure bending. (5D This equation can be simplified further by use of...C EVALUATION OF THE THREE PARAMETER WEIBULL DISTRIBUTION FUNCTION FOR PREDICTING FRACTURE PROBABILITY IN COMPOSITE MATERIALS. THESIS / AFIT/GAE...EVALUATION OF THE THREE PARAMETER WEIBULL DISTRIBUTION FUNCTION FOR PREDICTING FRACTURE PROBABILITY IN COMPOSITE MATERIALS THESIS Presented
Ridit Analysis for Cooper-Harper and Other Ordinal Ratings for Sparse Data - A Distance-based Approach

DTIC Science & Technology

2016-09-01

is to fit empirical Beta distributions to observed data, and then to use a randomization approach to make inferences on the difference between...a Ridit analysis on the often sparse data sets in many Flying Qualities applications. The method of this paper is to fit empirical Beta...One such measure is the discrete-probability-distribution version of the (squared) ‘Hellinger Distance’ (Yang & Le Cam, 2000) 2(, ) = 1
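The formula in the excerpt above is truncated, so the exact form used in the report cannot be recovered from it. As a hedged illustration, the standard squared Hellinger distance between two discrete probability distributions p and q is one minus the sum over categories of sqrt(p_i * q_i); the sketch below implements that textbook definition, with a hypothetical function name, and may differ in normalization from the report.

```python
import math

def squared_hellinger(p, q):
    """Squared Hellinger distance between two discrete distributions.

    Textbook definition: 1 - sum_i sqrt(p_i * q_i). Both inputs are assumed
    to be probability vectors over the same ordered categories.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of categories")
    bhattacharyya = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return 1.0 - bhattacharyya
```

The value is 0 for identical distributions and 1 for distributions with no category in common, which makes it a convenient bounded measure for comparing sparse ordinal rating distributions.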
Investigation into the Use of Normal and Half-Normal Plots for Interpreting Results from Screening Experiments.

DTIC Science & Technology

1987-03-25

by Lloyd (1952) using generalized least squares instead of ordinary least squares, and by Wilk, Gnanadesikan, and Freeny (1963) using a maximum...plot. The half-normal distribution is a special case of the gamma distribution proposed by Wilk, Gnanadesikan, and Huyett (1962). VARIATIONS ON THE... Gnanadesikan, R. Probability plotting methods for the analysis of data. Biometrika, 1968, 55, 1-17. This paper describes and discusses graphical techniques

Geospatial tools effectively estimate nonexceedance probabilities of daily streamflow at ungauged and intermittently gauged locations in Ohio

USGS Publications Warehouse

Farmer, William H.; Koltun, Greg

2017-01-01

Study region: The state of Ohio in the United States, a humid, continental climate. Study focus: The estimation of nonexceedance probabilities of daily streamflows as an alternative means of establishing the relative magnitudes of streamflows associated with hydrologic and water-quality observations. New hydrological insights for the region: Several methods for estimating nonexceedance probabilities of daily mean streamflows are explored, including single-index methodologies (nearest-neighboring index) and geospatial tools (kriging and topological kriging). These methods were evaluated by conducting leave-one-out cross-validations based on analyses of nearly 7 years of daily streamflow data from 79 unregulated streamgages in Ohio and neighboring states. The pooled, ordinary kriging model, with a median Nash–Sutcliffe performance of 0.87, was superior to the single-site index methods, though there was some bias in the tails of the probability distribution. Incorporating network structure through topological kriging did not improve performance. The pooled, ordinary kriging model was applied to 118 locations without systematic streamgaging across Ohio where instantaneous streamflow measurements had been made concurrent with water-quality sampling on at least 3 separate days. Spearman rank correlations between estimated nonexceedance probabilities and measured streamflows were high, with a median value of 0.76. In consideration of application, the degree of regulation in a set of sample sites helped to specify the streamgages required to implement kriging approaches successfully.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Goldstein, Adam; Connaughton, Valerie; Briggs, Michael S.

We present a method to estimate the jet opening angles of long duration gamma-ray bursts (GRBs) using the prompt gamma-ray energetics and an inversion of the Ghirlanda relation, which is a correlation between the time-integrated peak energy of the GRB prompt spectrum and the collimation-corrected energy in gamma-rays. The derived jet opening angles using this method and detailed assumptions match well with the corresponding inferred jet opening angles obtained when a break in the afterglow is observed. Furthermore, using a model of the predicted long GRB redshift probability distribution observable by the Fermi Gamma-ray Burst Monitor (GBM), we estimate the probability distributions for the jet opening angle and rest-frame energetics for a large sample of GBM GRBs for which the redshifts have not been observed. Previous studies have only used a handful of GRBs to estimate these properties due to the paucity of observed afterglow jet breaks, spectroscopic redshifts, and comprehensive prompt gamma-ray observations, and we potentially expand the number of GRBs that can be used in this analysis by more than an order of magnitude. In this analysis, we also present an inferred distribution of jet breaks which indicates that a large fraction of jet breaks are not observable with current instrumentation and observing strategies. We present simple parameterizations for the jet angle, energetics, and jet break distributions so that they may be used in future studies.
A study of the application of power-spectral methods of generalized harmonic analysis to gust loads on airplanes

NASA Technical Reports Server (NTRS)

Press, Harry; Mazelsky, Bernard

1954-01-01

The applicability of some results from the theory of generalized harmonic analysis (or power-spectral analysis) to the analysis of gust loads on airplanes in continuous rough air is examined. The general relations for linear systems between power spectrums of a random input disturbance and an output response are used to relate the spectrum of airplane load in rough air to the spectrum of atmospheric gust velocity. The power spectrum of loads is shown to provide a measure of the load intensity in terms of the standard deviation (root mean square) of the load distribution for an airplane in flight through continuous rough air. For the case of a load output having a normal distribution, which appears from experimental evidence to apply to homogeneous rough air, the standard deviation is shown to describe the probability distribution of loads or the proportion of total time that the load has given values. Thus, for an airplane in flight through homogeneous rough air, the probability distribution of loads may be determined from a power-spectral analysis. In order to illustrate the application of power-spectral analysis to gust-load analysis and to obtain an insight into the relations between loads and airplane gust-response characteristics, two selected series of calculations are presented. The results indicate that both methods of analysis yield results that are consistent to a first approximation.
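The chain from gust spectrum to load probability can be sketched numerically. This is an illustrative sketch rather than the report's procedure: it uses the standard linear-system relation (output spectrum = |H|² × input spectrum), assumes a one-sided spectrum whose integral over frequency equals the variance, and assumes a zero-mean normal load. Function names and arguments are hypothetical.

```python
import math
import numpy as np

def load_std_from_spectrum(freq, gust_psd, transfer):
    """Standard deviation of the load from the gust power spectrum.

    freq: frequency grid; gust_psd: one-sided gust spectrum on that grid;
    transfer: complex (or real) frequency response of load to gust velocity.
    """
    freq = np.asarray(freq, dtype=float)
    load_psd = np.abs(np.asarray(transfer)) ** 2 * np.asarray(gust_psd, dtype=float)
    # Trapezoidal integration of the load spectrum gives the load variance.
    variance = np.sum(0.5 * (load_psd[1:] + load_psd[:-1]) * np.diff(freq))
    return math.sqrt(variance)

def fraction_of_time_above(level, sigma):
    """Proportion of time a zero-mean normal load exceeds `level`."""
    return 0.5 * math.erfc(level / (sigma * math.sqrt(2.0)))
```

With the standard deviation in hand, the normal assumption turns a single spectral integral into the full distribution of load levels, which is the practical appeal of the power-spectral approach.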
Spatial distribution of sand fly species (Psychodidae: Phlebtominae), ecological niche, and climatic regionalization in zoonotic foci of cutaneous leishmaniasis, southwest of Iran.

PubMed

Ebrahimi, Sahar; Bordbar, Ali; Rastaghi, Ahmad R Esmaeili; Parvizi, Parviz

2016-06-01

Cutaneous leishmaniasis (CL) is a complex vector-borne disease caused by Leishmania parasites that are transmitted by the bite of several species of infected female phlebotomine sand flies. Monthly factor analysis of climatic variables indicated fundamental variables. Principal component-based regionalization was used for recognition of climatic zones using a clustering integrated method that identified five climatic zones based on factor analysis. To investigate spatial distribution of the sand fly species, the kriging method was used as an advanced geostatistical procedure in the ArcGIS modeling system that is beneficial to design measurement plans and to predict the transmission cycle in various regions of Khuzestan province, southwest of Iran. However, more than an 80% probability of P. papatasi was observed in rainy and temperate bio-climatic zones with a high potential of CL transmission. Finding P. sergenti revealed the probability of transmission and distribution patterns of a non-native vector of CL in related zones. These findings could be used as models indicating climatic zones and environmental variables connected to sand fly presence and vector distribution. Furthermore, this information is appropriate for future research efforts into the ecology of Phlebotomine sand flies and for the prevention of CL vector transmission as a public health priority. © 2016 The Society for Vector Ecology.
Probabilistic approach to lysozyme crystal nucleation kinetics.

PubMed

Dimitrov, Ivaylo L; Hodzhaoglu, Feyzim V; Koleva, Dobryana P

2015-09-01

Nucleation of lysozyme crystals in quiescent solutions at a regime of progressive nucleation is investigated under an optical microscope at conditions of constant supersaturation. A method based on the stochastic nature of crystal nucleation and using discrete time sampling of small solution volumes for the presence or absence of detectable crystals is developed. It allows probabilities for crystal detection to be experimentally estimated. One hundred single samplings were used for each probability determination for 18 time intervals and six lysozyme concentrations. Fitting of a particular probability function to experimentally obtained data made possible the direct evaluation of stationary rates for lysozyme crystal nucleation, the time for growth of supernuclei to a detectable size and probability distribution of nucleation times. Obtained stationary nucleation rates were then used for the calculation of other nucleation parameters, such as the kinetic nucleation factor, nucleus size, work for nucleus formation and effective specific surface energy of the nucleus. The experimental method itself is simple and adaptable and can be used for crystal nucleation studies of arbitrary soluble substances with known solubility at particular solution conditions.
Comparison of Bootstrapping and Markov Chain Monte Carlo for Copula Analysis of Hydrological Droughts

NASA Astrophysics Data System (ADS)

Yang, P.; Ng, T. L.; Yang, W.

2015-12-01

Effective water resources management depends on the reliable estimation of the uncertainty of drought events. Confidence intervals (CIs) are commonly applied to quantify this uncertainty. A CI seeks to be at the minimal length necessary to cover the true value of the estimated variable with the desired probability. In drought analysis where two or more variables (e.g., duration and severity) are often used to describe a drought, copulas have been found suitable for representing the joint probability behavior of these variables. However, the comprehensive assessment of the parameter uncertainties of copulas of droughts has been largely ignored, and the few studies that have recognized this issue have not explicitly compared the various methods to produce the best CIs. Thus, the objective of this study is to compare the CIs generated using two widely applied uncertainty estimation methods, bootstrapping and Markov Chain Monte Carlo (MCMC). To achieve this objective, (1) the marginal distributions lognormal, Gamma, and Generalized Extreme Value, and the copula functions Clayton, Frank, and Plackett are selected to construct joint probability functions of two drought-related variables; (2) the resulting joint functions are then fitted to 200 sets of simulated realizations of drought events with known distribution and extreme parameters; and (3) from there, using bootstrapping and MCMC, CIs of the parameters are generated and compared. The effect of an informative prior on the CIs generated by MCMC is also evaluated. CIs are produced for different sample sizes (50, 100, and 200) of the simulated drought events for fitting the joint probability functions. Preliminary results assuming lognormal marginal distributions and the Clayton copula function suggest that for cases with small or medium sample sizes (~50-100), MCMC is the superior method if an informative prior exists. Where an informative prior is unavailable, for small sample sizes (~50), both bootstrapping and MCMC yield the same level of performance, and for medium sample sizes (~100), bootstrapping is better. For cases with a large sample size (~200), there is little difference between the CIs generated using bootstrapping and MCMC regardless of whether or not an informative prior exists.
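The bootstrapping side of this comparison can be illustrated with a percentile bootstrap confidence interval. The sketch below is generic and hedged: it resamples a one-dimensional sample and applies an arbitrary statistic, whereas the study refits copula parameters to drought variables; the function name, the default of 2000 resamples, and the percentile construction are assumptions made for illustration.

```python
import numpy as np

def bootstrap_percentile_ci(sample, statistic, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for `statistic(sample)`."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        # Resample with replacement, keeping the original sample size.
        resample = sample[rng.integers(0, n, size=n)]
        estimates[b] = statistic(resample)
    alpha = 1.0 - level
    lower, upper = np.quantile(estimates, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lower), float(upper)
```

For a copula parameter, `statistic` would be replaced by a routine that refits the joint model to each resampled set of (duration, severity) pairs, which requires resampling rows rather than single values.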
A Deterministic Annealing Approach to Clustering AIRS Data

NASA Technical Reports Server (NTRS)

Guillaume, Alexandre; Braverman, Amy; Ruzmaikin, Alexander

2012-01-01

We will examine the validity of means and standard deviations as a basis for climate data products. We will explore the conditions under which these two simple statistics are inadequate summaries of the underlying empirical probability distributions by contrasting them with a nonparametric method called the Deterministic Annealing technique.
A procedure for estimating the frequency distribution of CO levels in the micro-region of a highway.

DOT National Transportation Integrated Search

1979-01-01

This report demonstrates that the probability of violating a "not to be exceeded more than once per year", one-hour air quality standard can be bounded from above. This result represents a significant improvement over previous methods of ascertaining...
A Bayesian Method for Managing Uncertainties Relating to Distributed Multistatic Sensor Search

DTIC Science & Technology

2006-07-01

before-detect process. There will also be an increased probability of high signal-to-noise ratio (SNR) detections associated with specular and near...and high target strength and high Doppler opportunities give rise to the expectation of an increased number of detections that could feed a track
IMPLICATIONS OF USING ROBUST BAYESIAN ANALYSIS TO REPRESENT DIVERSE SOURCES OF UNCERTAINTY IN INTEGRATED ASSESSMENT

EPA Science Inventory

In our previous research, we showed that robust Bayesian methods can be used in environmental modeling to define a set of probability distributions for key parameters that captures the effects of expert disagreement, ambiguity, or ignorance. This entire set can then be update...
Bivariate extreme value distributions

NASA Technical Reports Server (NTRS)

Elshamy, M.

1992-01-01

In certain engineering applications, such as those occurring in the analyses of ascent structural loads for the Space Transportation System (STS), some of the load variables have a lower bound of zero. Thus, the need for practical models of bivariate extreme value probability distribution functions with lower limits was identified. We discuss the Gumbel models and present practical forms of bivariate extreme probability distributions of Weibull and Frechet types with two parameters. Bivariate extreme value probability distribution functions can be expressed in terms of the marginal extremal distributions and a 'dependence' function subject to certain analytical conditions. Properties of such bivariate extreme distributions, sums and differences of paired extremals, as well as the corresponding forms of conditional distributions, are discussed. Practical estimation techniques are also given.
A global logrank test for adaptive treatment strategies based on observational studies.

PubMed

Li, Zhiguo; Valenstein, Marcia; Pfeiffer, Paul; Ganoczy, Dara

2014-02-28

In studying adaptive treatment strategies, a natural question that is of paramount interest is whether there is any significant difference among all possible treatment strategies. When the outcome variable of interest is time-to-event, we propose an inverse probability weighted logrank test for testing the equivalence of a fixed set of pre-specified adaptive treatment strategies based on data from an observational study. The weights take into account both the possible selection bias in an observational study and the fact that the same subject may be consistent with more than one treatment strategy. The asymptotic distribution of the weighted logrank statistic under the null hypothesis is obtained. We show that, in an observational study where the treatment selection probabilities need to be estimated, the estimation of these probabilities does not have an effect on the asymptotic distribution of the weighted logrank statistic, as long as the estimation of the parameters in the models for these probabilities is √n-consistent. Finite sample performance of the test is assessed via a simulation study. We also show in the simulation that the test can be pretty robust to misspecification of the models for the probabilities of treatment selection. The method is applied to analyze data on antidepressant adherence time from an observational database maintained at the Department of Veterans Affairs' Serious Mental Illness Treatment Research and Evaluation Center. Copyright © 2013 John Wiley & Sons, Ltd.
A nonparametric method to generate synthetic populations to adjust for complex sampling design features.

PubMed

Dong, Qi; Elliott, Michael R; Raghunathan, Trivellore E

2014-06-01

Outside of the survey sampling literature, samples are often assumed to be generated by a simple random sampling process that produces independent and identically distributed (IID) samples. Many statistical methods are developed largely in this IID world. Application of these methods to data from complex sample surveys without making allowance for the survey design features can lead to erroneous inferences. Hence, much time and effort have been devoted to developing statistical methods to analyze complex survey data and account for the sample design. This issue is particularly important when generating synthetic populations using finite population Bayesian inference, as is often done in missing data or disclosure risk settings, or when combining data from multiple surveys. By extending previous work in the finite population Bayesian bootstrap literature, we propose a method to generate synthetic populations from a posterior predictive distribution in a fashion that inverts the complex sampling design features and generates simple random samples from a superpopulation point of view, making adjustments to the complex data so that they can be analyzed as simple random samples. We consider a simulation study with a stratified, clustered unequal-probability of selection sample design, and use the proposed nonparametric method to generate synthetic populations for the 2006 National Health Interview Survey (NHIS), and the Medical Expenditure Panel Survey (MEPS), which are stratified, clustered unequal-probability of selection sample designs.
A nonparametric method to generate synthetic populations to adjust for complex sampling design features

PubMed Central

Dong, Qi; Elliott, Michael R.; Raghunathan, Trivellore E.

2017-01-01

Outside of the survey sampling literature, samples are often assumed to be generated by a simple random sampling process that produces independent and identically distributed (IID) samples. Many statistical methods are developed largely in this IID world. Application of these methods to data from complex sample surveys without making allowance for the survey design features can lead to erroneous inferences. Hence, much time and effort have been devoted to developing statistical methods that analyze complex survey data and account for the sample design. This issue is particularly important when generating synthetic populations using finite population Bayesian inference, as is often done in missing data or disclosure risk settings, or when combining data from multiple surveys. By extending previous work in the finite population Bayesian bootstrap literature, we propose a method to generate synthetic populations from a posterior predictive distribution in a fashion that inverts the complex sampling design features and generates simple random samples from a superpopulation point of view, adjusting the complex data so that they can be analyzed as simple random samples. We consider a simulation study with a stratified, clustered, unequal-probability-of-selection sample design, and use the proposed nonparametric method to generate synthetic populations for the 2006 National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS), which are stratified, clustered, unequal-probability-of-selection sample designs. PMID:29200608
Eruption probabilities for the Lassen Volcanic Center and regional volcanism, northern California, and probabilities for large explosive eruptions in the Cascade Range

USGS Publications Warehouse

Nathenson, Manuel; Clynne, Michael A.; Muffler, L.J. Patrick

2012-01-01

Chronologies for eruptive activity of the Lassen Volcanic Center and for eruptions from the regional mafic vents in the surrounding area of the Lassen segment of the Cascade Range are here used to estimate probabilities of future eruptions. For the regional mafic volcanism, the ages of many vents are known only within broad ranges, and two models are developed that should bracket the actual eruptive ages. These chronologies are used with exponential, Weibull, and mixed-exponential probability distributions to match the data for time intervals between eruptions. For the Lassen Volcanic Center, the probability of an eruption in the next year is 1.4×10⁻⁴ for the exponential distribution and 2.3×10⁻⁴ for the mixed exponential distribution. For the regional mafic vents, the exponential distribution gives a probability of an eruption in the next year of 6.5×10⁻⁴, but the mixed exponential distribution indicates that the current probability, 12,000 years after the last event, could be significantly lower. For the exponential distribution, the highest probability is for an eruption from a regional mafic vent. Data on areas and volumes of lava flows and domes of the Lassen Volcanic Center and of eruptions from the regional mafic vents provide constraints on the probable sizes of future eruptions. Probabilities of lava-flow coverage are similar for the Lassen Volcanic Center and for regional mafic vents, whereas the probable eruptive volumes for the mafic vents are generally smaller. Data have been compiled for large explosive eruptions (≳5 km³ in deposit volume) in the Cascade Range during the past 1.2 m.y. in order to estimate probabilities of eruption. For erupted volumes ≳5 km³, the rate of occurrence since 13.6 ka is much higher than for the entire period, and we use these data to calculate the annual probability of a large eruption at 4.6×10⁻⁴. For erupted volumes ≥10 km³, the rate of occurrence has been reasonably constant from 630 ka to the present, giving more confidence in the estimate, and we use those data to calculate the annual probability of a large eruption in the next year at 1.4×10⁻⁵.
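As a rough illustration of how an annual eruption probability relates to an eruption rate under the exponential (Poisson-process) model, the following sketch converts between the two. The function names are hypothetical, and the rate is back-calculated from the quoted 1.4×10⁻⁴ figure rather than taken from the authors' chronology. A Weibull conditional probability is included only to show how elapsed time can enter the calculation; with shape 1 it reduces to the exponential case.

```python
import math


def exponential_prob(rate_per_year, window_years=1.0):
    """P(at least one event in the window) for a constant-rate Poisson process."""
    return 1.0 - math.exp(-rate_per_year * window_years)


def weibull_conditional_prob(elapsed, window, shape, scale):
    """P(event in (elapsed, elapsed + window] | no event up to elapsed)."""
    def survival(t):
        return math.exp(-((t / scale) ** shape))
    return 1.0 - survival(elapsed + window) / survival(elapsed)


# Back out the rate implied by the quoted annual probability (illustrative only).
annual_p = 1.4e-4
rate = -math.log(1.0 - annual_p)   # events per year
mean_interval = 1.0 / rate         # mean time between events, in years

print(f"implied mean recurrence interval: {mean_interval:.0f} years")
print(f"30-year probability: {exponential_prob(rate, 30.0):.4f}")
```

Under the exponential model the probability does not depend on the time since the last eruption, which is why the abstract contrasts it with the mixed-exponential result 12,000 years after the last event.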
Steady state, relaxation and first-passage properties of a run-and-tumble particle in one-dimension

NASA Astrophysics Data System (ADS)

Malakar, Kanaya; Jemseena, V.; Kundu, Anupam; Vijay Kumar, K.; Sabhapandit, Sanjib; Majumdar, Satya N.; Redner, S.; Dhar, Abhishek

2018-04-01

We investigate the motion of a run-and-tumble particle (RTP) in one dimension. We find the exact probability distribution of the particle with and without diffusion on the infinite line, as well as in a finite interval. In the infinite domain, this probability distribution approaches a Gaussian form in the long-time limit, as in the case of a regular Brownian particle. At intermediate times, this distribution exhibits unexpected multi-modal forms. In a finite domain, the probability distribution reaches a steady-state form with peaks at the boundaries, in contrast to a Brownian particle. We also study the relaxation to the steady-state analytically. Finally we compute the survival probability of the RTP in a semi-infinite domain with an absorbing boundary condition at the origin. In the finite interval, we compute the exit probability and the associated exit times. We provide numerical verification of our analytical results.
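The kind of numerical check mentioned above can be sketched with a simple discrete-time simulation of a run-and-tumble particle on the infinite line without thermal diffusion. This is a minimal sketch, not the authors' code: the function name, the time-stepping scheme, and the parameter values are assumptions, and the tumble rate is taken to be the rate at which the velocity reverses sign. For that model the mean squared displacement from a symmetric start is the standard telegraph-process result (v²/γ)[t − (1 − e^(−2γt))/(2γ)], which approaches diffusive growth at long times.

```python
import numpy as np


def simulate_rtp(n_particles, t_max, dt, speed, tumble_rate, seed=0):
    """Positions at time t_max of independent run-and-tumble particles.

    Each particle moves at +speed or -speed and reverses direction with
    probability tumble_rate * dt in each time step (an Euler-type scheme).
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_particles)
    direction = rng.choice([-1.0, 1.0], size=n_particles)
    n_steps = int(round(t_max / dt))
    for _ in range(n_steps):
        x += speed * direction * dt
        flip = rng.random(n_particles) < tumble_rate * dt
        direction[flip] *= -1.0
    return x


def rtp_variance(t, speed, tumble_rate):
    """Exact mean squared displacement for a symmetric initial direction."""
    g = tumble_rate
    return (speed**2 / g) * (t - (1.0 - np.exp(-2.0 * g * t)) / (2.0 * g))


positions = simulate_rtp(n_particles=10_000, t_max=20.0, dt=0.01,
                         speed=1.0, tumble_rate=1.0)
print("simulated variance:", positions.var())
print("analytical variance:", rtp_variance(20.0, 1.0, 1.0))
```

The finite time step introduces a small bias, so agreement with the analytical value should be expected only to within a few percent at these settings.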
DOE Office of Scientific and Technical Information (OSTI.GOV)

La Russa, D

Purpose: The purpose of this project is to develop a robust method of parameter estimation for a Poisson-based TCP model using Bayesian inference. Methods: Bayesian inference was performed using the PyMC3 probabilistic programming framework written in Python. A Poisson-based TCP regression model that accounts for clonogen proliferation was fit to observed rates of local relapse as a function of equivalent dose in 2 Gy fractions for a population of 623 stage-I non-small-cell lung cancer patients. The Slice Markov Chain Monte Carlo sampling algorithm was used to sample the posterior distributions, and was initiated using the maximum of the posterior distributions found by optimization. The calculation of TCP with each sample step required integration over the free parameter α, which was performed using an adaptive 24-point Gauss-Legendre quadrature. Convergence was verified via inspection of the trace plot and posterior distribution for each of the fit parameters, as well as with comparisons of the most probable parameter values with their respective maximum likelihood estimates. Results: Posterior distributions for α, the standard deviation of α (σ), the average tumour cell-doubling time (Td), and the repopulation delay time (Tk), were generated assuming α/β = 10 Gy, and a fixed clonogen density of 10⁷ cm⁻³. Posterior predictive plots generated from samples from these posterior distributions are in excellent agreement with the observed rates of local relapse used in the Bayesian inference. The most probable values of the model parameters also agree well with maximum likelihood estimates. Conclusion: A robust method of performing Bayesian inference of TCP data using a complex TCP model has been established.
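The integration over α described above can be illustrated with a fixed 24-point Gauss-Legendre rule from NumPy. This is a sketch under several assumptions that are not stated in the abstract: α is taken to be Gaussian-distributed across the population and truncated to ±5σ (and to non-negative values), clonogen proliferation (Td, Tk) is ignored, the quadrature is fixed rather than adaptive, and the function names and parameter values are hypothetical. It does not use PyMC3 and is not the author's model.

```python
import numpy as np


def poisson_tcp(dose, alpha, n_clonogens):
    """Poisson TCP for one radiosensitivity alpha, ignoring proliferation."""
    return np.exp(-n_clonogens * np.exp(-alpha * dose))


def population_tcp(dose, alpha_mean, alpha_sd, n_clonogens,
                   n_nodes=24, width=5.0):
    """TCP averaged over a truncated Gaussian distribution of alpha."""
    lo = max(alpha_mean - width * alpha_sd, 0.0)
    hi = alpha_mean + width * alpha_sd
    # Gauss-Legendre nodes and weights on [-1, 1], mapped onto [lo, hi].
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    alpha = 0.5 * (hi - lo) * nodes + 0.5 * (hi + lo)
    pdf = (np.exp(-0.5 * ((alpha - alpha_mean) / alpha_sd) ** 2)
           / (alpha_sd * np.sqrt(2.0 * np.pi)))
    integral = 0.5 * (hi - lo) * np.sum(weights * pdf * poisson_tcp(dose, alpha, n_clonogens))
    # Renormalize because the Gaussian is truncated to [lo, hi].
    norm = 0.5 * (hi - lo) * np.sum(weights * pdf)
    return integral / norm


for dose in (40.0, 60.0, 80.0):
    print(dose, population_tcp(dose, alpha_mean=0.3, alpha_sd=0.05, n_clonogens=1e7))
```

In a Bayesian fit, a function of this kind would be evaluated inside the likelihood at every sampling step, which is why a cheap fixed-order rule is attractive.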
Sampling in health geography: reconciling geographical objectives and probabilistic methods. An example of a health survey in Vientiane (Lao PDR)

PubMed Central

Vallée, Julie; Souris, Marc; Fournet, Florence; Bochaton, Audrey; Mobillion, Virginie; Peyronnie, Karine; Salem, Gérard

2007-01-01

Background: Geographical objectives and probabilistic methods are difficult to reconcile in a single health survey. Probabilistic methods focus on individuals to provide estimates of a variable's prevalence with a certain precision, while geographical approaches emphasise the selection of specific areas to study interactions between spatial characteristics and health outcomes. A sample selected from a small number of specific areas creates statistical challenges: the observations are not independent at the local level, and this results in poor statistical validity at the global level. Therefore, it is difficult to construct a sample that is appropriate for both geographical and probability methods. Methods: We used a two-stage selection procedure with a first non-random stage of selection of clusters. Instead of randomly selecting clusters, we deliberately chose a group of clusters, which as a whole would contain all the variation in health measures in the population. As there was no health information available before the survey, we selected a priori determinants that can influence the spatial homogeneity of the health characteristics. This method yields a distribution of variables in the sample that closely resembles that in the overall population, something that cannot be guaranteed with randomly-selected clusters, especially if the number of selected clusters is small. In this way, we were able to survey specific areas while minimising design effects and maximising statistical precision. Application: We applied this strategy in a health survey carried out in Vientiane, Lao People's Democratic Republic. We selected well-known health determinants with unequal spatial distribution within the city: nationality and literacy. We deliberately selected a combination of clusters whose distribution of nationality and literacy is similar to the distribution in the general population. Conclusion: This paper describes the conceptual reasoning behind the construction of the survey sample and shows that it can be advantageous to choose clusters using reasoned hypotheses, based on both probability and geographical approaches, in contrast to a conventional, random cluster selection strategy. PMID:17543100
Delineating Facies Spatial Distribution by Integrating Ensemble Data Assimilation and Indicator Geostatistics with Level Set Transformation.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hammond, Glenn Edward; Song, Xuehang; Ye, Ming

A new approach is developed to delineate the spatial distribution of discrete facies (geological units that have unique distributions of hydraulic, physical, and/or chemical properties) conditioned not only on direct data (measurements directly related to facies properties, e.g., grain size distribution obtained from borehole samples) but also on indirect data (observations indirectly related to facies distribution, e.g., hydraulic head and tracer concentration). Our method integrates for the first time ensemble data assimilation with traditional transition probability-based geostatistics. The concept of level set is introduced to build shape parameterization that allows transformation between discrete facies indicators and continuous random variables. The spatial structure of different facies is simulated by indicator models using conditioning points selected adaptively during the iterative process of data assimilation. To evaluate the new method, a two-dimensional semi-synthetic example is designed to estimate the spatial distribution and permeability of two distinct facies from transient head data induced by pumping tests. The example demonstrates that our new method adequately captures the spatial pattern of facies distribution by imposing spatial continuity through conditioning points. The new method also reproduces the overall response in hydraulic head field with better accuracy compared to data assimilation with no constraints on spatial continuity on facies.
Fitness Probability Distribution of Bit-Flip Mutation.

PubMed

Chicano, Francisco; Sutton, Andrew M; Whitley, L Darrell; Alba, Enrique

2015-01-01

Bit-flip mutation is a common mutation operator for evolutionary algorithms applied to optimize functions over binary strings. In this paper, we develop results from the theory of landscapes and Krawtchouk polynomials to exactly compute the probability distribution of fitness values of a binary string undergoing uniform bit-flip mutation. We prove that this probability distribution can be expressed as a polynomial in p, the probability of flipping each bit. We analyze these polynomials and provide closed-form expressions for an easy linear problem (Onemax), and an NP-hard problem, MAX-SAT. We also discuss a connection of the results with runtime analysis.
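For the Onemax case, the fitness distribution after mutation can be computed exactly by direct enumeration, which gives a concrete feel for the result even though it does not use the landscape theory or Krawtchouk polynomials developed in the paper. The sketch below assumes Onemax fitness is the number of one-bits; the function names are hypothetical. Each returned probability is a sum of products of powers of p and (1 − p), so it is a polynomial in p, consistent with the statement above.

```python
from math import comb


def binomial_pmf(n, p):
    """Probabilities of 0..n successes in n independent trials."""
    return [comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(n + 1)]


def onemax_mutation_distribution(n, ones, p):
    """Exact Onemax fitness distribution after uniform bit-flip mutation.

    n    : string length
    ones : number of one-bits before mutation (the current fitness)
    p    : probability of flipping each bit independently
    Returns a list dist with dist[f] = P(fitness after mutation == f).
    """
    lost = binomial_pmf(ones, p)          # one-bits flipped to zero
    gained = binomial_pmf(n - ones, p)    # zero-bits flipped to one
    dist = [0.0] * (n + 1)
    for i, p_lost in enumerate(lost):
        for j, p_gained in enumerate(gained):
            dist[ones - i + j] += p_lost * p_gained
    return dist


dist = onemax_mutation_distribution(n=10, ones=6, p=0.1)
mean_fitness = sum(f * q for f, q in enumerate(dist))
print("P(fitness improves):", sum(dist[7:]))
print("expected fitness after mutation:", mean_fitness)
```

The expected fitness after mutation is ones·(1 − p) + (n − ones)·p, which provides a quick sanity check on the enumerated distribution.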
