Sample records for multivariate probability distributions

  1. Implementation of the Iterative Proportion Fitting Algorithm for Geostatistical Facies Modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li Yupeng, E-mail: yupeng@ualberta.ca; Deutsch, Clayton V.

    2012-06-15

    In geostatistics, most stochastic algorithms for the simulation of categorical variables such as facies or rock types require a conditional probability distribution. The multivariate probability distribution of all the grouped locations, including the unsampled location, permits calculation of the conditional probability directly from its definition. In this article, the iterative proportion fitting (IPF) algorithm is implemented to infer this multivariate probability. Using the IPF algorithm, the multivariate probability is obtained by iterative modification of an initial estimate, with lower-order bivariate probabilities imposed as constraints. The imposed bivariate marginal probabilities are inferred from profiles along drill holes or wells. In the IPF process, a sparse matrix is used to calculate the marginal probabilities from the multivariate probability, which makes the iterative fitting more tractable and practical. The algorithm can be extended to higher-order marginal probability constraints, as used in multiple-point statistics. The theoretical framework is developed and illustrated with estimation and simulation examples.
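
    As an illustration of the fitting step described above, the following is a minimal sketch of iterative proportional fitting for a trivariate categorical probability constrained by its three bivariate marginals; the category count, the synthetic target marginals, and all variable names are invented for illustration and are not taken from the article.

```python
import numpy as np

# Minimal sketch of iterative proportional fitting (IPF) for a trivariate
# categorical probability constrained by its three bivariate marginals.
# The category count and target marginals are synthetic, for illustration.

rng = np.random.default_rng(0)
K = 3                                       # number of facies categories
p = rng.random((K, K, K))
p /= p.sum()                                # initial multivariate estimate

true = rng.random((K, K, K))                # a consistent "true" table from
true /= true.sum()                          # which target marginals are taken
targets = {(0, 1): true.sum(axis=2),
           (0, 2): true.sum(axis=1),
           (1, 2): true.sum(axis=0)}

for _ in range(100):                        # sweep the marginal constraints
    for (i, j), target in targets.items():
        axis = ({0, 1, 2} - {i, j}).pop()   # axis summed out for this pair
        marg = p.sum(axis=axis)             # current bivariate marginal
        p *= np.expand_dims(target / marg, axis=axis)  # proportional update

# After convergence the fitted marginals match the imposed constraints
print(np.abs(p.sum(axis=2) - targets[(0, 1)]).max())
```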

  2. Technology-enhanced Interactive Teaching of Marginal, Joint and Conditional Probabilities: The Special Case of Bivariate Normal Distribution

    PubMed Central

    Dinov, Ivo D.; Kamino, Scott; Bhakhrani, Bilal; Christou, Nicolas

    2014-01-01

    Data analysis requires subtle probability reasoning to answer questions like: What is the chance of event A occurring, given that event B was observed? This generic question arises in discussions of many intriguing scientific questions, such as: What is the probability that an adolescent weighs between 120 and 140 pounds given that they are of average height? and What is the probability of (monetary) inflation exceeding 4% and the housing price index being below 110? To address such problems, learning some applied, theoretical or cross-disciplinary probability concepts is necessary. Teaching such courses can be improved by utilizing modern information technology resources. Students' understanding of multivariate distributions, conditional probabilities, correlation and causation can be significantly strengthened by employing interactive web-based science educational resources. Independent of the type of probability course (e.g. majors, minors or service probability course; rigorous measure-theoretic, applied or statistics course), student motivation, learning experiences and knowledge retention may be enhanced by blending modern technological tools with the classical conceptual pedagogical models. We have designed, implemented and disseminated a portable open-source web-application for teaching multivariate distributions, marginal, joint and conditional probabilities using the special case of the bivariate Normal distribution. A real adolescent height and weight dataset is used to demonstrate the classroom utilization of the new web-application to address problems of parameter estimation and univariate and multivariate inference. PMID:25419016
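
    The conditional-probability example quoted in this abstract can be computed directly from the conditional distribution of a bivariate normal. The sketch below assumes invented height/weight parameters and correlation; it illustrates the computation behind the applet rather than the applet's actual code.

```python
from scipy.stats import norm

# Minimal sketch: P(120 <= weight <= 140 | height = mean height) under a
# bivariate normal model. All parameter values are invented for illustration.

mu_h, sd_h = 65.0, 3.5        # height (inches), hypothetical
mu_w, sd_w = 130.0, 20.0      # weight (pounds), hypothetical
rho = 0.6                     # height-weight correlation, hypothetical

h = mu_h                      # condition on average height
cond_mean = mu_w + rho * sd_w * (h - mu_h) / sd_h   # conditional mean
cond_sd = sd_w * (1.0 - rho**2) ** 0.5              # conditional std dev

p = norm.cdf(140, cond_mean, cond_sd) - norm.cdf(120, cond_mean, cond_sd)
print(f"P(120 <= W <= 140 | H = {h}) = {p:.3f}")
```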

  3. Technology-enhanced Interactive Teaching of Marginal, Joint and Conditional Probabilities: The Special Case of Bivariate Normal Distribution.

    PubMed

    Dinov, Ivo D; Kamino, Scott; Bhakhrani, Bilal; Christou, Nicolas

    2013-01-01

    Data analysis requires subtle probability reasoning to answer questions like: What is the chance of event A occurring, given that event B was observed? This generic question arises in discussions of many intriguing scientific questions, such as: What is the probability that an adolescent weighs between 120 and 140 pounds given that they are of average height? and What is the probability of (monetary) inflation exceeding 4% and the housing price index being below 110? To address such problems, learning some applied, theoretical or cross-disciplinary probability concepts is necessary. Teaching such courses can be improved by utilizing modern information technology resources. Students' understanding of multivariate distributions, conditional probabilities, correlation and causation can be significantly strengthened by employing interactive web-based science educational resources. Independent of the type of probability course (e.g. majors, minors or service probability course; rigorous measure-theoretic, applied or statistics course), student motivation, learning experiences and knowledge retention may be enhanced by blending modern technological tools with the classical conceptual pedagogical models. We have designed, implemented and disseminated a portable open-source web-application for teaching multivariate distributions, marginal, joint and conditional probabilities using the special case of the bivariate Normal distribution. A real adolescent height and weight dataset is used to demonstrate the classroom utilization of the new web-application to address problems of parameter estimation and univariate and multivariate inference.

  4. Multivariate η-μ fading distribution with arbitrary correlation model

    NASA Astrophysics Data System (ADS)

    Ghareeb, Ibrahim; Atiani, Amani

    2018-03-01

    An extensive analysis of the multivariate η-μ distribution with arbitrary correlation is presented, where novel analytical expressions for the multivariate probability density function, cumulative distribution function and moment generating function (MGF) of arbitrarily correlated and not necessarily identically distributed η-μ power random variables are derived. Also, this paper provides an exact-form expression for the MGF of the instantaneous signal-to-noise ratio at the combiner output in a diversity reception system with maximal-ratio combining and post-detection equal-gain combining operating in slow, frequency-nonselective, arbitrarily correlated, not necessarily identically distributed η-μ fading channels. The average bit error probability of differentially detected quadrature phase shift keying signals with post-detection diversity reception over arbitrarily correlated η-μ fading channels with not necessarily identical fading parameters is determined using the MGF-based approach. The effect of fading correlation between diversity branches, fading severity parameters and diversity level is studied.

  5. Approximating Multivariate Normal Orthant Probabilities. ONR Technical Report. [Biometric Lab Report No. 90-1.]

    ERIC Educational Resources Information Center

    Gibbons, Robert D.; And Others

    The probability integral of the multivariate normal distribution (ND) has received considerable attention since W. F. Sheppard's (1900) and K. Pearson's (1901) seminal work on the bivariate ND. This paper evaluates the formula that represents the "n x n" correlation matrix of the χ_i and the standardized multivariate…

  6. Vector wind and vector wind shear models 0 to 27 km altitude for Cape Kennedy, Florida, and Vandenberg AFB, California

    NASA Technical Reports Server (NTRS)

    Smith, O. E.

    1976-01-01

    The techniques used to derive several statistical wind models are presented. The techniques follow from the properties of the multivariate normal probability function. Assuming that the winds are bivariate normally distributed: (1) the wind components and conditional wind components are univariate normally distributed, (2) the wind speed is Rayleigh distributed, (3) the conditional distribution of wind speed given a wind direction is Rayleigh distributed, and (4) the frequency of wind direction can be derived. All of these distributions are derived from the five sample parameters of the bivariate normal distribution of wind. By further assuming that the winds at two altitudes are quadrivariate normally distributed, the vector wind shear is bivariate normally distributed and the modulus of the vector wind shear is Rayleigh distributed. The conditional probability of wind component shears given a wind component is normally distributed. Examples of these and other properties of the multivariate normal probability distribution function as applied to wind data samples from Cape Kennedy, Florida, and Vandenberg AFB, California, are given. A technique to develop a synthetic vector wind profile model of interest for aerospace vehicle applications is presented.
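
    Property (2) above (wind speed is Rayleigh distributed) holds in the special case of zero-mean, equal-variance, uncorrelated components of the bivariate normal. A minimal simulation check under that assumption, with an arbitrary component standard deviation:

```python
import numpy as np
from scipy.stats import kstest

# Simulation check of property (2): with zero-mean, equal-variance,
# uncorrelated bivariate normal components, the wind speed is Rayleigh
# distributed. The component standard deviation is arbitrary.

rng = np.random.default_rng(1)
sigma = 8.0                             # m/s, hypothetical component std dev
u = rng.normal(0.0, sigma, 100_000)     # zonal wind component
v = rng.normal(0.0, sigma, 100_000)     # meridional wind component
speed = np.hypot(u, v)                  # modulus of the wind vector

# Kolmogorov-Smirnov test against Rayleigh(scale=sigma): large p-value expected
print(kstest(speed, "rayleigh", args=(0, sigma)))
```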

  7. Positive phase space distributions and uncertainty relations

    NASA Technical Reports Server (NTRS)

    Kruger, Jan

    1993-01-01

    In contrast to a widespread belief, Wigner's theorem allows the construction of true joint probabilities in phase space for distributions describing the object system as well as for distributions depending on the measurement apparatus. The fundamental role of Heisenberg's uncertainty relations in Schroedinger form (including correlations) is pointed out for these two possible interpretations of joint probability distributions. Hence, in order that a multivariate normal probability distribution in phase space may correspond to a Wigner distribution of a pure or a mixed state, it is necessary and sufficient that Heisenberg's uncertainty relation in Schroedinger form should be satisfied.

  8. ProbOnto: ontology and knowledge base of probability distributions.

    PubMed

    Swat, Maciej J; Grenon, Pierre; Wimalaratne, Sarala

    2016-09-01

    Probability distributions play a central role in mathematical and statistical modelling. The encoding, annotation and exchange of such models could be greatly simplified by a resource providing a common reference for the definition of probability distributions. Although some resources exist, no suitably detailed and complex ontology exists, nor any database allowing programmatic access. ProbOnto is an ontology-based knowledge base of probability distributions, featuring more than 80 uni- and multivariate distributions with their defining functions, characteristics, relationships and re-parameterization formulas. It can be used for model annotation and facilitates the encoding of distribution-based models, related functions and quantities. Availability: http://probonto.org. Contact: mjswat@ebi.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  9. Generating an Empirical Probability Distribution for the Andrews-Pregibon Statistic.

    ERIC Educational Resources Information Center

    Jarrell, Michele G.

    A probability distribution was developed for the Andrews-Pregibon (AP) statistic. The statistic, developed by D. F. Andrews and D. Pregibon (1978), identifies multivariate outliers. It is a ratio of the determinant of the data matrix with an observation deleted to the determinant of the entire data matrix. Although the AP statistic has been used…
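
    A minimal sketch of generating an empirical null distribution for the AP statistic by Monte Carlo, using the determinant-ratio definition quoted above. Expressing the ratio through X'X and focusing on the minimum AP value per sample are assumptions made here for illustration, not details from the report:

```python
import numpy as np

# Monte Carlo null distribution for the Andrews-Pregibon statistic, using the
# determinant-ratio definition quoted above (here via X'X, an assumption).
# The smallest AP value per sample flags the strongest outlier candidate.

rng = np.random.default_rng(2)
n, p, reps = 50, 3, 2000

def ap_min(X):
    G = X.T @ X
    detG = np.linalg.det(G)
    # deleting row x changes X'X to G - x x', so each AP value is a det ratio
    vals = [np.linalg.det(G - np.outer(x, x)) / detG for x in X]
    return min(vals)

null = [ap_min(rng.standard_normal((n, p))) for _ in range(reps)]
print(np.percentile(null, 5))   # empirical 5% critical value under the null
```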

  10. Exploring image data assimilation in the prospect of high-resolution satellite oceanic observations

    NASA Astrophysics Data System (ADS)

    Durán Moro, Marina; Brankart, Jean-Michel; Brasseur, Pierre; Verron, Jacques

    2017-07-01

    Satellite sensors increasingly provide high-resolution (HR) observations of the ocean. They supply observations of sea surface height (SSH) and of tracers of the dynamics such as sea surface salinity (SSS) and sea surface temperature (SST). In particular, the Surface Water Ocean Topography (SWOT) mission will provide measurements of the surface ocean topography at very high resolution, delivering unprecedented information on the meso-scale and submeso-scale dynamics. This study investigates the feasibility of using these measurements to reconstruct meso-scale features simulated by numerical models, in particular along the vertical dimension. A methodology to reconstruct three-dimensional (3D) multivariate meso-scale scenes is developed using a HR numerical model of the Solomon Sea region. An inverse problem is defined in the framework of a twin experiment in which synthetic observations are used. A true state, taken as the reference, is chosen among the 3D multivariate states. In order to correct a first guess of this true state, a two-step analysis is carried out. A probability distribution of the first guess is defined and updated at each step of the analysis: (i) the first step applies the analysis scheme of a reduced-order Kalman filter to update the first-guess probability distribution using the SSH observation; (ii) the second step minimizes a cost function using observations of HR image structure, and a new probability distribution is estimated. The analysis is extended to the vertical dimension using 3D multivariate empirical orthogonal functions (EOFs), and the probabilistic approach allows the update of the probability distribution through the two-step analysis. Experiments show that the proposed technique succeeds in correcting a multivariate state using meso-scale and submeso-scale information contained in HR SSH and image structure observations. It also demonstrates how the surface information can be used to reconstruct the ocean state below the surface.

  11. Application of multivariate Gaussian detection theory to known non-Gaussian probability density functions

    NASA Astrophysics Data System (ADS)

    Schwartz, Craig R.; Thelen, Brian J.; Kenton, Arthur C.

    1995-06-01

    A statistical parametric multispectral sensor performance model was developed by ERIM to support mine field detection studies, multispectral sensor design/performance trade-off studies, and target detection algorithm development. The model assumes target detection algorithms and their performance models which are based on data assumed to obey multivariate Gaussian probability distribution functions (PDFs). The applicability of these algorithms and performance models can be generalized to data having non-Gaussian PDFs through the use of transforms which convert non-Gaussian data to Gaussian (or near-Gaussian) data. An example of one such transform is the Box-Cox power law transform. In practice, such a transform can be applied to non-Gaussian data prior to the introduction of a detection algorithm that is formally based on the assumption of multivariate Gaussian data. This paper presents an extension of these techniques to the case where the joint multivariate probability density function of the non-Gaussian input data is known, and where the joint estimate of the multivariate Gaussian statistics, under the Box-Cox transform, is desired. The jointly estimated multivariate Gaussian statistics can then be used to predict the performance of a target detection algorithm which has an associated Gaussian performance model.
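
    A minimal sketch of the pre-transform step described above, under invented data: a Box-Cox power-law transform is fitted to non-Gaussian (lognormal) samples so that detection machinery built on Gaussian assumptions becomes applicable:

```python
import numpy as np
from scipy import stats

# Box-Cox pre-transform of non-Gaussian (here lognormal) data so that a
# detector built on Gaussian assumptions can be applied. Data are synthetic.

rng = np.random.default_rng(3)
x = rng.lognormal(mean=0.0, sigma=0.8, size=5000)   # known non-Gaussian input

y, lam = stats.boxcox(x)      # y = (x**lam - 1) / lam, lambda fitted by MLE
print(f"fitted lambda = {lam:.3f}")

# Normality before vs after the transform (Shapiro-Wilk p-values)
print("raw:        ", stats.shapiro(x[:500]).pvalue)
print("transformed:", stats.shapiro(y[:500]).pvalue)
```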

  12. Multivariate hydrological frequency analysis for extreme events using Archimedean copula. Case study: Lower Tunjuelo River basin (Colombia)

    NASA Astrophysics Data System (ADS)

    Gómez, Wilmar

    2017-04-01

    By analyzing the spatial and temporal variability of extreme precipitation events we can prevent or reduce the associated threat and risk. Many water resources projects require joint probability distributions of random variables such as precipitation intensity and duration, which cannot be assumed independent of each other. The problem of defining a probability model for observations of several dependent variables is greatly simplified by expressing the joint distribution in terms of the marginals through copulas. This document presents a general framework for bivariate and multivariate frequency analysis using Archimedean copulas for extreme hydroclimatological events such as severe storms. The analysis was conducted for precipitation events in the lower Tunjuelo River basin in Colombia. The results show that for a joint study of intensity-duration-frequency, IDF curves can be obtained through copulas, yielding more accurate and reliable information on design storms and associated risks. The use of copulas greatly simplifies the study of multivariate distributions and introduces the concept of the joint return period, which properly represents the needs of hydrological design in frequency analysis.
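
    As an illustration of the joint-return-period idea, here is a minimal sketch using a Gumbel-Hougaard copula, one member of the Archimedean family the paper employs. The marginal non-exceedance probabilities, the dependence parameter, and the mean interarrival time are all invented:

```python
import numpy as np

# Joint return periods from a Gumbel-Hougaard copula linking storm intensity
# and duration. Marginal probabilities, theta, and the mean interarrival time
# between events are invented for illustration.

theta = 2.0                        # dependence parameter (theta >= 1)
mu = 1.0                           # mean interarrival time of events (years)

def gumbel_copula(u, v, theta):
    return np.exp(-((-np.log(u))**theta + (-np.log(v))**theta)**(1.0/theta))

u = 0.98                           # F_intensity(i), hypothetical design storm
v = 0.95                           # F_duration(d), hypothetical design storm

C = gumbel_copula(u, v, theta)
T_or = mu / (1.0 - C)              # either variable exceeds its threshold
T_and = mu / (1.0 - u - v + C)     # both variables exceed their thresholds
print(f"OR return period:  {T_or:.1f} yr")
print(f"AND return period: {T_and:.1f} yr")
```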

  13. A Simplified, General Approach to Simulating from Multivariate Copula Functions

    Treesearch

    Barry Goodwin

    2012-01-01

    Copulas have become an important analytic tool for characterizing multivariate distributions and dependence. One is often interested in simulating data from copula estimates. The process can be analytically and computationally complex and usually involves steps that are unique to a given parametric copula. We describe an alternative approach that uses "probability-…

  14. Metocean design parameter estimation for fixed platform based on copula functions

    NASA Astrophysics Data System (ADS)

    Zhai, Jinjin; Yin, Qilin; Dong, Sheng

    2017-08-01

    Considering the dependent relationship among wave height, wind speed, and current velocity, we construct novel trivariate joint probability distributions via Archimedean copula functions. A total of 30 years of data on wave height, wind speed, and current velocity in the Bohai Sea are hindcast and sampled for the case study. Four kinds of distributions, namely, the Gumbel distribution, lognormal distribution, Weibull distribution, and Pearson Type III distribution, are candidate models for the marginal distributions of wave height, wind speed, and current velocity. The Pearson Type III distribution is selected as the optimal model. Bivariate and trivariate probability distributions of these environmental conditions are established based on four bivariate and trivariate Archimedean copulas, namely, the Clayton, Frank, Gumbel-Hougaard, and Ali-Mikhail-Haq copulas. These joint probability models can maximize marginal information and the dependence among the three variables. The design return values of these three variables can be obtained by three methods: univariate probability, conditional probability, and joint probability. The joint return periods of different load combinations are estimated by the proposed models. Platform responses (including base shear, overturning moment, and deck displacement) are further calculated. For the same return period, the design values of wave height, wind speed, and current velocity obtained by the conditional and joint probability models are much smaller than those obtained by univariate probability. By accounting for the dependence among variables, the multivariate probability distributions provide design parameters close to the actual sea state for ocean platform design.

  15. A note on a simplified and general approach to simulating from multivariate copula functions

    Treesearch

    Barry K. Goodwin

    2013-01-01

    Copulas have become an important analytic tool for characterizing multivariate distributions and dependence. One is often interested in simulating data from copula estimates. The process can be analytically and computationally complex and usually involves steps that are unique to a given parametric copula. We describe an alternative approach that uses ‘Probability-...

  16. Probabilistic estimates of drought impacts on agricultural production

    NASA Astrophysics Data System (ADS)

    Madadgar, Shahrbanou; AghaKouchak, Amir; Farahmand, Alireza; Davis, Steven J.

    2017-08-01

    Increases in the severity and frequency of drought in a warming climate may negatively impact agricultural production and food security. Unlike previous studies that have estimated agricultural impacts of climate conditions using single-crop yield distributions, we develop a multivariate probabilistic model that uses projected climatic conditions (e.g., precipitation amount or soil moisture) throughout a growing season to estimate the probability distribution of crop yields. We demonstrate the model by an analysis of the historical period 1980-2012, including the Millennium Drought in Australia (2001-2009). We find that precipitation and soil moisture deficits in dry growing seasons reduced the average annual yield of the five largest crops in Australia (wheat, broad beans, canola, lupine, and barley) by 25-45% relative to wet growing seasons. Our model can thus produce region- and crop-specific agricultural sensitivities to climate conditions and variability. Probabilistic estimates of yield may help decision-makers in government and business to quantitatively assess the vulnerability of agriculture to climate variations. In summary, we develop a multivariate probabilistic model that uses precipitation to estimate the probability distribution of crop yields, and the proposed model shows how the probability distribution of crop yield changes in response to droughts; during Australia's Millennium Drought, precipitation and soil moisture deficits reduced the average annual yield of the five largest crops.

  17. Multivariate quantile mapping bias correction: an N-dimensional probability density function transform for climate model simulations of multiple variables

    NASA Astrophysics Data System (ADS)

    Cannon, Alex J.

    2018-01-01

    Most bias correction algorithms used in climatology, for example quantile mapping, are applied to univariate time series. They neglect the dependence between different variables. Those that are multivariate often correct only limited measures of joint dependence, such as Pearson or Spearman rank correlation. Here, an image processing technique designed to transfer colour information from one image to another—the N-dimensional probability density function transform—is adapted for use as a multivariate bias correction algorithm (MBCn) for climate model projections/predictions of multiple climate variables. MBCn is a multivariate generalization of quantile mapping that transfers all aspects of an observed continuous multivariate distribution to the corresponding multivariate distribution of variables from a climate model. When applied to climate model projections, changes in quantiles of each variable between the historical and projection period are also preserved. The MBCn algorithm is demonstrated on three case studies. First, the method is applied to an image processing example with characteristics that mimic a climate projection problem. Second, MBCn is used to correct a suite of 3-hourly surface meteorological variables from the Canadian Centre for Climate Modelling and Analysis Regional Climate Model (CanRCM4) across a North American domain. Components of the Canadian Forest Fire Weather Index (FWI) System, a complicated set of multivariate indices that characterizes the risk of wildfire, are then calculated and verified against observed values. Third, MBCn is used to correct biases in the spatial dependence structure of CanRCM4 precipitation fields. Results are compared against a univariate quantile mapping algorithm, which neglects the dependence between variables, and two multivariate bias correction algorithms, each of which corrects a different form of inter-variable correlation structure. MBCn outperforms these alternatives, often by a large margin, particularly for annual maxima of the FWI distribution and spatiotemporal autocorrelation of precipitation fields.
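
    The core MBCn iteration (random orthogonal rotation, univariate quantile mapping of each rotated axis, rotation back) can be sketched compactly. The following is an illustrative reduction assuming equal-length observed and modelled samples, not the reference implementation:

```python
import numpy as np

# Illustrative reduction of the core MBCn loop: random rotation + univariate
# quantile mapping, iterated. Assumes equal-length observed and modelled
# samples; data and settings are synthetic.

rng = np.random.default_rng(4)

def qm(x, ref):
    """Empirical quantile mapping: give x the distribution of ref."""
    ranks = np.argsort(np.argsort(x))
    return np.sort(ref)[ranks]

def mbcn(model, obs, n_iter=30):
    model = model.copy()
    d = model.shape[1]
    for _ in range(n_iter):
        R, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random rotation
        m_r, o_r = model @ R, obs @ R
        for j in range(d):                 # quantile-map each rotated axis
            m_r[:, j] = qm(m_r[:, j], o_r[:, j])
        model = m_r @ R.T                  # rotate back
    return model

obs = rng.multivariate_normal([0, 0], [[1, .8], [.8, 1]], 2000)
mod = rng.multivariate_normal([1, -1], [[2, -.5], [-.5, 1]], 2000)  # biased
corrected = mbcn(mod, obs)
print(np.corrcoef(corrected.T)[0, 1])   # approaches the observed 0.8
```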

  18. Multivariate flood risk assessment: reinsurance perspective

    NASA Astrophysics Data System (ADS)

    Ghizzoni, Tatiana; Ellenrieder, Tobias

    2013-04-01

    For insurance and re-insurance purposes, knowledge of the spatial characteristics of fluvial flooding is fundamental. The probability of simultaneous flooding at different locations during one event, and the associated severity and losses, have to be estimated in order to assess premiums and for accumulation control (Probable Maximum Loss calculations). Therefore, a statistical model able to describe the multivariate joint distribution of flood events at multiple locations is necessary. In this context, copulas can be viewed as alternative tools for dealing with multivariate simulations, as they allow one to formalize the dependence structures of random vectors. An application of copula functions to flood scenario generation is presented for Australia (Queensland, New South Wales and Victoria), where 100,000 possible flood scenarios covering approximately 15,000 years were simulated.

  19. Kullback-Leibler information function and the sequential selection of experiments to discriminate among several linear models

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.

    1972-01-01

    The error variance of the process, the prior multivariate normal distributions of the parameters of the models, and the prior probabilities of the models being correct are assumed to be specified. A rule for termination of sampling is proposed. Upon termination, the model with the largest posterior probability is chosen as correct. If sampling is not terminated, posterior probabilities of the models and posterior distributions of the parameters are computed. Each experiment is chosen to maximize the expected Kullback-Leibler information function. Monte Carlo simulation experiments were performed to investigate the large and small sample behavior of the sequential adaptive procedure.

  20. A new subgrid-scale representation of hydrometeor fields using a multivariate PDF

    DOE PAGES

    Griffin, Brian M.; Larson, Vincent E.

    2016-06-03

    The subgrid-scale representation of hydrometeor fields is important for calculating microphysical process rates. In order to represent subgrid-scale variability, the Cloud Layers Unified By Binormals (CLUBB) parameterization uses a multivariate probability density function (PDF). In addition to vertical velocity, temperature, and moisture fields, the PDF includes hydrometeor fields. Previously, hydrometeor fields were assumed to follow a multivariate single lognormal distribution. Now, in order to better represent the distribution of hydrometeors, two new multivariate PDFs are formulated and introduced. The new PDFs represent hydrometeors using either a delta-lognormal or a delta-double-lognormal shape. The two new PDF distributions, plus the previous single lognormal shape, are compared to histograms of data taken from large-eddy simulations (LESs) of a precipitating cumulus case, a drizzling stratocumulus case, and a deep convective case. Finally, the warm microphysical process rates produced by the different hydrometeor PDFs are compared to the same process rates produced by the LES.
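
    Sampling from a delta-lognormal hydrometeor distribution, i.e. a point mass at zero mixed with a lognormal for nonzero values, is straightforward; the mixture weight and lognormal parameters below are invented for illustration:

```python
import numpy as np

# Sampling from a delta-lognormal mixture: a point mass at zero (hydrometeor-
# free part of the grid box) mixed with a lognormal for nonzero values.
# The mixture weight and lognormal parameters are hypothetical.

rng = np.random.default_rng(5)
n = 100_000
p_nonzero = 0.3            # hypothetical precipitating fraction
mu, sigma = -1.0, 0.9      # hypothetical lognormal parameters (of log values)

is_wet = rng.random(n) < p_nonzero
rain = np.where(is_wet, rng.lognormal(mu, sigma, n), 0.0)

print(f"fraction zero: {(rain == 0).mean():.3f}")
print(f"mean over nonzero part: {rain[is_wet].mean():.3f}")
```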

  1. Measures of dependence for multivariate Lévy distributions

    NASA Astrophysics Data System (ADS)

    Boland, J.; Hurd, T. R.; Pivato, M.; Seco, L.

    2001-02-01

    Recent statistical analysis of a number of financial databases is summarized. Increasing agreement is found that logarithmic equity returns show a certain type of asymptotic behavior of the largest events, namely that the probability density functions have power law tails with an exponent α≈3.0. This behavior does not vary much over different stock exchanges or over time, despite large variations in trading environments. The present paper proposes a class of multivariate distributions which generalizes the observed qualities of univariate time series. A new consequence of the proposed class is the "spectral measure" which completely characterizes the multivariate dependences of the extreme tails of the distribution. This measure on the unit sphere in M-dimensions, in principle completely general, can be determined empirically by looking at extreme events. If it can be observed and determined, it will prove to be of importance for scenario generation in portfolio risk management.

  2. Application of Maxent Multivariate Analysis to Define Climate-Change Effects on Species Distributions and Changes

    DTIC Science & Technology

    2014-09-01

    The report applies Maxent statistical multivariate analysis to define the current and projected future range probability for species of interest to Army land managers. A software…
    [Figure 4: RCW omission rate and predicted area as a function of the cumulative threshold.]

  3. Probability distributions for multimeric systems.

    PubMed

    Albert, Jaroslav; Rooman, Marianne

    2016-01-01

    We propose a fast and accurate method of obtaining the equilibrium mono-modal joint probability distributions for multimeric systems. The method necessitates only two assumptions: the copy number of all species of molecule may be treated as continuous; and the probability density functions (pdfs) are well approximated by multivariate skew normal distributions (MSND). Starting from the master equation, we convert the problem into a set of equations for the statistical moments, which are then expressed in terms of the parameters intrinsic to the MSND. Using an optimization package in Mathematica, we minimize a Euclidean distance function comprising a sum of the squared differences between the left- and right-hand sides of these equations. Comparison of results obtained via our method with those rendered by the Gillespie algorithm demonstrates our method to be highly accurate as well as efficient.

  4. Probabilistic sensitivity analysis for decision trees with multiple branches: use of the Dirichlet distribution in a Bayesian framework.

    PubMed

    Briggs, Andrew H; Ades, A E; Price, Martin J

    2003-01-01

    In structuring decision models of medical interventions, it is commonly recommended that only 2 branches be used for each chance node to avoid logical inconsistencies that can arise during sensitivity analyses if the branching probabilities do not sum to 1. However, information may be naturally available in an unconditional form, and structuring a tree in conditional form may complicate rather than simplify the sensitivity analysis of the unconditional probabilities. Current guidance emphasizes using probabilistic sensitivity analysis, and a method is required to place probability distributions over multiple branch probabilities that appropriately represent uncertainty while satisfying the requirement that mutually exclusive event probabilities sum to 1. The authors argue that the Dirichlet distribution, the multivariate equivalent of the beta distribution, is appropriate for this purpose and illustrate its use for generating a fully probabilistic transition matrix for a Markov model. Furthermore, they demonstrate that by adopting a Bayesian approach, the problem of observing zero counts for transitions of interest can be overcome.
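
    A minimal sketch of the article's central recommendation: draw the probabilities of a multi-branch chance node from a Dirichlet distribution so that every sampled vector sums to 1. The counts are invented; adding 1 to each count corresponds to a uniform prior in a conjugate Bayesian update:

```python
import numpy as np

# Probabilistic sensitivity analysis for a 3-branch chance node: sample
# branch probabilities from a Dirichlet posterior. Counts are hypothetical;
# the +1 reflects a uniform prior in a conjugate Bayesian update.

rng = np.random.default_rng(6)
counts = np.array([43, 12, 5])          # hypothetical observed transitions
draws = rng.dirichlet(counts + 1, size=10_000)

print(draws.sum(axis=1)[:5])            # each sampled vector sums to 1
print(draws.mean(axis=0))               # posterior means ~ (counts+1)/sum
```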

  5. Technical Reports Prepared Under Contract N00014-76-C-0475.

    DTIC Science & Technology

    1987-05-29

    Partial listing of reports (Technical Report No., Title, Author, Date):
    264. Approximations to Densities in Geometric Probability. H. Solomon, M.A. Stephens. 10/27/78
    265. Sequential …
    …Certain Multivariate Normal Probabilities. S. Iyengar. 8/12/82
    323. EDF Statistics for Testing for the Gamma Distribution with… M.A. Stephens. 8/13/82
    360. Random Sequential Coding by Hamming Distance. Yoshiaki Itoh, Herbert Solomon. 07-11-85
    361. Transforming Censored Samples and Testing Fit. …

  6. Inference for the Bivariate and Multivariate Hidden Truncated Pareto(type II) and Pareto(type IV) Distribution and Some Measures of Divergence Related to Incompatibility of Probability Distribution

    ERIC Educational Resources Information Center

    Ghosh, Indranil

    2011-01-01

    Consider a discrete bivariate random variable (X, Y) with possible values x_1, x_2, ..., x_I for X and y_1, y_2, ..., y_J for Y. Further suppose that the corresponding families of conditional distributions, for X given values of Y and for Y given values of X, are available. We…

  7. Multidimensional stochastic approximation using locally contractive functions

    NASA Technical Reports Server (NTRS)

    Lawton, W. M.

    1975-01-01

    A Robbins-Monro type multidimensional stochastic approximation algorithm which converges in mean square and with probability one to the fixed point of a locally contractive regression function is developed. The algorithm is applied to obtain maximum likelihood estimates of the parameters for a mixture of multivariate normal distributions.
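
    A minimal sketch of a Robbins-Monro iteration of the kind described, converging to the fixed point of a contractive regression function observed with noise; the function, noise level, and dimension are invented (the fixed point of f(x) = 0.5x + 1 is x* = 2):

```python
import numpy as np

# Robbins-Monro stochastic approximation toward the fixed point of a
# contractive regression function observed with noise. The function and
# noise level are hypothetical; the fixed point of f(x) = 0.5x + 1 is 2.

rng = np.random.default_rng(7)

def noisy_f(x):
    return 0.5 * x + 1.0 + rng.normal(0.0, 0.5, size=x.shape)

x = np.zeros(2)                     # multidimensional starting point
for n in range(1, 5001):
    a_n = 1.0 / n                   # steps: sum a_n = inf, sum a_n^2 < inf
    x = x + a_n * (noisy_f(x) - x)  # move toward the noisy fixed point

print(x)                            # approaches [2, 2]
```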

  8. Lake bed classification using acoustic data

    USGS Publications Warehouse

    Yin, Karen K.; Li, Xing; Bonde, John; Richards, Carl; Cholwek, Gary

    1998-01-01

    As part of our effort to identify lake bed surficial substrates using remote sensing data, this work designs pattern classifiers using multivariate statistical methods. The probability distribution of the preprocessed acoustic signal is analyzed first. A confidence region approach is then adopted to improve the design of the existing classifier. A technique for further isolation is proposed which minimizes the expected loss from misclassification. The devices constructed are applicable for real-time lake bed categorization. A minimax approach is suggested to treat more general cases where the a priori probability distribution of the substrate types is unknown. Comparison of the suggested methods with traditional likelihood ratio tests is discussed.

  9. Probabilistic structural analysis methods and applications

    NASA Technical Reports Server (NTRS)

    Cruse, T. A.; Wu, Y.-T.; Dias, B.; Rajagopal, K. R.

    1988-01-01

    An advanced algorithm for simulating the probabilistic distribution of structural responses due to statistical uncertainties in loads, geometry, material properties, and boundary conditions is reported. The method effectively combines an advanced algorithm for calculating probability levels for multivariate problems (fast probability integration) together with a general-purpose finite-element code for stress, vibration, and buckling analysis. Application is made to a space propulsion system turbine blade for which the geometry and material properties are treated as random variables.

  10. Some New Approaches to Multivariate Probability Distributions.

    DTIC Science & Technology

    1986-12-01

    …Krishnaiah (1977). The following example may serve as an illustration of this point. EXAMPLE 2 (Fréchet's bivariate continuous distribution)…

  11. Fast-NPS-A Markov Chain Monte Carlo-based analysis tool to obtain structural information from single-molecule FRET measurements

    NASA Astrophysics Data System (ADS)

    Eilert, Tobias; Beckers, Maximilian; Drechsler, Florian; Michaelis, Jens

    2017-10-01

    The analysis tool and software package Fast-NPS can be used to analyse smFRET data to obtain quantitative structural information about macromolecules in their natural environment. In the algorithm, a Bayesian model gives rise to a multivariate probability distribution describing the uncertainty of the structure determination. Since Fast-NPS aims to be an easy-to-use general-purpose analysis tool for a large variety of smFRET networks, we established an MCMC-based sampling engine that approximates the target distribution and requires no parameter specification by the user at all. For an efficient local exploration we automatically adapt the multivariate proposal kernel according to the shape of the target distribution. In order to handle multimodality, the sampler is equipped with a parallel tempering scheme that is fully adaptive with respect to temperature spacing and number of chains. Since the molecular surrounding of a dye molecule affects its spatial mobility and thus the smFRET efficiency, we introduce dye models which can be selected for every dye molecule individually. These models allow the user to represent the smFRET network in great detail, leading to an increased localisation precision. Finally, a tool to validate the chosen model combination is provided. Program files doi: http://dx.doi.org/10.17632/7ztzj63r68.1. Licensing provisions: Apache-2.0. Programming language: GUI in MATLAB (The MathWorks); core sampling engine in C++. Nature of problem: sampling of highly diverse multivariate probability distributions in order to solve for macromolecular structures from smFRET data. Solution method: MCMC algorithm with fully adaptive proposal kernel and parallel tempering scheme.

  12. Multivariate probability distribution for sewer system vulnerability assessment under data-limited conditions.

    PubMed

    Del Giudice, G; Padulano, R; Siciliano, D

    2016-01-01

    The lack of geometrical and hydraulic information about sewer networks often excludes the adoption of in-depth modeling tools to obtain prioritization strategies for funds management. The present paper describes a novel statistical procedure for defining a prioritization scheme for preventive maintenance strategies based on a small sample of failure data collected by the Sewer Office of the Municipality of Naples (IT). Novelty issues include, among others, considering sewer parameters as continuous statistical variables and accounting for their interdependences. After a statistical analysis of maintenance interventions, the most important available factors affecting the process are selected and their mutual correlations identified. Then, after a Box-Cox transformation of the original variables, a methodology is provided for the evaluation of a vulnerability map of the sewer network by adopting a joint multivariate normal distribution with different parameter sets. The goodness-of-fit is eventually tested for each distribution by means of a multivariate plotting position. The developed methodology is expected to assist municipal engineers in identifying critical sewers and prioritizing sewer inspections in order to fulfill rehabilitation requirements.
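
    A minimal sketch of the general recipe described above, under invented data: Box-Cox transform each positive sewer variable, fit a joint multivariate normal, and use the fitted joint density as a relative criticality score. The variable names and synthetic sample are assumptions for illustration:

```python
import numpy as np
from scipy import stats

# Sketch: Box-Cox transform positive sewer variables, fit a joint
# multivariate normal, and rank records by fitted joint density.
# Variable names and the synthetic sample are hypothetical.

rng = np.random.default_rng(8)
n = 400
age = rng.gamma(4.0, 10.0, n)            # pipe age (yr), hypothetical
diameter = rng.lognormal(6.0, 0.4, n)    # pipe diameter (mm), hypothetical
X = np.column_stack([age, diameter])

Z = np.column_stack([stats.boxcox(X[:, j])[0] for j in range(X.shape[1])])
mvn = stats.multivariate_normal(Z.mean(axis=0), np.cov(Z, rowvar=False))

score = mvn.logpdf(Z)                    # low joint density = atypical sewer
print("five most atypical records:", np.argsort(score)[:5])
```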

  13. Grading system to categorize breast MRI using BI-RADS 5th edition: a statistical study of non-mass enhancement descriptors in terms of probability of malignancy.

    PubMed

    Asada, Tatsunori; Yamada, Takayuki; Kanemaki, Yoshihide; Fujiwara, Keishi; Okamoto, Satoko; Nakajima, Yasuo

    2018-03-01

    To analyze the association of breast non-mass enhancement descriptors in the BI-RADS 5th edition with malignancy, and to establish a grading system and categorization of descriptors. This study was approved by our institutional review board. A total of 213 patients were enrolled. Breast MRI was performed with a 1.5-T MRI scanner using a 16-channel breast radiofrequency coil. Two radiologists determined internal enhancement and distribution of non-mass enhancement by consensus. Corresponding pathologic diagnoses were obtained by either biopsy or surgery. The probability of malignancy by descriptor was analyzed using Fisher's exact test and multivariate logistic regression analysis. The probability of malignancy by category was analyzed using Fisher's exact and multi-group comparison tests. One hundred seventy-eight lesions were malignant. Multivariate model analysis showed that internal enhancement (homogeneous vs others, p < 0.001, heterogeneous and clumped vs clustered ring, p = 0.003) and distribution (focal and linear vs segmental, p < 0.001) were the significant explanatory variables. The descriptors were classified into three grades of suspicion, and the categorization (3, 4A, 4B, 4C, and 5) by sum-up grades showed an incremental increase in the probability of malignancy (p < 0.0001). The three-grade criteria and categorization by sum-up grades of descriptors appear valid for non-mass enhancement.

  14. Diffuse reflection from a stochastically bounded, semi-infinite medium

    NASA Technical Reports Server (NTRS)

    Lumme, K.; Peltoniemi, J. I.; Irvine, W. M.

    1990-01-01

    In order to determine the diffuse reflection from a medium bounded by a rough surface, the problem of radiative transfer in a boundary layer characterized by a statistical distribution of heights is considered. For the case that the surface is defined by a multivariate normal probability density, the propagation probability for rays traversing the boundary layer is derived and, from that probability, a corresponding radiative transfer equation. A solution of the Eddington (two stream) type is found explicitly, and examples are given. The results should be applicable to reflection from the regoliths of solar system bodies, as well as from a rough ocean surface.

  15. Risk of false decision on conformity of a multicomponent material when test results of the components' content are correlated.

    PubMed

    Kuselman, Ilya; Pennecchi, Francesca R; da Silva, Ricardo J N B; Hibbert, D Brynn

    2017-11-01

    The probability of a false decision on conformity of a multicomponent material due to measurement uncertainty is discussed when test results are correlated. Specification limits of the components' content of such a material generate a multivariate specification interval/domain. When true values of components' content and corresponding test results are modelled by multivariate distributions (e.g. by multivariate normal distributions), a total global risk of a false decision on the material conformity can be evaluated based on calculation of integrals of their joint probability density function. No transformation of the raw data is required for that. A total specific risk can be evaluated as the joint posterior cumulative function of true values of a specific batch or lot lying outside the multivariate specification domain, when the vector of test results, obtained for the lot, is inside this domain. It was shown, using a case study of four components under control in a drug, that the correlation influence on the risk value is not easily predictable. To assess this influence, the evaluated total risk values were compared with those calculated for independent test results and also with those assuming much stronger correlation than that observed. While the observed statistically significant correlation did not lead to a visible difference in the total risk values in comparison to the independent test results, the stronger correlation among the variables caused either the total risk decreasing or its increasing, depending on the actual values of the test results. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Drought forecasting in Luanhe River basin involving climatic indices

    NASA Astrophysics Data System (ADS)

    Ren, Weinan; Wang, Yixuan; Li, Jianzhu; Feng, Ping; Smith, Ronald J.

    2017-11-01

    Drought is regarded as one of the most severe natural disasters globally. This is especially the case in Tianjin City, Northern China, where drought can affect economic development and people's livelihoods. Drought forecasting, the basis of drought management, is an important mitigation strategy. In this paper, we develop a probabilistic forecasting model that forecasts transition probabilities from a current Standardized Precipitation Index (SPI) value to a future SPI class, based on the conditional distribution of a multivariate normal distribution so as to involve two large-scale climatic indices at the same time, and apply the forecasting model to 26 rain gauges in the Luanhe River basin in North China. The establishment of the model and the derivation of the SPI are based on the hypothesis that aggregated monthly precipitation is normally distributed. Pearson correlation and Shapiro-Wilk normality tests are used to select an appropriate SPI time scale and large-scale climatic indices. Findings indicated that longer-term aggregated monthly precipitation, in general, was more likely to be normally distributed, and that forecasting models should be applied to each gauge individually rather than to the whole basin. Taking Liying Gauge as an example, we illustrate the impact of the SPI time scale and lead time on transition probabilities. The controlling climatic indices for each gauge are then selected by the Pearson correlation test, and the multivariate normality of the SPI, the corresponding climatic indices for the current month, and the SPI 1, 2, and 3 months later is checked using the Shapiro-Wilk normality test. Subsequently, we illustrate the impact of large-scale oceanic-atmospheric circulation patterns on transition probabilities. Finally, we use a score method to evaluate and compare the performance of the three forecasting models against two traditional models that forecast transition probabilities from a current to a future SPI class. The results show that the three proposed models outperform the two traditional models and that involving large-scale climatic indices improves forecasting accuracy.
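
    The forecasting idea, computing the probability of a future SPI class from the conditional distribution of a multivariate normal given the current SPI and a climatic index, can be sketched as follows; the covariance matrix, observed values, and class boundaries are invented:

```python
import numpy as np
from scipy.stats import norm

# Sketch: assume [SPI_now, climate index, SPI 1 month ahead] are jointly
# multivariate normal, and compute the probability that the future SPI falls
# in a drought class given current conditions. All numbers are hypothetical.

mu = np.array([0.0, 0.0, 0.0])             # [SPI_now, index, SPI_future]
S = np.array([[1.0, 0.3, 0.6],
              [0.3, 1.0, 0.4],
              [0.6, 0.4, 1.0]])             # hypothetical covariance

x_obs = np.array([-1.2, 0.8])              # observed SPI_now and index

# Condition the trivariate normal on the first two components
S11, S12 = S[:2, :2], S[:2, 2]
cond_mean = mu[2] + S12 @ np.linalg.solve(S11, x_obs - mu[:2])
cond_var = S[2, 2] - S12 @ np.linalg.solve(S11, S12)

# P(moderate drought: -1.5 <= SPI_future < -1.0 | observations)
sd = np.sqrt(cond_var)
p = norm.cdf(-1.0, cond_mean, sd) - norm.cdf(-1.5, cond_mean, sd)
print(f"P(-1.5 <= SPI < -1.0 | SPI_now=-1.2, index=0.8) = {p:.3f}")
```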

  17. Bayesian statistics and Monte Carlo methods

    NASA Astrophysics Data System (ADS)

    Koch, K. R.

    2018-03-01

    The Bayesian approach allows an intuitive way to derive the methods of statistics. Probability is defined as a measure of the plausibility of statements or propositions. Three rules are sufficient to obtain the laws of probability. If the statements refer to the numerical values of variables, the so-called random variables, univariate and multivariate distributions follow. They lead to point estimation, by which unknown quantities, i.e. unknown parameters, are computed from measurements. The unknown parameters are random variables; they are fixed quantities in traditional statistics, which is not founded on Bayes' theorem. Bayesian statistics therefore recommends itself for Monte Carlo methods, which generate random variates from given distributions. Monte Carlo methods, of course, can also be applied in traditional statistics. The unknown parameters are introduced as functions of the measurements, and the Monte Carlo methods give the covariance matrix and the expectation of these functions. A confidence region is derived in which the unknown parameters are situated with a given probability. Following a method of traditional statistics, hypotheses are tested by determining whether a value for an unknown parameter lies inside or outside the confidence region. The error propagation of a random vector by the Monte Carlo methods is presented as an application. If the random vector results from a nonlinearly transformed vector, its covariance matrix and its expectation follow from the Monte Carlo estimate. This saves a considerable amount of derivatives to be computed, and errors of linearization are avoided. The Monte Carlo method is therefore efficient. If the functions of the measurements are given by a sum of two or more random vectors with different multivariate distributions, the resulting distribution is generally not known. The Monte Carlo methods are then needed to obtain the covariance matrix and the expectation of the sum.
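
    A minimal sketch of the error-propagation application described above: push a random vector through a nonlinear transform and estimate the expectation and covariance matrix of the result by Monte Carlo, with no derivatives computed; the transform and input distribution are invented:

```python
import numpy as np

# Monte Carlo error propagation: sample an input random vector, apply a
# nonlinear transform, and estimate the output expectation and covariance
# without linearization. Transform and input distribution are hypothetical.

rng = np.random.default_rng(9)
mu = np.array([1.0, 2.0])
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])

x = rng.multivariate_normal(mu, cov, size=100_000)   # input random vector

# Nonlinear transform, e.g. polar-style quantities derived from measurements
y = np.column_stack([np.hypot(x[:, 0], x[:, 1]),     # distance
                     np.arctan2(x[:, 1], x[:, 0])])  # direction

print("expectation:", y.mean(axis=0))
print("covariance:\n", np.cov(y, rowvar=False))
```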

  18. A Bayesian joint probability modeling approach for seasonal forecasting of streamflows at multiple sites

    NASA Astrophysics Data System (ADS)

    Wang, Q. J.; Robertson, D. E.; Chiew, F. H. S.

    2009-05-01

    Seasonal forecasting of streamflows can be highly valuable for water resources management. In this paper, a Bayesian joint probability (BJP) modeling approach for seasonal forecasting of streamflows at multiple sites is presented. A Box-Cox transformed multivariate normal distribution is proposed to model the joint distribution of future streamflows and their predictors such as antecedent streamflows and El Niño-Southern Oscillation indices and other climate indicators. Bayesian inference of model parameters and uncertainties is implemented using Markov chain Monte Carlo sampling, leading to joint probabilistic forecasts of streamflows at multiple sites. The model provides a parametric structure for quantifying relationships between variables, including intersite correlations. The Box-Cox transformed multivariate normal distribution has considerable flexibility for modeling a wide range of predictors and predictands. The Bayesian inference formulated allows the use of data that contain nonconcurrent and missing records. The model flexibility and data-handling ability means that the BJP modeling approach is potentially of wide practical application. The paper also presents a number of statistical measures and graphical methods for verification of probabilistic forecasts of continuous variables. Results for streamflows at three river gauges in the Murrumbidgee River catchment in southeast Australia show that the BJP modeling approach has good forecast quality and that the fitted model is consistent with observed data.

  19. New Developments in Uncertainty: Linking Risk Management, Reliability, Statistics and Stochastic Optimization

    DTIC Science & Technology

    2014-11-13

    …in a given set C ⊂ IR^m (5.7). Motivation for generalized regression comes from applications in which Y has the cost/loss orientation that we have been considering. The corresponding probability measure on IR^m is induced by the multivariate distribution function F_{V_1,...,V_m}(v_1, ..., v_m) = prob{V_1 <= v_1, ..., V_m <= v_m}, which could be generated by future observations of some variables V_1, ..., V_m, as above, in which case Ω would be a subset of IR^m with elements ω = (v_1, ..., v_m).

  20. The PIT-trap-A "model-free" bootstrap procedure for inference about regression models with discrete, multivariate responses.

    PubMed

    Warton, David I; Thibaut, Loïc; Wang, Yi Alice

    2017-01-01

    Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping)-common examples including logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of "model-free bootstrap", adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods.
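
    A minimal sketch of the PIT-residual idea behind the PIT-trap, assuming known Poisson marginals rather than a fitted regression: randomized PIT residuals are uniform under a correct model, whole rows are resampled to preserve cross-column correlation, and bootstrap responses are recovered through the Poisson quantile function. All settings are invented:

```python
import numpy as np
from scipy.stats import poisson

# Sketch of PIT-residual resampling for discrete multivariate responses,
# assuming the Poisson means are known (in practice they come from a fitted
# regression). Sizes and parameters are hypothetical.

rng = np.random.default_rng(10)
n, p = 200, 4
mu = rng.uniform(1.0, 10.0, size=(n, p))   # "fitted" means, assumed known
y = rng.poisson(mu)                        # multivariate count responses

# Randomized PIT residuals: u ~ Uniform(F(y-1), F(y)) for discrete data
lo = poisson.cdf(y - 1, mu)
hi = poisson.cdf(y, mu)
u = lo + rng.random((n, p)) * (hi - lo)

# Resample whole rows of residuals (preserves cross-column correlation),
# then invert through the marginal quantile function at the original means
rows = rng.integers(0, n, size=n)
y_boot = poisson.ppf(u[rows], mu)

print(y_boot[:3])
```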

  1. The PIT-trap—A “model-free” bootstrap procedure for inference about regression models with discrete, multivariate responses

    PubMed Central

    Warton, David I.; Thibaut, Loïc; Wang, Yi Alice

    2017-01-01

    Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping)—common examples including logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of “model-free bootstrap”, adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods. PMID:28738071

  2. Accumulation risk assessment for the flooding hazard

    NASA Astrophysics Data System (ADS)

    Roth, Giorgio; Ghizzoni, Tatiana; Rudari, Roberto

    2010-05-01

    One of the main consequences of demographic and economic development and of the globalization of markets and trade is the accumulation of risks. In most cases, the cumulus of risks arises intuitively from the geographic concentration of a number of vulnerable elements in a single place. For natural events, risk accumulation can be associated not only with intensity but also with an event's extension. In this case, the magnitude can be such that large areas, possibly including many regions or even large portions of different countries, are struck by single catastrophic events. Among natural risks, the impact of the flooding hazard cannot be understated. To cope with it, a variety of mitigation actions can be put in place: from the improvement of monitoring and alert systems to the development of hydraulic structures, through land use restrictions, civil protection, and financial and insurance plans. All of these viable options present social and economic impacts, either positive or negative, whose proper estimation should rely on the assumption of appropriate present and future flood risk scenarios. It is therefore necessary to identify proper statistical methodologies able to describe the multivariate aspects of the involved physical processes and their spatial dependence. In hydrology and meteorology, but also in finance and insurance practice, it was recognized early that the distributions of classical statistical theory (e.g., the normal and gamma families) are of restricted use for modeling multivariate spatial data. Recent research efforts have therefore been directed towards developing statistical models capable of describing the forms of asymmetry manifest in data sets, in particular for the quite frequent case of phenomena whose empirical outcome behaves in a non-normal fashion but still maintains some broad similarity with the multivariate normal distribution. Fruitful approaches were recognized in the use of flexible models that include the normal distribution as a special or limiting case (e.g., the skew-normal or skew-t distributions). The present contribution attempts to provide a better estimate of the joint probability distribution describing flood events in a multi-site, multi-basin fashion. This goal is pursued through the multivariate skew-t distribution, which allows the joint probability distribution to be defined analytically. The performance of the skew-t distribution is discussed with reference to the Tanaro River in Northwestern Italy. To enhance the characteristics of the correlation structure, both nested and non-nested gauging stations are selected, with significantly different contributing areas.

  3. The return period analysis of natural disasters with statistical modeling of bivariate joint probability distribution.

    PubMed

    Li, Ning; Liu, Xueqin; Xie, Wei; Wu, Jidong; Zhang, Peng

    2013-01-01

    New features of natural disasters have been observed over the last several years. The factors that influence the disasters' formation mechanisms, regularity of occurrence and main characteristics have been revealed to be more complicated and diverse in nature than previously thought. As the uncertainty involved increases, the variables need to be examined further. This article discusses the importance and the shortage of multivariate analysis of natural disasters and presents a method to estimate the joint probability of the return periods and perform a risk analysis. Severe dust storms from 1990 to 2008 in Inner Mongolia were used as a case study to test this new methodology, as they are normal and recurring climatic phenomena on Earth. Based on the 79 investigated events and according to the bivariate definition of a severe dust storm, the joint probability distribution of severe dust storms was established using the observed data of maximum wind speed and duration. The joint return periods of severe dust storms were calculated, and the relevant risk was analyzed according to the joint probability. The copula function is able to simulate severe dust storm disasters accurately. The joint return periods generated are closer to those observed in reality than the univariate return periods and thus have more value in severe dust storm disaster mitigation, strategy making, program design, and improvement of risk management. This research may prove useful in risk-based decision making. The exploration of multivariate analysis methods can also lay the foundation for further applications in natural disaster risk analysis. © 2012 Society for Risk Analysis.

  4. A Multivariate and Probabilistic Assessment of Drought in the Pacific Northwest under Observed and Future Climate.

    NASA Astrophysics Data System (ADS)

    Mortuza, M. R.; Demissie, Y. K.

    2015-12-01

    In light of the recent and anticipated more severe and frequent drought incidences in the Yakima River Basin (YRB), a reliable and comprehensive drought assessment is deemed necessary to avoid major crop production losses and to better manage water-rights issues in the region during years of low precipitation and/or snow accumulation. In this study, we have conducted frequency analysis of hydrological droughts and quantified the associated uncertainty in the YRB under both historical and changing climate. The streamflow drought index (SDI) was employed to identify mutually correlated drought characteristics (e.g., severity, duration and peak). The historical and future characteristics of drought were estimated by applying a trivariate copula probability distribution, which effectively describes the joint distribution and dependence of drought severity, duration, and peak. The associated prediction uncertainty, related to the parameters of the joint probability and to the climate projections, was evaluated using a Bayesian approach with bootstrap resampling. For the climate change scenarios, two representative concentration pathways (RCP4.5 and RCP8.5) from the University of Idaho's Multivariate Adaptive Constructed Analogs (MACA) database were considered. The results from the study are expected to provide useful information towards drought risk management in the YRB under anticipated climate changes.
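    The resampling part of the uncertainty analysis can be sketched generically: percentile-bootstrap intervals for any fitted parameter vector. This is a bare-bones stand-in for the paper's Bayesian analysis; fit_fn is a placeholder for, e.g., a copula-parameter estimator.

```python
import numpy as np

def bootstrap_ci(data, fit_fn, n_boot=1000, alpha=0.05, rng=None):
    """Percentile bootstrap confidence interval for a parameter estimator:
    refit on row-resampled datasets and take empirical quantiles."""
    rng = np.random.default_rng(rng)
    n = len(data)
    boots = np.array([fit_fn(data[rng.integers(0, n, size=n)])
                      for _ in range(n_boot)])
    return np.quantile(boots, [alpha / 2, 1 - alpha / 2], axis=0)

# Toy example: uncertainty of the correlation between drought severity
# and duration (columns of a hypothetical n-by-2 array)
rng = np.random.default_rng(0)
data = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=60)
print(bootstrap_ci(data, lambda d: np.corrcoef(d.T)[0, 1]))
```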

  5. Assessment of Coastal and Urban Flooding Hazards Applying Extreme Value Analysis and Multivariate Statistical Techniques: A Case Study in Elwood, Australia

    NASA Astrophysics Data System (ADS)

    Guimarães Nobre, Gabriela; Arnbjerg-Nielsen, Karsten; Rosbjerg, Dan; Madsen, Henrik

    2016-04-01

    Traditionally, flood risk assessment studies have been carried out from a univariate frequency analysis perspective. However, statistical dependence between hydrological variables, such as extreme rainfall and extreme sea surge, is plausible, since both variables are to some extent driven by common meteorological conditions. Aiming to overcome this limitation, multivariate statistical techniques have the potential to combine different sources of flooding in the investigation. The aim of this study was to apply a range of statistical methodologies for analyzing combined extreme hydrological variables that can lead to coastal and urban flooding. The study area is the Elwood Catchment, a highly urbanized catchment located in the city of Port Phillip, Melbourne, Australia. The first part of the investigation dealt with the marginal extreme value distributions. Two approaches to extracting extreme value series were applied (annual maximum and partial duration series), and different probability distribution functions were fitted to the observed samples. Results obtained using the Generalized Pareto distribution demonstrate the ability of the Pareto family to model the extreme events. Advancing into multivariate extreme value analysis, an investigation regarding the asymptotic properties of extremal dependence was first carried out. As a weak positive asymptotic dependence between the bivariate extreme pairs was found, the conditional method proposed by Heffernan and Tawn (2004) was chosen. This approach is suitable for modeling bivariate extreme values which are relatively unlikely to occur together. The results show that the probability of an extreme sea surge occurring during a one-hour-intensity extreme precipitation event (or vice versa) can be twice as great as would be estimated assuming independent events. Therefore, presuming independence between these two variables would result in severe underestimation of the flooding risk in the study area.
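    The marginal step, fitting a Generalized Pareto distribution to partial-duration (peaks-over-threshold) exceedances and reading off return levels, can be sketched as follows; the series, threshold choice and record length are assumptions.

```python
import numpy as np
from scipy import stats

def fit_gpd_pot(series, threshold, years):
    """Fit a Generalized Pareto to threshold exceedances (POT) and return a
    T-year return-level function (formula assumes shape xi != 0; the
    xi -> 0 limit replaces the power law with a logarithm)."""
    exceed = series[series > threshold] - threshold
    rate = len(exceed) / years                    # mean exceedances per year
    xi, _, sigma = stats.genpareto.fit(exceed, floc=0)
    return lambda T: threshold + sigma / xi * ((rate * T) ** xi - 1.0)

# Hypothetical hourly rainfall series (mm) over 30 years
rng = np.random.default_rng(1)
rain = rng.gamma(0.3, 4.0, size=30 * 365 * 24)
return_level = fit_gpd_pot(rain, threshold=np.quantile(rain, 0.999), years=30)
print(return_level(100.0))   # 100-year return level
```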

  6. Esophageal wall dose-surface maps do not improve the predictive performance of a multivariable NTCP model for acute esophageal toxicity in advanced stage NSCLC patients treated with intensity-modulated (chemo-)radiotherapy.

    PubMed

    Dankers, Frank; Wijsman, Robin; Troost, Esther G C; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L

    2017-05-07

    In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy, esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived, and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC = 0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.
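    A hedged sketch of the modeling step described above: a multivariable logistic NTCP model with MED and one clinical covariate, evaluated by AUC. The data, covariate choice and coefficients below are synthetic stand-ins, not the paper's fitted model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Synthetic stand-in cohort: MED (Gy) plus one binary clinical covariate
rng = np.random.default_rng(0)
n = 149
med = rng.normal(25, 8, n)                      # mean esophageal dose (Gy)
chemo = rng.integers(0, 2, n)                   # e.g., concurrent chemotherapy
logit = -6.0 + 0.15 * med + 1.2 * chemo         # assumed toxicity model
aet = rng.random(n) < 1 / (1 + np.exp(-logit))  # simulated AET grade >= 2

X = np.column_stack([med, chemo])
model = LogisticRegression().fit(X, aet)
ntcp = model.predict_proba(X)[:, 1]             # per-patient NTCP estimates
print(roc_auc_score(aet, ntcp))                 # apparent (resubstitution) AUC
```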

  7. Esophageal wall dose-surface maps do not improve the predictive performance of a multivariable NTCP model for acute esophageal toxicity in advanced stage NSCLC patients treated with intensity-modulated (chemo-)radiotherapy

    NASA Astrophysics Data System (ADS)

    Dankers, Frank; Wijsman, Robin; Troost, Esther G. C.; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L.

    2017-05-01

    In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy, esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived, and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC = 0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.

  8. Use of collateral information to improve LANDSAT classification accuracies

    NASA Technical Reports Server (NTRS)

    Strahler, A. H. (Principal Investigator)

    1981-01-01

    Methods to improve LANDSAT classification accuracies were investigated, including: (1) the use of prior probabilities in maximum likelihood classification as a methodology to integrate discrete collateral data with continuously measured image density variables; (2) the use of the logit classifier as an alternative to multivariate normal classification that permits mixing both continuous and categorical variables in a single model and fits empirical distributions of observations more closely than the multivariate normal density function; and (3) the use of collateral data in a geographic information system, exercised to model a desired output information layer as a function of input layers of raster-format collateral and image database layers.
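    Item (1) amounts to weighting per-class multivariate-normal likelihoods by prior probabilities derived from collateral data; a minimal sketch with assumed class statistics:

```python
import numpy as np
from scipy.stats import multivariate_normal

def classify_with_priors(pixels, means, covs, priors):
    """Maximum a posteriori classification: per-class multivariate-normal
    likelihoods weighted by prior probabilities from collateral data."""
    post = np.array([p * multivariate_normal.pdf(pixels, m, c)
                     for m, c, p in zip(means, covs, priors)])
    return np.argmax(post, axis=0)

# Two hypothetical land-cover classes in a 4-band feature space
rng = np.random.default_rng(2)
means = [np.zeros(4), np.full(4, 1.5)]
covs = [np.eye(4), np.eye(4)]
pixels = rng.multivariate_normal(means[1], covs[1], size=5)
print(classify_with_priors(pixels, means, covs, priors=[0.8, 0.2]))
```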

  9. The use of copulas to practical estimation of multivariate stochastic differential equation mixed effects models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rupšys, P.

    A system of stochastic differential equations (SDEs) with mixed-effects parameters and a multivariate normal copula density function was used to develop a tree height model for Scots pine trees in Lithuania. A two-step maximum likelihood parameter estimation method is used and computational guidelines are given. After fitting the conditional probability density functions to outside-bark diameter at breast height and total tree height, a bivariate normal copula distribution model was constructed. Predictions from the mixed-effects parameter SDE tree height model calculated during this research were compared to regression tree height equations. The results are implemented in the symbolic computational language MAPLE.

  10. Understanding characteristics in multivariate traffic flow time series from complex network structure

    NASA Astrophysics Data System (ADS)

    Yan, Ying; Zhang, Shen; Tang, Jinjun; Wang, Xiaofei

    2017-07-01

    Discovering dynamic characteristics in traffic flow is a significant step in designing effective traffic management and control strategies for relieving congestion in urban cities. A new method based on complex network theory is proposed to study multivariate traffic flow time series. The data were collected from loop detectors on a freeway over one year. In order to construct a complex network from the original traffic flow, a weighted Frobenius norm is adopted to estimate similarity between multivariate time series, and Principal Component Analysis is implemented to determine the weights. We discuss how to select the optimal critical threshold for networks at different hours in terms of the cumulative probability distribution of degree. Furthermore, two statistical properties of the networks, normalized network structure entropy and cumulative probability of degree, are utilized to explore the hourly variation in traffic flow. The results demonstrate that these two statistical quantities express patterns similar to traffic flow parameters, with morning and evening peak hours. Accordingly, we detect three traffic states, trough, peak and transitional hours, according to the correlation between the two aforementioned properties. The classification results faithfully represent hourly fluctuations in traffic flow, as shown by analyzing annual average hourly values of traffic volume, occupancy and speed in the corresponding hours.
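    A toy version of the network-construction step: binarize a similarity matrix at a critical threshold and compute a normalized network structure entropy from the degree sequence (one common definition; the paper's exact normalization may differ).

```python
import numpy as np

def network_from_similarity(S, threshold):
    """Binarize a similarity matrix into an undirected adjacency matrix."""
    A = (S >= threshold).astype(int)
    np.fill_diagonal(A, 0)
    return A

def structure_entropy(A):
    """Degree-based network structure entropy, normalized by ln(N)."""
    k = A.sum(axis=1).astype(float)
    p = k / k.sum()
    p = p[p > 0]
    return -(p * np.log(p)).sum() / np.log(len(A))

rng = np.random.default_rng(3)
X = rng.random((20, 20))
S = (X + X.T) / 2                      # symmetric toy similarity matrix
print(structure_entropy(network_from_similarity(S, threshold=0.5)))
```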

  11. Multivariate non-normally distributed random variables in climate research - introduction to the copula approach

    NASA Astrophysics Data System (ADS)

    Schölzel, C.; Friederichs, P.

    2008-10-01

    Probability distributions of multivariate random variables are generally more complex than their univariate counterparts, owing to possible nonlinear dependence between the random variables. One approach to this problem is the use of copulas, which have become popular over recent years, especially in fields like econometrics, finance, risk management, and insurance. Since this newly emerging field includes various practices, a controversial discussion, and a vast literature, it is difficult to get an overview. The aim of this paper is therefore to provide a brief overview of copulas for application in meteorology and climate research. We examine the advantages and disadvantages compared to alternative approaches such as mixture models, summarize the current problem of goodness-of-fit (GOF) tests for copulas, and discuss the connection with multivariate extremes. An application to station data shows the simplicity and the capabilities as well as the limitations of this approach. Observations of daily precipitation and temperature are fitted to a bivariate model and demonstrate that copulas are a valuable complement to the commonly used methods.
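    As a concrete illustration of the copula approach, the sketch below fits a bivariate Gaussian copula by rank-transforming each margin to normal scores; the paper itself discusses a wider family of copulas and GOF tests.

```python
import numpy as np
from scipy import stats

def fit_gaussian_copula(x, y):
    """Fit a bivariate Gaussian copula by transforming each margin to
    normal scores through its empirical CDF (pseudo-observations)."""
    n = len(x)
    u = stats.rankdata(x) / (n + 1)          # pseudo-observations in (0, 1)
    v = stats.rankdata(y) / (n + 1)
    z1, z2 = stats.norm.ppf(u), stats.norm.ppf(v)
    return np.corrcoef(z1, z2)[0, 1]         # copula correlation parameter

# Toy daily temperature and precipitation with mild dependence
rng = np.random.default_rng(4)
t = rng.normal(10, 5, 1000)
p = rng.gamma(2, 1, 1000) + 0.1 * t
print(fit_gaussian_copula(p, t))
```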

  12. Bayesian probability of success for clinical trials using historical data

    PubMed Central

    Ibrahim, Joseph G.; Chen, Ming-Hui; Lakshminarayanan, Mani; Liu, Guanghan F.; Heyse, Joseph F.

    2015-01-01

    Developing sophisticated statistical methods for go/no-go decisions is crucial for clinical trials, as planning phase III or phase IV trials is costly and time consuming. In this paper, we develop a novel Bayesian methodology for determining the probability of success of a treatment regimen on the basis of the current data of a given trial. We introduce a new criterion for calculating the probability of success that allows for inclusion of covariates as well as allowing for historical data based on the treatment regimen, and patient characteristics. A new class of prior distributions and covariate distributions is developed to achieve this goal. The methodology is quite general and can be used with univariate or multivariate continuous or discrete data, and it generalizes Chuang-Stein’s work. This methodology will be invaluable for informing the scientist on the likelihood of success of the compound, while including the information of covariates for patient characteristics in the trial population for planning future pre-market or post-market trials. PMID:25339499
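    A univariate, assurance-style sketch of the probability-of-success idea: average the power of a future two-arm trial over the posterior of the treatment effect. The paper's criterion is more general (covariates, historical-data priors); all numbers here are hypothetical.

```python
import numpy as np
from scipy import stats

def probability_of_success(post_mean, post_sd, n_per_arm, sigma,
                           alpha=0.025, n_sim=100_000, rng=None):
    """Monte Carlo assurance: draw the treatment effect from its posterior,
    compute the future trial's power given that effect, and average."""
    rng = np.random.default_rng(rng)
    theta = rng.normal(post_mean, post_sd, n_sim)    # posterior draws
    se = sigma * np.sqrt(2.0 / n_per_arm)            # SE of future estimate
    z_crit = stats.norm.ppf(1 - alpha)
    power = stats.norm.cdf(theta / se - z_crit)      # power given theta
    return power.mean()

print(probability_of_success(post_mean=0.3, post_sd=0.15,
                             n_per_arm=200, sigma=1.0))
```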

  13. Bayesian probability of success for clinical trials using historical data.

    PubMed

    Ibrahim, Joseph G; Chen, Ming-Hui; Lakshminarayanan, Mani; Liu, Guanghan F; Heyse, Joseph F

    2015-01-30

    Developing sophisticated statistical methods for go/no-go decisions is crucial for clinical trials, as planning phase III or phase IV trials is costly and time consuming. In this paper, we develop a novel Bayesian methodology for determining the probability of success of a treatment regimen on the basis of the current data of a given trial. We introduce a new criterion for calculating the probability of success that allows for inclusion of covariates as well as allowing for historical data based on the treatment regimen, and patient characteristics. A new class of prior distributions and covariate distributions is developed to achieve this goal. The methodology is quite general and can be used with univariate or multivariate continuous or discrete data, and it generalizes Chuang-Stein's work. This methodology will be invaluable for informing the scientist on the likelihood of success of the compound, while including the information of covariates for patient characteristics in the trial population for planning future pre-market or post-market trials. Copyright © 2014 John Wiley & Sons, Ltd.

  14. Idealized models of the joint probability distribution of wind speeds

    NASA Astrophysics Data System (ADS)

    Monahan, Adam H.

    2018-05-01

    The joint probability distribution of wind speeds at two separate locations in space or points in time completely characterizes the statistical dependence of these two quantities, providing more information than linear measures such as correlation. In this study, we consider two models of the joint distribution of wind speeds obtained from idealized models of the dependence structure of the horizontal wind velocity components. The bivariate Rice distribution follows from assuming that the wind components have Gaussian and isotropic fluctuations. The bivariate Weibull distribution arises from power law transformations of wind speeds corresponding to vector components with Gaussian, isotropic, mean-zero variability. Maximum likelihood estimates of these distributions are compared using wind speed data from the mid-troposphere, from different altitudes at the Cabauw tower in the Netherlands, and from scatterometer observations over the sea surface. While the bivariate Rice distribution is more flexible and can represent a broader class of dependence structures, the bivariate Weibull distribution is mathematically simpler and may be more convenient in many applications. The complexity of the mathematical expressions obtained for the joint distributions suggests that the development of explicit functional forms for multivariate speed distributions from distributions of the components will not be practical for more complicated dependence structure or more than two speed variables.
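    The bivariate Rice construction can be simulated directly: each speed is the modulus of a mean-shifted isotropic Gaussian wind vector, with components correlated across the two sites. Parameter values are illustrative.

```python
import numpy as np

def bivariate_rice_speeds(n, mean1, mean2, sd, rho, rng=None):
    """Simulate wind speeds at two sites: speeds are moduli of mean-shifted,
    isotropic Gaussian velocity vectors whose fluctuations are correlated
    across sites with coefficient rho."""
    rng = np.random.default_rng(rng)
    cov = np.array([[1.0, rho], [rho, 1.0]]) * sd ** 2
    u = rng.multivariate_normal([0, 0], cov, size=n)   # zonal fluctuations
    v = rng.multivariate_normal([0, 0], cov, size=n)   # meridional fluctuations
    w1 = np.hypot(u[:, 0] + mean1[0], v[:, 0] + mean1[1])
    w2 = np.hypot(u[:, 1] + mean2[0], v[:, 1] + mean2[1])
    return w1, w2

w1, w2 = bivariate_rice_speeds(50_000, (5, 2), (4, 3), sd=2.0, rho=0.6)
print(np.corrcoef(w1, w2)[0, 1])    # induced speed-speed correlation
```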

  15. A constrained multinomial Probit route choice model in the metro network: Formulation, estimation and application

    PubMed Central

    Zhang, Yongsheng; Wei, Heng; Zheng, Kangning

    2017-01-01

    Considering that metro network expansion provides more alternative routes, it is attractive to integrate the impacts of the route set and of the interdependency among alternative routes on route choice probability into route choice modeling. Therefore, the formulation, estimation and application of a constrained multinomial probit (CMNP) route choice model in the metro network are carried out in this paper. The utility function is formulated with three components: the compensatory component is a function of influencing factors; the non-compensatory component measures the impacts of the route set on utility; and, following a multivariate normal distribution, the covariance of the error component is structured into three parts, representing the correlation among routes, the transfer variance of a route, and the unobserved variance, respectively. Given the multidimensional integrals of the multivariate normal probability density function, the CMNP model is rewritten in a hierarchical Bayes form, and a Metropolis-Hastings sampling based Markov chain Monte Carlo approach is constructed to estimate all parameters. Based on Guangzhou Metro data, reliable estimation results are obtained. Furthermore, the proposed CMNP model shows good forecasting performance for the calculation of route choice probabilities and good applicability for transfer flow volume prediction. PMID:28591188
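    The probit choice probabilities themselves have no closed form, but a plain frequency simulator makes the model concrete: sample correlated utility errors and count argmax choices. This illustrates the probit kernel only, not the paper's hierarchical Bayes estimator.

```python
import numpy as np

def mnp_choice_probs(V, Sigma, n_sim=200_000, rng=None):
    """Monte Carlo choice probabilities for a multinomial probit model:
    utilities U = V + e with e ~ N(0, Sigma); return the frequency with
    which each alternative attains the maximum utility."""
    rng = np.random.default_rng(rng)
    e = rng.multivariate_normal(np.zeros(len(V)), Sigma, size=n_sim)
    choice = np.argmax(V + e, axis=1)
    return np.bincount(choice, minlength=len(V)) / n_sim

# Three hypothetical metro routes; the first two share a transfer,
# hence correlated errors
V = np.array([1.0, 0.8, 0.5])
Sigma = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
print(mnp_choice_probs(V, Sigma))
```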

  16. A randomised approach for NARX model identification based on a multivariate Bernoulli distribution

    NASA Astrophysics Data System (ADS)

    Bianchi, F.; Falsone, A.; Prandini, M.; Piroddi, L.

    2017-04-01

    The identification of polynomial NARX models is typically performed by incremental model building techniques. These methods assess the importance of each regressor based on the evaluation of partial individual models, which may ultimately lead to erroneous model selections. A more robust assessment of the significance of a specific model term can be obtained by considering ensembles of models, as done by the RaMSS algorithm. In that context, the identification task is formulated in a probabilistic fashion and a Bernoulli distribution is employed to represent the probability that a regressor belongs to the target model. Then, samples of the model distribution are collected to gather reliable information to update it, until convergence to a specific model. The basic RaMSS algorithm employs multiple independent univariate Bernoulli distributions associated to the different candidate model terms, thus overlooking the correlations between different terms, which are typically important in the selection process. Here, a multivariate Bernoulli distribution is employed, in which the sampling of a given term is conditioned by the sampling of the others. The added complexity inherent in considering the regressor correlation properties is more than compensated by the achievable improvements in terms of accuracy of the model selection process.
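    One simple way to realize a multivariate Bernoulli with prescribed marginals and correlated components is to threshold a correlated Gaussian vector; this is an illustrative construction (the latent correlation only approximates the Bernoulli correlation), not the update rule used by RaMSS.

```python
import numpy as np
from scipy import stats

def sample_mv_bernoulli(p, R, n, rng=None):
    """Sample correlated regressor-inclusion indicators by thresholding a
    correlated Gaussian vector; each margin has P(term included) = p_j."""
    rng = np.random.default_rng(rng)
    z = rng.multivariate_normal(np.zeros(len(p)), R, size=n)
    return (z < stats.norm.ppf(p)).astype(int)

p = np.array([0.7, 0.4, 0.4])                # inclusion probabilities
R = np.array([[1.0, -0.6, 0.0],
              [-0.6, 1.0, 0.0],
              [0.0, 0.0, 1.0]])              # competing terms anti-correlated
print(sample_mv_bernoulli(p, R, n=5))
```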

  17. Modeling Multi-Variate Gaussian Distributions and Analysis of Higgs Boson Couplings with the ATLAS Detector

    NASA Astrophysics Data System (ADS)

    Krohn, Olivia; Armbruster, Aaron; Gao, Yongsheng; Atlas Collaboration

    2017-01-01

    Software tools developed for the purpose of modeling CERN LHC pp collision data to aid in its interpretation are presented. Some measurements are not adequately described by a Gaussian distribution; thus an interpretation assuming Gaussian uncertainties will inevitably introduce bias, necessitating analytical tools to recreate and evaluate non-Gaussian features. One example is the measurements of Higgs boson production rates in different decay channels, and the interpretation of these measurements. The ratios of data to Standard Model expectations (μ) for five arbitrary signals were modeled by building five Poisson distributions with mixed signal contributions such that the measured values of μ are correlated. Algorithms were designed to recreate probability distribution functions of μ as multi-variate Gaussians, where the standard deviation (σ) and correlation coefficients (ρ) are parametrized. There was good success with modeling 1-D likelihood contours of μ, and the multi-dimensional distributions were well modeled within 1σ, but the model began to diverge beyond 2σ due to unmerited assumptions in developing ρ. Future plans to improve the algorithms and develop a user-friendly analysis package will also be discussed. NSF International Research Experiences for Students

  18. Geotechnical parameter spatial distribution stochastic analysis based on multi-precision information assimilation

    NASA Astrophysics Data System (ADS)

    Wang, C.; Rubin, Y.

    2014-12-01

    The spatial distribution of an important geotechnical parameter, the compression modulus Es, contributes considerably to the understanding of the underlying geological processes and to the adequate assessment of the mechanical effects of Es on the differential settlement of large continuous structure foundations. These analyses should be derived using an assimilating approach that combines in-situ static cone penetration tests (CPT) with borehole experiments. To achieve such a task, the Es distribution of a silty clay stratum in region A of the China Expo Center (Shanghai) is studied using the Bayesian-maximum entropy method. This method rigorously and efficiently integrates multi-precision data from different geotechnical investigations and their sources of uncertainty. Individual CPT soundings were modeled as rational probability density curves using maximum entropy theory. The spatial prior multivariate probability density function (PDF) and the likelihood PDF of the CPT positions were built from the borehole experiments and the potential value at the prediction point; then, after numerical integration over the CPT probability density curves, the posterior probability density curve of the prediction point was calculated within the Bayesian inverse interpolation framework. Results were compared between the Gaussian sequential stochastic simulation and Bayesian methods. Differences between normally distributed single CPT soundings and simulated probability density curves based on maximum entropy theory were also discussed. It is shown that the study of Es spatial distributions can be improved by properly incorporating CPT sampling variation into the interpolation process, and more informative estimates are generated by considering CPT uncertainty at the estimation points. The calculations illustrate the significance of stochastic Es characterization in a stratum and identify limitations associated with inadequate geostatistical interpolation techniques. These characterization results provide a multi-precision information assimilation method for other geotechnical parameters.

  19. Spatial estimation from remotely sensed data via empirical Bayes models

    NASA Technical Reports Server (NTRS)

    Hill, J. R.; Hinkley, D. V.; Kostal, H.; Morris, C. N.

    1984-01-01

    Multichannel satellite image data, available as LANDSAT imagery, are recorded as a multivariate time series (four channels, multiple passovers) in two spatial dimensions. The application of parametric empirical Bayes theory to classification of, and estimating the probability of, each crop type at each of a large number of pixels is considered. This theory involves both the probability distribution of imagery data, conditional on crop types, and the prior spatial distribution of crop types. For the latter Markov models indexed by estimable parameters are used. A broad outline of the general theory reveals several questions for further research. Some detailed results are given for the special case of two crop types when only a line transect is analyzed. Finally, the estimation of an underlying continuous process on the lattice is discussed which would be applicable to such quantities as crop yield.

  20. A probabilistic multi-criteria decision making technique for conceptual and preliminary aerospace systems design

    NASA Astrophysics Data System (ADS)

    Bandte, Oliver

    It has always been the intention of systems engineering to invent or produce the best product possible. Many design techniques have been introduced over the course of decades that try to fulfill this intention. Unfortunately, no technique has succeeded in combining multi-criteria decision making with probabilistic design. The design technique developed in this thesis, the Joint Probabilistic Decision Making (JPDM) technique, successfully overcomes this deficiency by generating a multivariate probability distribution that serves in conjunction with a criterion value range of interest as a universally applicable objective function for multi-criteria optimization and product selection. This new objective function constitutes a meaningful metric, called Probability of Success (POS), that allows the customer or designer to make a decision based on the chance of satisfying the customer's goals. In order to incorporate a joint probabilistic formulation into the systems design process, two algorithms are created that allow for an easy implementation into a numerical design framework: the (multivariate) Empirical Distribution Function and the Joint Probability Model. The Empirical Distribution Function estimates the probability that an event occurred by counting how many times it occurred in a given sample. The Joint Probability Model on the other hand is an analytical parametric model for the multivariate joint probability. It comprises the product of the univariate criterion distributions, generated by the traditional probabilistic design process, multiplied with a correlation function that is based on available correlation information between pairs of random variables. JPDM is an excellent tool for multi-objective optimization and product selection, because of its ability to transform disparate objectives into a single figure of merit, the likelihood of successfully meeting all goals or POS. The advantage of JPDM over other multi-criteria decision making techniques is that POS constitutes a single optimizable function or metric that enables a comparison of all alternative solutions on an equal basis. Hence, POS allows for the use of any standard single-objective optimization technique available and simplifies a complex multi-criteria selection problem into a simple ordering problem, where the solution with the highest POS is best. By distinguishing between controllable and uncontrollable variables in the design process, JPDM can account for the uncertain values of the uncontrollable variables that are inherent to the design problem, while facilitating an easy adjustment of the controllable ones to achieve the highest possible POS. Finally, JPDM's superiority over current multi-criteria decision making techniques is demonstrated with an optimization of a supersonic transport concept and ten contrived equations as well as a product selection example, determining an airline's best choice among Boeing's B-747, B-777, Airbus' A340, and a Supersonic Transport. The optimization examples demonstrate JPDM's ability to produce a better solution with a higher POS than an Overall Evaluation Criterion or Goal Programming approach. Similarly, the product selection example demonstrates JPDM's ability to produce a better solution with a higher POS and different ranking than the Overall Evaluation Criterion or Technique for Order Preferences by Similarity to the Ideal Solution (TOPSIS) approach.
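    A minimal sketch of the Empirical Distribution Function route to POS described above: the fraction of sampled criterion vectors that satisfy every goal simultaneously. The criteria, target ranges and covariance below are hypothetical.

```python
import numpy as np

def probability_of_success(samples, targets):
    """Empirical POS: fraction of sampled criterion vectors meeting all
    goals at once; targets is a list of (low, high) ranges per criterion."""
    ok = np.ones(len(samples), dtype=bool)
    for j, (lo, hi) in enumerate(targets):
        ok &= (samples[:, j] >= lo) & (samples[:, j] <= hi)
    return ok.mean()

# Hypothetical correlated design criteria: cost, range, noise level
rng = np.random.default_rng(5)
crit = rng.multivariate_normal(
    [100, 5000, 80],
    [[25, -200, 5], [-200, 90000, -30], [5, -30, 9]],
    size=100_000)
print(probability_of_success(
    crit, targets=[(0, 105), (4800, np.inf), (0, 83)]))
```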

  1. Bayesian bivariate meta-analysis of correlated effects: Impact of the prior distributions on the between-study correlation, borrowing of strength, and joint inferences

    PubMed Central

    Bujkiewicz, Sylwia; Riley, Richard D

    2016-01-01

    Multivariate random-effects meta-analysis allows the joint synthesis of correlated results from multiple studies, for example, for multiple outcomes or multiple treatment groups. In a Bayesian univariate meta-analysis of one endpoint, the importance of specifying a sensible prior distribution for the between-study variance is well understood. However, in multivariate meta-analysis, there is little guidance about the choice of prior distributions for the variances or, crucially, the between-study correlation, ρB; for the latter, researchers often use a Uniform(−1,1) distribution assuming it is vague. In this paper, an extensive simulation study and a real illustrative example is used to examine the impact of various (realistically) vague prior distributions for ρB and the between-study variances within a Bayesian bivariate random-effects meta-analysis of two correlated treatment effects. A range of diverse scenarios are considered, including complete and missing data, to examine the impact of the prior distributions on posterior results (for treatment effect and between-study correlation), amount of borrowing of strength, and joint predictive distributions of treatment effectiveness in new studies. Two key recommendations are identified to improve the robustness of multivariate meta-analysis results. First, the routine use of a Uniform(−1,1) prior distribution for ρB should be avoided, if possible, as it is not necessarily vague. Instead, researchers should identify a sensible prior distribution, for example, by restricting values to be positive or negative as indicated by prior knowledge. Second, it remains critical to use sensible (e.g. empirically based) prior distributions for the between-study variances, as an inappropriate choice can adversely impact the posterior distribution for ρB, which may then adversely affect inferences such as joint predictive probabilities. These recommendations are especially important with a small number of studies and missing data. PMID:26988929

  2. Multi-hazard Assessment and Scenario Toolbox (MhAST): A Framework for Analyzing Compounding Effects of Multiple Hazards

    NASA Astrophysics Data System (ADS)

    Sadegh, M.; Moftakhari, H.; AghaKouchak, A.

    2017-12-01

    Many natural hazards are driven by multiple forcing variables, and concurrent/consecutive extreme events significantly increase the risk of infrastructure/system failure. It is common practice to use univariate analysis based upon a perceived ruling driver to estimate design quantiles and/or return periods of extreme events. A multivariate analysis, however, permits modeling the simultaneous occurrence of multiple forcing variables. In this presentation, we introduce the Multi-hazard Assessment and Scenario Toolbox (MhAST), which comprehensively analyzes marginal and joint probability distributions of natural hazards. MhAST also offers a wide range of scenarios of return periods and design levels and their likelihoods. The contribution of this study is four-fold: 1. comprehensive analysis of the marginal and joint probability of multiple drivers through 17 continuous distributions and 26 copulas; 2. multiple scenario analysis of concurrent extremes based upon the most likely joint occurrence, one ruling variable, and weighted random sampling of joint occurrences with similar exceedance probabilities; 3. weighted average scenario analysis based on an expected event; and 4. uncertainty analysis of the most likely joint occurrence scenario using a Bayesian framework.

  3. The discrete Laplace exponential family and estimation of Y-STR haplotype frequencies.

    PubMed

    Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels

    2013-07-21

    Estimating haplotype frequencies is important in e.g. forensic genetics, where the frequencies are needed to calculate the likelihood ratio for the evidential weight of a DNA profile found at a crime scene. Estimation is naturally based on a population model, motivating the investigation of the Fisher-Wright model of evolution for haploid lineage DNA markers. An exponential family (a class of probability distributions that is well understood in probability theory such that inference is easily made by using existing software) called the 'discrete Laplace distribution' is described. We illustrate how well the discrete Laplace distribution approximates a more complicated distribution that arises by investigating the well-known population genetic Fisher-Wright model of evolution by a single-step mutation process. It was shown how the discrete Laplace distribution can be used to estimate haplotype frequencies for haploid lineage DNA markers (such as Y-chromosomal short tandem repeats), which in turn can be used to assess the evidential weight of a DNA profile found at a crime scene. This was done by making inference in a mixture of multivariate, marginally independent, discrete Laplace distributions using the EM algorithm to estimate the probabilities of membership of a set of unobserved subpopulations. The discrete Laplace distribution can be used to estimate haplotype frequencies with lower prediction error than other existing estimators. Furthermore, the calculations could be performed on a normal computer. This method was implemented in the freely available open source software R that is supported on Linux, MacOS and MS Windows. Copyright © 2013 Elsevier Ltd. All rights reserved.

  4. Optimal moment determination in POME-copula based hydrometeorological dependence modelling

    NASA Astrophysics Data System (ADS)

    Liu, Dengfeng; Wang, Dong; Singh, Vijay P.; Wang, Yuankun; Wu, Jichun; Wang, Lachun; Zou, Xinqing; Chen, Yuanfang; Chen, Xi

    2017-07-01

    Copulas have been commonly applied to multivariate modelling in various fields, where marginal distribution inference is a key element. To develop a flexible, unbiased mathematical inference framework for hydrometeorological multivariate applications, the principle of maximum entropy (POME) is being increasingly coupled with copulas. However, previous POME-based studies have generally not considered the determination of optimal moment constraints. The main contribution of this study is the determination of optimal moments for POME, yielding a coupled optimal moment-POME-copula framework for modelling hydrometeorological multivariate events. In this framework, margins (marginal distributions) are derived with the use of POME, subject to optimal moment constraints. Then, various candidate copulas are constructed according to the derived margins, and finally the most probable one is determined based on goodness-of-fit statistics. This optimal moment-POME-copula framework is applied to model the dependence patterns of three types of hydrometeorological events: (i) single-site streamflow-water level; (ii) multi-site streamflow; and (iii) multi-site precipitation, with data collected from Yichang and Hankou in the Yangtze River basin, China. Results indicate that the optimal-moment POME is more accurate in margin fitting and that the corresponding copulas show good statistical performance in correlation simulation. Moreover, the derived copulas, capturing patterns that traditional correlation coefficients cannot reflect, provide an efficient approach for other applied scenarios in hydrometeorological multivariate modelling.
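    The POME step can be sketched numerically: the maximum-entropy density subject to raw-moment constraints has exponential-family form, and its Lagrange multipliers minimize a convex dual (log-partition plus the multiplier-moment inner product). The bounded support and toy data are assumptions.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize

def pome_margin(moments, support=(0.0, 50.0)):
    """Maximum-entropy (POME) density on a bounded support matching the
    given raw moments: f(x) = exp(-sum_i lam_i * x**(i+1)) / Z."""
    def exponent(x, lam):
        return sum(l * x ** (i + 1) for i, l in enumerate(lam))

    def logZ(lam):
        val, _ = quad(lambda x: np.exp(-exponent(x, lam)), *support)
        return np.log(val)

    # Convex dual: minimizing logZ(lam) + lam . moments enforces the
    # moment constraints at the optimum
    res = minimize(lambda lam: logZ(lam) + np.dot(lam, moments),
                   np.zeros(len(moments)), method="Nelder-Mead")
    lam, Z = res.x, np.exp(logZ(res.x))
    return lambda x: np.exp(-exponent(x, lam)) / Z

# Fit to the first two raw moments of a toy streamflow sample
rng = np.random.default_rng(9)
q = rng.gamma(3.0, 2.0, size=500)
f = pome_margin([q.mean(), (q ** 2).mean()])
print(quad(f, 0, 50)[0])   # ~1.0: the fitted density integrates to one
```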

  5. A New Approach in Generating Meteorological Forecasts for Ensemble Streamflow Forecasting using Multivariate Functions

    NASA Astrophysics Data System (ADS)

    Khajehei, S.; Madadgar, S.; Moradkhani, H.

    2014-12-01

    The reliability and accuracy of hydrological predictions are subject to various sources of uncertainty, including meteorological forcing, initial conditions, model parameters and model structure. To reduce the total uncertainty in hydrological applications, one approach is to reduce the uncertainty in the meteorological forcing by using statistical methods based on conditional probability density functions (pdfs). However, one of the requirements of current methods is to assume a Gaussian distribution for the marginal distributions of the observed and modeled meteorology. Here we propose a Bayesian approach based on copula functions to develop the conditional distribution of precipitation forecasts needed to drive a hydrologic model for a sub-basin in the Columbia River Basin. Copula functions are introduced as an alternative approach for capturing the uncertainties related to meteorological forcing. Copulas are multivariate joint distributions of univariate marginal distributions, capable of modeling the joint behavior of variables with any level of correlation and dependency. The method is applied to the monthly CPC forecast at 0.25x0.25 degree resolution to reproduce the PRISM dataset over 1970-2000. Results are compared with the Ensemble Pre-Processor approach, a common procedure used by National Weather Service river forecast centers, in reproducing the observed climatology during a ten-year verification period (2000-2010).

  6. Small-Noise Analysis and Symmetrization of Implicit Monte Carlo Samplers

    DOE PAGES

    Goodman, Jonathan; Lin, Kevin K.; Morzfeld, Matthias

    2015-07-06

    Implicit samplers are algorithms for producing independent, weighted samples from multivariate probability distributions. These are often applied in Bayesian data assimilation algorithms. We use Laplace asymptotic expansions to analyze two implicit samplers in the small noise regime. Our analysis suggests a symmetrization of the algorithms that leads to improved implicit sampling schemes at a relatively small additional cost. Here, computational experiments confirm the theory and show that symmetrization is effective for small noise sampling problems.

  7. Gaussian closure technique applied to the hysteretic Bouc model with non-zero mean white noise excitation

    NASA Astrophysics Data System (ADS)

    Waubke, Holger; Kasess, Christian H.

    2016-11-01

    Devices that emit structure-borne sound are commonly decoupled by elastic components to shield the environment from acoustical noise and vibrations. The elastic elements often have a hysteretic behavior that is typically neglected. In order to take hysteretic behavior into account, Bouc developed a differential equation for such materials, especially joints made of rubber or equipped with dampers. In this work, the Bouc model is solved by means of the Gaussian closure technique based on the Kolmogorov equation. Kolmogorov developed a method to derive probability density functions for arbitrary explicit first-order vector differential equations under white noise excitation, using a partial differential equation for a multivariate conditional probability distribution. Up to now, no analytical solution of the Kolmogorov equation in conjunction with the Bouc model exists; therefore, a wide range of approximate solutions, especially statistical linearization, was developed. Using the Gaussian closure technique, an approximation to the Kolmogorov equation that assumes a multivariate Gaussian distribution, an analytic solution is derived in this paper for the Bouc model. For the stationary case the two methods yield equivalent results; however, in contrast to statistical linearization, the presented solution allows the transient behavior to be calculated explicitly. Further, the stationary case leads to an implicit set of equations that can be solved iteratively with a small number of iterations and without instabilities for specific parameter sets.

  8. Bayesian soft X-ray tomography using non-stationary Gaussian Processes

    NASA Astrophysics Data System (ADS)

    Li, Dong; Svensson, J.; Thomsen, H.; Medina, F.; Werner, A.; Wolf, R.

    2013-08-01

    In this study, a Bayesian non-stationary Gaussian Process (GP) method for the inference of the soft X-ray emissivity distribution, along with its associated uncertainties, has been developed. For the investigation of equilibrium conditions and fast magnetohydrodynamic behaviors in nuclear fusion plasmas, it is important to infer, especially in the plasma center, spatially resolved soft X-ray profiles from a limited number of noisy line-integral measurements. For this ill-posed inversion problem, Bayesian probability theory can provide a posterior probability distribution over all possible solutions under given model assumptions. Specifically, the use of a non-stationary GP to model the emission allows the model to adapt to the varying length scales of the underlying diffusion process. In contrast to other conventional methods, the prior regularization is realized in probabilistic form, which enhances the capability for uncertainty analysis; consequently, scientists concerned with the reliability of their results will benefit from it. Under the assumption of normally distributed noise, the posterior distribution evaluated at a discrete number of points becomes a multivariate normal distribution whose mean and covariance are analytically available, making inversions and the calculation of uncertainty fast. Additionally, the hyper-parameters embedded in the model assumptions can be optimized through a Bayesian Occam's razor formalism, thereby automatically adjusting the model complexity. This method is shown to produce convincing reconstructions and good agreement with independently calculated results from the Maximum Entropy and Equilibrium-Based Iterative Tomography Algorithm methods.
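    The analytic posterior referred to above is the standard GP regression result; the sketch below implements it for a stationary squared-exponential kernel in one dimension (the paper's non-stationary kernel and tomographic forward model are beyond this illustration).

```python
import numpy as np

def gp_posterior(X, y, Xs, kernel, noise_var):
    """Gaussian-process posterior under Gaussian noise: the posterior at the
    query points Xs is multivariate normal with analytic mean/covariance."""
    K = kernel(X, X) + noise_var * np.eye(len(X))
    Ks, Kss = kernel(Xs, X), kernel(Xs, Xs)
    L = np.linalg.cholesky(K)                       # stable solve via Cholesky
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Ks @ alpha
    V = np.linalg.solve(L, Ks.T)
    cov = Kss - V.T @ V
    return mean, cov

# Stationary squared-exponential kernel; length scale is an assumption
rbf = lambda A, B: np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2 / 0.3 ** 2)
X = np.linspace(0, 1, 12)
y = np.sin(6 * X) + 0.05 * np.random.default_rng(8).normal(size=12)
mean, cov = gp_posterior(X, y, np.linspace(0, 1, 50), rbf, noise_var=0.05 ** 2)
print(mean[:5], np.sqrt(np.diag(cov))[:5])          # point estimates and bars
```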

  9. Bayesian soft X-ray tomography using non-stationary Gaussian Processes.

    PubMed

    Li, Dong; Svensson, J; Thomsen, H; Medina, F; Werner, A; Wolf, R

    2013-08-01

    In this study, a Bayesian non-stationary Gaussian Process (GP) method for the inference of the soft X-ray emissivity distribution, along with its associated uncertainties, has been developed. For the investigation of equilibrium conditions and fast magnetohydrodynamic behaviors in nuclear fusion plasmas, it is important to infer, especially in the plasma center, spatially resolved soft X-ray profiles from a limited number of noisy line-integral measurements. For this ill-posed inversion problem, Bayesian probability theory can provide a posterior probability distribution over all possible solutions under given model assumptions. Specifically, the use of a non-stationary GP to model the emission allows the model to adapt to the varying length scales of the underlying diffusion process. In contrast to other conventional methods, the prior regularization is realized in probabilistic form, which enhances the capability for uncertainty analysis; consequently, scientists concerned with the reliability of their results will benefit from it. Under the assumption of normally distributed noise, the posterior distribution evaluated at a discrete number of points becomes a multivariate normal distribution whose mean and covariance are analytically available, making inversions and the calculation of uncertainty fast. Additionally, the hyper-parameters embedded in the model assumptions can be optimized through a Bayesian Occam's razor formalism, thereby automatically adjusting the model complexity. This method is shown to produce convincing reconstructions and good agreement with independently calculated results from the Maximum Entropy and Equilibrium-Based Iterative Tomography Algorithm methods.

  10. Generation of multivariate near shore extreme wave conditions based on an extreme value copula for offshore boundary conditions.

    NASA Astrophysics Data System (ADS)

    Leyssen, Gert; Mercelis, Peter; De Schoesitter, Philippe; Blanckaert, Joris

    2013-04-01

    Near shore extreme wave conditions, used as input for numerical wave agitation simulations and for the dimensioning of coastal defense structures, need to be determined at a harbour entrance situated on the French North Sea coast. To obtain significant wave heights, the numerical wave model SWAN has been used. A multivariate approach was used to account for the joint probabilities. The considered variables are: wind velocity and direction, water level, and significant offshore wave height and wave period. In a first step, a univariate extreme value distribution was determined for the main variables. By means of a technique based on the mean excess function, an appropriate member of the GPD family is selected. An optimal threshold for peak-over-threshold selection is determined by maximum likelihood optimization. Next, the joint dependency structure of the primary random variables is modeled by an extreme value copula. Eventually, the multivariate domain of variables was stratified into different classes, each representing a combination of variable quantiles with a joint probability, which are used for model simulation. The main variable is the wind velocity, as extreme wave conditions in the area of concern are wind driven. The analysis is repeated for 9 different wind directions. The secondary variable is the water level. In shallow waters, extreme waves are directly affected by water depth; hence the joint probability of occurrence of water level and wave height is of major importance for the design of coastal defense structures. Wind velocity and water levels are only dependent for some wind directions (wind-induced setup). Dependent directions are detected using Kendall and Spearman tests and appear to be those with the longest fetch. For these directions, the wind velocity and water level extreme value distributions are multivariately linked through a Gumbel copula. These distributions are stratified into classes for which the frequency of occurrence can be calculated. For the remaining directions, the univariate extreme wind velocity distribution is stratified, with each class combined with 5 high water levels. The wave height at the model boundaries was taken into account by a regression with the extreme wind velocity at the offshore location. The regression line and the 95% confidence limits were combined with each class. Eventually, the wave period is computed by a further regression with the significant wave height. In this way, 1103 synthetic events were selected and simulated with the SWAN wave model, for each of which a frequency of occurrence is calculated. Hence, near shore significant wave heights are obtained with corresponding frequencies. The statistical distribution of the near shore wave heights is determined by sorting the model results in descending order and accumulating the corresponding frequencies. This approach allows the determination of conditional return periods. For example, for the imposed univariate design return periods of 100 years for significant wave height and 30 years for water level, the joint return period for a simultaneous exceedance of both conditions can be computed as 4000 years. Hence, this methodology allows for a probabilistic design of coastal defense structures.

  11. Statistical Maps of Ground Magnetic Disturbance Derived from Global Geospace Models

    NASA Astrophysics Data System (ADS)

    Rigler, E. J.; Wiltberger, M. J.; Love, J. J.

    2017-12-01

    Electric currents in space are the principal driver of magnetic variations measured at Earth's surface. These in turn induce geoelectric fields that present a natural hazard for technological systems like high-voltage power distribution networks. Modern global geospace models can reasonably simulate large-scale geomagnetic response to solar wind variations, but they are less successful at deterministic predictions of intense localized geomagnetic activity that most impacts technological systems on the ground. Still, recent studies have shown that these models can accurately reproduce the spatial statistical distributions of geomagnetic activity, suggesting that their physics are largely correct. Since the magnetosphere is a largely externally driven system, most model-measurement discrepancies probably arise from uncertain boundary conditions. So, with realistic distributions of solar wind parameters to establish its boundary conditions, we use the Lyon-Fedder-Mobarry (LFM) geospace model to build a synthetic multivariate statistical model of gridded ground magnetic disturbance. From this, we analyze the spatial modes of geomagnetic response, regress on available measurements to fill in unsampled locations on the grid, and estimate the global probability distribution of extreme magnetic disturbance. The latter offers a prototype geomagnetic "hazard map", similar to those used to characterize better-known geophysical hazards like earthquakes and floods.

  12. Comparison of different statistical methods for estimation of extreme sea levels with wave set-up contribution

    NASA Astrophysics Data System (ADS)

    Kergadallan, Xavier; Bernardara, Pietro; Benoit, Michel; Andreewsky, Marc; Weiss, Jérôme

    2013-04-01

    Estimating the probability of occurrence of extreme sea levels is a central issue for the protection of the coast. Return periods of sea level with wave set-up contribution are estimated here at one site: Cherbourg, France, in the English Channel. The methodology follows two steps: the first is the computation of the joint probability of simultaneous wave height and still sea level; the second is the interpretation of those joint probabilities to assess a sea level for a given return period. Two different approaches were evaluated to compute the joint probability of simultaneous wave height and still sea level: the first is a multivariate extreme value distribution of logistic type, in which all components of the variables become large simultaneously; the second is the conditional approach for multivariate extreme values, in which only one component of the variables has to be large. Two different methods were applied to estimate the sea level with wave set-up contribution for a given return period: Monte Carlo simulation, in which the estimation is more accurate but requires more computation time, and classical ocean engineering design contours of inverse-FORM type, in which the method is simpler and allows a more complex estimation of the wave set-up part (wave propagation to the coast, for example). We compare results from the two approaches combined with the two methods. To be able to use both the Monte Carlo simulation and design contour methods, the wave set-up is estimated with a simple empirical formula. We show the advantages of the conditional approach over the multivariate extreme value approach when extreme sea levels occur when either the surge or the wave height is large. We discuss the validity of the ocean engineering design contour method, which is an alternative when the computation of sea levels is too complex for the Monte Carlo simulation method.
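    The inverse-FORM contour idea can be sketched as follows, under the simplifying assumption of independent margins (handling dependence requires a Rosenblatt transform). The marginal families and parameters are illustrative assumptions only.

```python
import numpy as np
from scipy import stats

def iform_contour(T, events_per_year, F1_inv, F2_inv, n_pts=360):
    """Inverse-FORM environmental contour: map a circle of radius beta in
    standard-normal space through the marginal quantile functions."""
    p = 1.0 / (T * events_per_year)          # per-event exceedance probability
    beta = stats.norm.ppf(1.0 - p)           # reliability index
    th = np.linspace(0.0, 2.0 * np.pi, n_pts)
    u1, u2 = beta * np.cos(th), beta * np.sin(th)
    return F1_inv(stats.norm.cdf(u1)), F2_inv(stats.norm.cdf(u2))

# Hypothetical margins: Gumbel still-water level (m), Weibull wave height (m)
levels, waves = iform_contour(
    T=100, events_per_year=2922,             # ~3-hour sea states
    F1_inv=lambda q: stats.gumbel_r.ppf(q, loc=4.0, scale=0.3),
    F2_inv=lambda q: stats.weibull_min.ppf(q, 1.5, scale=1.2))
print(levels.max(), waves.max())             # 100-year design extremes
```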

  13. Generation of a Multivariate Distribution for Specified Univariate Marginals and Covariance Structure.

    DTIC Science & Technology

    1981-05-28

    Combining the one-to-one property of Tf with the transformation-of-probability rule, g(y) = f(Tf^{-1}(y)) · |det(dTf^{-1}(y)/dy)| = f(Tf^{-1}(y)) / f(Tf^{-1}(y)) = 1. Remark 1: it immediately follows that Y is uniformly distributed.

  14. Characterizations of linear sufficient statistics

    NASA Technical Reports Server (NTRS)

    Peters, B. C., Jr.; Reoner, R.; Decell, H. P., Jr.

    1977-01-01

    A surjective bounded linear operator T from a Banach space X to a Banach space Y must be a sufficient statistic for a dominated family of probability measures defined on the Borel sets of X. These results were applied to characterize linear sufficient statistics for families of the exponential type, including as special cases the Wishart and multivariate normal distributions. The latter result was used to establish precisely which procedures for sampling from a normal population have the property that the sample mean is a sufficient statistic.

  15. Time-Varying Transition Probability Matrix Estimation and Its Application to Brand Share Analysis.

    PubMed

    Chiba, Tomoaki; Hino, Hideitsu; Akaho, Shotaro; Murata, Noboru

    2017-01-01

    In a product market or stock market, different products or stocks compete for the same consumers or purchasers. We propose a method to estimate the time-varying transition matrix of the product share using a multivariate time series of the product share. The method is based on the assumption that each of the observed time series of shares is a stationary distribution of the underlying Markov processes characterized by transition probability matrices. We estimate transition probability matrices for every observation under natural assumptions. We demonstrate, on a real-world dataset of the share of automobiles, that the proposed method can find intrinsic transition of shares. The resulting transition matrices reveal interesting phenomena, for example, the change in flows between TOYOTA group and GM group for the fiscal year where TOYOTA group's sales beat GM's sales, which is a reasonable scenario.
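    As a rough illustration of the estimation target, the sketch below fits a single row-stochastic transition matrix to a share series by constrained least squares on s_{t+1} ≈ s_t P; this is a static baseline, whereas the cited method tracks a time-varying P.

```python
import numpy as np
from scipy.optimize import minimize

def estimate_transition_matrix(shares):
    """Least-squares fit of a row-stochastic transition matrix P with
    s_{t+1} ~= s_t P, enforcing nonnegativity and unit row sums."""
    T, k = shares.shape

    def loss(flat):
        P = flat.reshape(k, k)
        return ((shares[1:] - shares[:-1] @ P) ** 2).sum()

    cons = [{"type": "eq", "fun": lambda f, i=i: f.reshape(k, k)[i].sum() - 1}
            for i in range(k)]
    res = minimize(loss, np.full(k * k, 1.0 / k), method="SLSQP",
                   bounds=[(0, 1)] * (k * k), constraints=cons)
    return res.x.reshape(k, k)

# Toy three-brand share series generated from a known transition matrix
rng = np.random.default_rng(6)
P_true = np.array([[0.9, 0.05, 0.05], [0.2, 0.7, 0.1], [0.1, 0.1, 0.8]])
s = [np.array([0.3, 0.4, 0.3])]
for _ in range(30):
    s.append(s[-1] @ P_true + rng.normal(0, 0.005, 3))
print(estimate_transition_matrix(np.array(s)).round(2))
```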

  16. Time-Varying Transition Probability Matrix Estimation and Its Application to Brand Share Analysis

    PubMed Central

    Chiba, Tomoaki; Akaho, Shotaro; Murata, Noboru

    2017-01-01

    In a product market or stock market, different products or stocks compete for the same consumers or purchasers. We propose a method to estimate the time-varying transition matrix of the product share using a multivariate time series of the product share. The method is based on the assumption that each of the observed time series of shares is a stationary distribution of the underlying Markov processes characterized by transition probability matrices. We estimate transition probability matrices for every observation under natural assumptions. We demonstrate, on a real-world dataset of the share of automobiles, that the proposed method can find intrinsic transition of shares. The resulting transition matrices reveal interesting phenomena, for example, the change in flows between TOYOTA group and GM group for the fiscal year where TOYOTA group’s sales beat GM’s sales, which is a reasonable scenario. PMID:28076383

  17. Ordinary chondrites - Multivariate statistical analysis of trace element contents

    NASA Technical Reports Server (NTRS)

    Lipschutz, Michael E.; Samuels, Stephen M.

    1991-01-01

    The contents of mobile trace elements (Co, Au, Sb, Ga, Se, Rb, Cs, Te, Bi, Ag, In, Tl, Zn, and Cd) in Antarctic and non-Antarctic populations of H4-6 and L4-6 chondrites were compared using standard multivariate discriminant functions borrowed from linear discriminant analysis and logistic regression. A nonstandard randomization-simulation method was developed, making it possible to carry out probability assignments on a distribution-free basis. Compositional differences were found both between the Antarctic and non-Antarctic H4-6 chondrite populations and between the two L4-6 chondrite populations. It is shown that, for various types of meteorites (in particular, for the H4-6 chondrites), the Antarctic/non-Antarctic compositional difference is due to preterrestrial differences in the genesis of their parent materials.

  18. Photon event distribution sampling: an image formation technique for scanning microscopes that permits tracking of sub-diffraction particles with high spatial and temporal resolutions.

    PubMed

    Larkin, J D; Publicover, N G; Sutko, J L

    2011-01-01

    In photon event distribution sampling, an image formation technique for scanning microscopes, the maximum likelihood position of origin of each detected photon is acquired as a data set rather than binning photons in pixels. Subsequently, an intensity-related probability density function describing the uncertainty associated with the photon position measurement is applied to each position and individual photon intensity distributions are summed to form an image. Compared to pixel-based images, photon event distribution sampling images exhibit increased signal-to-noise and comparable spatial resolution. Photon event distribution sampling is superior to pixel-based image formation in recognizing the presence of structured (non-random) photon distributions at low photon counts and permits use of non-raster scanning patterns. A photon event distribution sampling based method for localizing single particles derived from a multi-variate normal distribution is more precise than statistical (Gaussian) fitting to pixel-based images. Using the multi-variate normal distribution method, non-raster scanning and a typical confocal microscope, localizations with 8 nm precision were achieved at 10 ms sampling rates with acquisition of ~200 photons per frame. Single nanometre precision was obtained with a greater number of photons per frame. In summary, photon event distribution sampling provides an efficient way to form images when low numbers of photons are involved and permits particle tracking with confocal point-scanning microscopes with nanometre precision deep within specimens. © 2010 The Authors Journal of Microscopy © 2010 The Royal Microscopical Society.
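
    A minimal sketch of the image-formation step, assuming a fixed per-photon position uncertainty: each detected photon contributes a 2-D Gaussian probability density centred on its estimated position of origin, and the densities are summed on a grid. In practice the width of each kernel would be intensity-dependent, as described above; the function name `peds_image` is illustrative.

    ```python
    import numpy as np

    def peds_image(positions, sigmas, extent=1.0, npix=256):
        """Sum a 2-D Gaussian density per detected photon, centred on its
        estimated position of origin (minimal sketch of the PEDS idea)."""
        ax = np.linspace(0.0, extent, npix)
        xx, yy = np.meshgrid(ax, ax)
        img = np.zeros((npix, npix))
        for (px, py), s in zip(positions, sigmas):
            img += np.exp(-((xx - px) ** 2 + (yy - py) ** 2) / (2.0 * s ** 2)) \
                   / (2.0 * np.pi * s ** 2)
        return img

    rng = np.random.default_rng(0)
    pos = rng.uniform(0.2, 0.8, size=(200, 2))   # ~200 photons per frame
    sig = np.full(200, 0.01)                     # per-photon position uncertainty
    image = peds_image(pos, sig)
    ```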

  19. Learning multivariate distributions by competitive assembly of marginals.

    PubMed

    Sánchez-Vega, Francisco; Younes, Laurent; Geman, Donald

    2013-02-01

    We present a new framework for learning high-dimensional multivariate probability distributions from estimated marginals. The approach is motivated by compositional models and Bayesian networks, and designed to adapt to small sample sizes. We start with a large, overlapping set of elementary statistical building blocks, or "primitives," which are low-dimensional marginal distributions learned from data. Each variable may appear in many primitives. Subsets of primitives are combined in a Lego-like fashion to construct a probabilistic graphical model; only a small fraction of the primitives will participate in any valid construction. Since primitives can be precomputed, parameter estimation and structure search are separated. Model complexity is controlled by strong biases; we adapt the primitives to the amount of training data and impose rules which restrict the merging of them into allowable compositions. The likelihood of the data decomposes into a sum of local gains, one for each primitive in the final structure. We focus on a specific subclass of networks which are binary forests. Structure optimization corresponds to an integer linear program and the maximizing composition can be computed for reasonably large numbers of variables. Performance is evaluated using both synthetic data and real datasets from natural language processing and computational biology.

  20. Ellipsoids for anomaly detection in remote sensing imagery

    NASA Astrophysics Data System (ADS)

    Grosklos, Guenchik; Theiler, James

    2015-05-01

    For many target and anomaly detection algorithms, a key step is the estimation of a centroid (relatively easy) and a covariance matrix (somewhat harder) that characterize the background clutter. For a background that can be modeled as a multivariate Gaussian, the centroid and covariance lead to an explicit probability density function that can be used in likelihood ratio tests for optimal detection statistics. But ellipsoidal contours can characterize a much larger class of multivariate density functions, and the ellipsoids that characterize the outer periphery of the distribution are most appropriate for detection in the low false alarm rate regime. Traditionally, the sample mean and sample covariance are used to estimate ellipsoid location and shape, but these quantities are confounded both by large lever-arm outliers and by non-Gaussian distributions within the ellipsoid of interest. This paper compares a variety of centroid and covariance estimation schemes with the aim of characterizing the periphery of the background distribution. In particular, we will consider a robust variant of the Khachiyan algorithm for the minimum-volume enclosing ellipsoid. The performance of these different approaches is evaluated on multispectral and hyperspectral remote sensing imagery using coverage plots of ellipsoid volume versus false alarm rate.
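
    For reference, a basic (non-robust) version of Khachiyan's minimum-volume enclosing ellipsoid algorithm can be written in a few lines; the paper's robust variant and the coverage-plot evaluation are not reproduced here.

    ```python
    import numpy as np

    def mvee(points, tol=1e-4):
        """Minimum-volume enclosing ellipsoid via Khachiyan's algorithm.
        Returns centre c and shape matrix A with (x-c)^T A (x-c) <= 1
        for all points (basic, non-robust version)."""
        P = np.asarray(points, dtype=float)          # n x d
        n, d = P.shape
        Q = np.hstack([P, np.ones((n, 1))]).T        # (d+1) x n lifted points
        u = np.full(n, 1.0 / n)
        err = tol + 1.0
        while err > tol:
            X = Q @ np.diag(u) @ Q.T
            M = np.einsum('ij,ji->i', Q.T @ np.linalg.inv(X), Q)
            j = int(np.argmax(M))
            step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
            u_new = (1.0 - step) * u
            u_new[j] += step
            err = np.linalg.norm(u_new - u)
            u = u_new
        c = P.T @ u
        A = np.linalg.inv(P.T @ np.diag(u) @ P - np.outer(c, c)) / d
        return c, A

    rng = np.random.default_rng(0)
    bg = rng.multivariate_normal([0, 0], [[2.0, 0.8], [0.8, 1.0]], size=500)
    c, A = mvee(bg)
    mahal = np.einsum('ij,jk,ik->i', bg - c, A, bg - c)  # <= 1 inside the ellipsoid
    ```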

  1. A Bayesian approach to microwave precipitation profile retrieval

    NASA Technical Reports Server (NTRS)

    Evans, K. Franklin; Turk, Joseph; Wong, Takmeng; Stephens, Graeme L.

    1995-01-01

    A multichannel passive microwave precipitation retrieval algorithm is developed. Bayes theorem is used to combine statistical information from numerical cloud models with forward radiative transfer modeling. A multivariate lognormal prior probability distribution contains the covariance information about hydrometeor distribution that resolves the nonuniqueness inherent in the inversion process. Hydrometeor profiles are retrieved by maximizing the posterior probability density for each vector of observations. The hydrometeor profile retrieval method is tested with data from the Advanced Microwave Precipitation Radiometer (10, 19, 37, and 85 GHz) of convection over ocean and land in Florida. The CP-2 multiparameter radar data are used to verify the retrieved profiles. The results show that the method can retrieve approximate hydrometeor profiles, with larger errors over land than water. There is considerably greater accuracy in the retrieval of integrated hydrometeor contents than of profiles. Many of the retrieval errors are traced to problems with the cloud model microphysical information, and future improvements to the algorithm are suggested.

  2. An open source multivariate framework for n-tissue segmentation with evaluation on public data.

    PubMed

    Avants, Brian B; Tustison, Nicholas J; Wu, Jue; Cook, Philip A; Gee, James C

    2011-12-01

    We introduce Atropos, an ITK-based multivariate n-class open source segmentation algorithm distributed with ANTs ( http://www.picsl.upenn.edu/ANTs). The Bayesian formulation of the segmentation problem is solved using the Expectation Maximization (EM) algorithm with the modeling of the class intensities based on either parametric or non-parametric finite mixtures. Atropos is capable of incorporating spatial prior probability maps (sparse), prior label maps and/or Markov Random Field (MRF) modeling. Atropos has also been efficiently implemented to handle large quantities of possible labelings (in the experimental section, we use up to 69 classes) with a minimal memory footprint. This work describes the technical and implementation aspects of Atropos and evaluates its performance on two different ground-truth datasets. First, we use the BrainWeb dataset from Montreal Neurological Institute to evaluate three-tissue segmentation performance via (1) K-means segmentation without use of template data; (2) MRF segmentation with initialization by prior probability maps derived from a group template; (3) Prior-based segmentation with use of spatial prior probability maps derived from a group template. We also evaluate Atropos performance by using spatial priors to drive a 69-class EM segmentation problem derived from the Hammers atlas from University College London. These evaluation studies, combined with illustrative examples that exercise Atropos options, demonstrate both performance and wide applicability of this new platform-independent open source segmentation tool.
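
    The parametric core of such an intensity-based segmentation is EM for a finite Gaussian mixture. The sketch below is a minimal 1-D version for illustration only; it is not the Atropos/ANTs implementation, and it omits the spatial prior probability maps, prior label maps, and MRF regularization described above.

    ```python
    import numpy as np

    def em_gmm_1d(x, k=3, n_iter=100, seed=0):
        """EM for a 1-D Gaussian finite mixture: the parametric core of an
        intensity-based tissue segmentation (no spatial priors or MRF)."""
        rng = np.random.default_rng(seed)
        mu = rng.choice(x, k)
        var = np.full(k, x.var())
        pi = np.full(k, 1.0 / k)
        for _ in range(n_iter):
            # E-step: posterior class probabilities for every intensity
            dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
                   / np.sqrt(2.0 * np.pi * var)
            resp = dens / dens.sum(axis=1, keepdims=True)
            # M-step: update mixing weights, means, variances
            nk = resp.sum(axis=0)
            pi = nk / len(x)
            mu = (resp * x[:, None]).sum(axis=0) / nk
            var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        return pi, mu, var, resp.argmax(axis=1)  # hard labels = segmentation

    # synthetic "image" intensities drawn from three tissue classes
    rng = np.random.default_rng(1)
    x = np.concatenate([rng.normal(m, 5.0, 500) for m in (40, 100, 160)])
    pi, mu, var, labels = em_gmm_1d(x)
    ```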

  3. An Open Source Multivariate Framework for n-Tissue Segmentation with Evaluation on Public Data

    PubMed Central

    Tustison, Nicholas J.; Wu, Jue; Cook, Philip A.; Gee, James C.

    2012-01-01

    We introduce Atropos, an ITK-based multivariate n-class open source segmentation algorithm distributed with ANTs (http://www.picsl.upenn.edu/ANTs). The Bayesian formulation of the segmentation problem is solved using the Expectation Maximization (EM) algorithm with the modeling of the class intensities based on either parametric or non-parametric finite mixtures. Atropos is capable of incorporating spatial prior probability maps (sparse), prior label maps and/or Markov Random Field (MRF) modeling. Atropos has also been efficiently implemented to handle large quantities of possible labelings (in the experimental section, we use up to 69 classes) with a minimal memory footprint. This work describes the technical and implementation aspects of Atropos and evaluates its performance on two different ground-truth datasets. First, we use the BrainWeb dataset from Montreal Neurological Institute to evaluate three-tissue segmentation performance via (1) K-means segmentation without use of template data; (2) MRF segmentation with initialization by prior probability maps derived from a group template; (3) Prior-based segmentation with use of spatial prior probability maps derived from a group template. We also evaluate Atropos performance by using spatial priors to drive a 69-class EM segmentation problem derived from the Hammers atlas from University College London. These evaluation studies, combined with illustrative examples that exercise Atropos options, demonstrate both performance and wide applicability of this new platform-independent open source segmentation tool. PMID:21373993

  4. Robust LOD scores for variance component-based linkage analysis.

    PubMed

    Blangero, J; Williams, J T; Almasy, L

    2000-01-01

    The variance component method is now widely used for linkage analysis of quantitative traits. Although this approach offers many advantages, the importance of the underlying assumption of multivariate normality of the trait distribution within pedigrees has not been studied extensively. Simulation studies have shown that traits with leptokurtic distributions yield linkage test statistics that exhibit excessive Type I error when analyzed naively. We derive analytical formulae relating the deviation from the expected asymptotic distribution of the lod score to the kurtosis and total heritability of the quantitative trait. A simple correction constant yields a robust lod score for any deviation from normality and for any pedigree structure, and effectively eliminates the problem of inflated Type I error due to misspecification of the underlying probability model in variance component-based linkage analysis.

  5. Multivariable normal tissue complication probability model-based treatment plan optimization for grade 2-4 dysphagia and tube feeding dependence in head and neck radiotherapy.

    PubMed

    Kierkels, Roel G J; Wopken, Kim; Visser, Ruurd; Korevaar, Erik W; van der Schaaf, Arjen; Bijl, Hendrik P; Langendijk, Johannes A

    2016-12-01

    Radiotherapy of the head and neck is challenged by the relatively large number of organs-at-risk close to the tumor. Biologically oriented objective functions (OFs) could optimally distribute the dose among the organs-at-risk. We aimed to explore OFs based on multivariable normal tissue complication probability (NTCP) models for grade 2-4 dysphagia (DYS) and tube feeding dependence (TFD). One hundred head and neck cancer patients were studied. In addition to the clinical plan, two more plans (an OF-DYS plan and an OF-TFD plan) were optimized per patient. The NTCP models included up to four dose-volume parameters and other non-dosimetric factors. A fully automatic plan optimization framework was used to optimize the OF-NTCP-based plans. All OF-NTCP-based plans were reviewed and classified as clinically acceptable. On average, the Δdose and ΔNTCP were small when comparing the OF-DYS plan, OF-TFD plan, and clinical plan. For 5% of patients, NTCP-TFD was reduced by more than 5% using OF-TFD-based planning compared with the OF-DYS plans. Plan optimization using NTCP-DYS- and NTCP-TFD-based objective functions resulted in clinically acceptable plans. For patients with considerable risk factors for TFD, the OF-TFD steered the optimizer toward dose distributions that directly led to slightly lower predicted NTCP-TFD values compared with the other studied plans. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  6. Opportunities for multivariate analysis of open spatial datasets to characterize urban flooding risks

    NASA Astrophysics Data System (ADS)

    Gaitan, S.; ten Veldhuis, J. A. E.

    2015-06-01

    Cities worldwide are challenged by increasing urban flood risks. Precise and realistic measures are required to reduce flooding impacts. However, currently implemented sewer and topographic models do not provide realistic predictions of local flooding occurrence during heavy rain events. Assessing other factors, such as spatially distributed rainfall, socioeconomic characteristics, and social sensing, may help to explain the probability and impacts of urban flooding. Several spatial datasets have recently been made available in the Netherlands, including rainfall-related incident reports made by citizens, spatially distributed rain depths, semidistributed socioeconomic information, and building age. Inspecting the potential of these data to explain the occurrence of rainfall-related incidents has not been done yet. Multivariate analysis tools for describing communities and environmental patterns have previously been developed and used in the field of ecology. The objective of this paper is to outline opportunities for these tools to explore urban flooding risk patterns in the mentioned datasets. To that end, a cluster analysis is performed. Results indicate that the incidence of rainfall-related impacts is higher in areas characterized by older infrastructure and higher population density.

  7. Shape Control in Multivariate Barycentric Rational Interpolation

    NASA Astrophysics Data System (ADS)

    Nguyen, Hoa Thang; Cuyt, Annie; Celis, Oliver Salazar

    2010-09-01

    The most stable formula for a rational interpolant for use on a finite interval is the barycentric form [1, 2]. A simple choice of the barycentric weights ensures the absence of (unwanted) poles on the real line [3]. In [4] we indicate that a more refined choice of the weights in barycentric rational interpolation can guarantee comonotonicity and coconvexity of the rational interpolant in addition to a pole-free region of interest. In this presentation we generalize the above to the multivariate case. We use a product-like form of univariate barycentric rational interpolants and indicate how the location of the poles and the shape of the function can be controlled. This functionality is of importance in the construction of mathematical models that need to express a certain trend, such as in probability distributions, economics, population dynamics, tumor growth models, etc.
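
    For the univariate building block, the barycentric form can be evaluated directly. The sketch below uses Berrut's weights w_i = (-1)^i, one simple pole-free choice on the real line; the refined shape-controlling weights of the cited work are not reproduced here.

    ```python
    import numpy as np

    def barycentric_eval(x_nodes, f_nodes, w, x):
        """Evaluate the barycentric rational interpolant
        r(x) = sum_i w_i f_i/(x - x_i) / sum_i w_i/(x - x_i)."""
        x = np.asarray(x, dtype=float)
        diff = x[:, None] - x_nodes[None, :]
        exact = np.isclose(diff, 0.0)
        diff[exact] = 1.0                     # avoid division by zero at nodes
        terms = w / diff
        r = (terms @ f_nodes) / terms.sum(axis=1)
        at_node = exact.any(axis=1)           # interpolation: return node values
        r[at_node] = f_nodes[exact.argmax(axis=1)[at_node]]
        return r

    x_nodes = np.linspace(-1.0, 1.0, 11)
    f_nodes = 1.0 / (1.0 + 25.0 * x_nodes ** 2)    # Runge's function
    w = (-1.0) ** np.arange(len(x_nodes))          # Berrut weights: no real poles
    xs = np.linspace(-1.0, 1.0, 401)
    err = barycentric_eval(x_nodes, f_nodes, w, xs) - 1.0 / (1.0 + 25.0 * xs ** 2)
    print(np.max(np.abs(err)))
    ```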

  8. A Stochastic Diffusion Process for the Dirichlet Distribution

    DOE PAGES

    Bakosi, J.; Ristorcelli, J. R.

    2013-03-01

    The method of potential solutions of Fokker-Planck equations is used to develop a transport equation for the joint probability of N coupled stochastic variables with the Dirichlet distribution as its asymptotic solution. To ensure a bounded sample space, a coupled nonlinear diffusion process is required: the Wiener processes in the equivalent system of stochastic differential equations are multiplicative with coefficients dependent on all the stochastic variables. Individual samples of a discrete ensemble, obtained from the stochastic process, satisfy a unit-sum constraint at all times. The process may be used to represent realizations of a fluctuating ensemble of N variables subject to a conservation principle. Similar to the multivariate Wright-Fisher process, whose invariant is also Dirichlet, the univariate case yields a process whose invariant is the beta distribution. As a test of the results, Monte Carlo simulations are used to evolve numerical ensembles toward the invariant Dirichlet distribution.
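
    The univariate case mentioned above (invariant beta distribution) is easy to simulate with an Euler-Maruyama scheme. This is a generic sketch of the bounded-diffusion idea using a Wright-Fisher-type drift/diffusion pair whose invariant density is Beta(a, b); it is not the paper's N-variable Dirichlet formulation.

    ```python
    import numpy as np

    def beta_diffusion(a, b, x0=0.5, dt=1e-3, n_steps=200000, seed=0):
        """Euler-Maruyama simulation of
        dX = 0.5*(a - (a+b)*X) dt + sqrt(X*(1-X)) dW,
        whose invariant density is Beta(a, b)."""
        rng = np.random.default_rng(seed)
        x = np.empty(n_steps)
        x[0] = x0
        for t in range(1, n_steps):
            drift = 0.5 * (a - (a + b) * x[t - 1])
            diff = np.sqrt(max(x[t - 1] * (1.0 - x[t - 1]), 0.0))
            x[t] = x[t - 1] + drift * dt + diff * np.sqrt(dt) * rng.standard_normal()
            x[t] = min(max(x[t], 0.0), 1.0)  # numerical clipping at the boundary
        return x

    samples = beta_diffusion(a=2.0, b=5.0)[100000:]   # discard burn-in
    print(samples.mean(), 2.0 / 7.0)                  # compare with Beta mean a/(a+b)
    ```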

  9. Unified theory for stochastic modelling of hydroclimatic processes: Preserving marginal distributions, correlation structures, and intermittency

    NASA Astrophysics Data System (ADS)

    Papalexiou, Simon Michael

    2018-05-01

    Hydroclimatic processes come in all "shapes and sizes". They are characterized by different spatiotemporal correlation structures and probability distributions that can be continuous, mixed-type, discrete or even binary. Simulating such processes by reproducing precisely their marginal distribution and linear correlation structure, including features like intermittency, can greatly improve hydrological analysis and design. Traditionally, modelling schemes are case specific and typically attempt to preserve few statistical moments providing inadequate and potentially risky distribution approximations. Here, a single framework is proposed that unifies, extends, and improves a general-purpose modelling strategy, based on the assumption that any process can emerge by transforming a specific "parent" Gaussian process. A novel mathematical representation of this scheme, introducing parametric correlation transformation functions, enables straightforward estimation of the parent-Gaussian process yielding the target process after the marginal back transformation, while it provides a general description that supersedes previous specific parameterizations, offering a simple, fast and efficient simulation procedure for every stationary process at any spatiotemporal scale. This framework, also applicable for cyclostationary and multivariate modelling, is augmented with flexible parametric correlation structures that parsimoniously describe observed correlations. Real-world simulations of various hydroclimatic processes with different correlation structures and marginals, such as precipitation, river discharge, wind speed, humidity, extreme events per year, etc., as well as a multivariate example, highlight the flexibility, advantages, and complete generality of the method.
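
    A minimal sketch of the parent-Gaussian strategy: simulate a stationary Gaussian AR(1) process and map it through the probability integral transform to any target marginal. The full framework additionally uses correlation transformation functions so that the target process (not just the parent) has a prescribed correlation structure; that step is omitted here.

    ```python
    import numpy as np
    from scipy import stats

    def gaussian_ar1_to_target(n, rho, target_ppf, seed=0):
        """Simulate an AR(1) 'parent' Gaussian process and transform it to a
        target marginal via x_t = F^{-1}(Phi(z_t)) (minimal sketch)."""
        rng = np.random.default_rng(seed)
        z = np.empty(n)
        z[0] = rng.standard_normal()
        innov_sd = np.sqrt(1.0 - rho ** 2)   # keeps Var(z_t) = 1
        for t in range(1, n):
            z[t] = rho * z[t - 1] + innov_sd * rng.standard_normal()
        return target_ppf(stats.norm.cdf(z))

    # e.g. a positively skewed, gamma-distributed process (discharge-like)
    x = gaussian_ar1_to_target(100000, rho=0.8,
                               target_ppf=stats.gamma(a=2.0, scale=1.5).ppf)
    print(x.mean(), 2.0 * 1.5)   # matches the gamma mean a*scale
    ```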

  10. Exact Scheffé-type confidence intervals for output from groundwater flow models: 1. Use of hydrogeologic information

    USGS Publications Warehouse

    Cooley, Richard L.

    1993-01-01

    A new method is developed to efficiently compute exact Scheffé-type confidence intervals for output (or other function of parameters) g(β) derived from a groundwater flow model. The method is general in that parameter uncertainty can be specified by any statistical distribution having a log probability density function (log pdf) that can be expanded in a Taylor series. However, for this study parameter uncertainty is specified by a statistical multivariate beta distribution that incorporates hydrogeologic information in the form of the investigator's best estimates of parameters and a grouping of random variables representing possible parameter values so that each group is defined by maximum and minimum bounds and an ordering according to increasing value. The new method forms the confidence intervals from maximum and minimum limits of g(β) on a contour of a linear combination of (1) the quadratic form for the parameters used by Cooley and Vecchia (1987) and (2) the log pdf for the multivariate beta distribution. Three example problems are used to compare characteristics of the confidence intervals for hydraulic head obtained using different weights for the linear combination. Different weights generally produced similar confidence intervals, whereas the method of Cooley and Vecchia (1987) often produced much larger confidence intervals.

  11. Precipitation Interpolation by Multivariate Bayesian Maximum Entropy Based on Meteorological Data in Yun- Gui-Guang region, Mainland China

    NASA Astrophysics Data System (ADS)

    Wang, Chaolin; Zhong, Shaobo; Zhang, Fushen; Huang, Quanyi

    2016-11-01

    Precipitation interpolation has been a hot area of research for many years and has a close relation to meteorological factors. In this paper, precipitation from 91 meteorological stations located in and around Yunnan, Guizhou and Guangxi Zhuang provinces (or autonomous region), Mainland China, was taken into consideration for spatial interpolation. The multivariate Bayesian maximum entropy (BME) method with auxiliary variables, including mean relative humidity, water vapour pressure, mean temperature, mean wind speed and terrain elevation, was used to get a more accurate regional distribution of annual precipitation. The means, standard deviations, skewness and kurtosis of the meteorological factors were calculated. Variograms and cross-variograms were fitted between precipitation and the auxiliary variables. The results showed that the multivariate BME method was precise when using hard and soft data in the form of probability density functions. Annual mean precipitation was positively correlated with mean relative humidity, mean water vapour pressure, mean temperature and mean wind speed, and negatively correlated with terrain elevation. The results are expected to provide a substantial reference for research on drought and waterlogging in the region.

  12. Comparison of Multidimensional Item Response Models: Multivariate Normal Ability Distributions versus Multivariate Polytomous Ability Distributions. Research Report. ETS RR-08-45

    ERIC Educational Resources Information Center

    Haberman, Shelby J.; von Davier, Matthias; Lee, Yi-Hsuan

    2008-01-01

    Multidimensional item response models can be based on multivariate normal ability distributions or on multivariate polytomous ability distributions. For the case of simple structure in which each item corresponds to a unique dimension of the ability vector, some applications of the two-parameter logistic model to empirical data are employed to…

  13. Latin hypercube approach to estimate uncertainty in ground water vulnerability

    USGS Publications Warehouse

    Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.

    2007-01-01

    A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.

  14. Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design

    PubMed Central

    Lu, Tsui-Shan; Longnecker, Matthew P.; Zhou, Haibo

    2016-01-01

    The outcome-dependent sampling (ODS) scheme is a cost-effective sampling scheme in which one observes the exposure with a probability that depends on the outcome. Well-known such designs are the case-control design for binary responses, the case-cohort design for failure time data, and the general ODS design for a continuous response. While substantial work has been done for the univariate response case, statistical inference and design for the ODS with multivariate cases remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (Multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the Multivariate-ODS design is semiparametric, where all the underlying distributions of covariates are modeled nonparametrically using empirical likelihood methods. We show that the proposed estimator is consistent and derive its asymptotic normality properties. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the Multivariate-ODS or the estimator from a simple random sample with the same sample size. The Multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of the association of PCB exposure with hearing loss in children born into the Collaborative Perinatal Study. PMID:27966260

  15. Latin Hypercube Sampling (LHS) UNIX Library/Standalone

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2004-05-13

    The LHS UNIX Library/Standalone software provides the capability to draw random samples from over 30 distribution types. It performs the sampling by a stratified sampling method called Latin Hypercube Sampling (LHS). Multiple distributions can be sampled simultaneously, with user-specified correlations amongst the input distributions; LHS UNIX Library/Standalone thus provides a way to generate multivariate samples. The LHS samples can be generated either as a callable library (e.g., from within the DAKOTA software framework) or as a standalone capability. LHS is a constrained Monte Carlo sampling scheme. In LHS, the range of each variable is divided into non-overlapping intervals on the basis of equal probability. A sample is selected at random with respect to the probability density in each interval. If multiple variables are sampled simultaneously, then the values obtained for each are paired in a random manner with the n values of the other variables. In some cases, the pairing is restricted to obtain specified correlations amongst the input variables. Many simulation codes have input parameters that are uncertain and can be specified by a distribution. To perform uncertainty analysis and sensitivity analysis, random values are drawn from the input parameter distributions, and the simulation is run with these values to obtain output values. If this is done repeatedly, with many input samples drawn, one can build up a distribution of the output as well as examine correlations between input and output variables.
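
    A basic LHS implementation of the scheme described above, without the optional correlation-restricted pairing, might look as follows (this is a generic sketch, not the LHS UNIX Library code):

    ```python
    import numpy as np
    from scipy import stats

    def latin_hypercube(n_samples, ppfs, seed=0):
        """Basic Latin hypercube sampling: split [0,1] into n equal-probability
        intervals per variable, draw one uniform point per interval, map
        through the inverse CDF, and shuffle to pair variables randomly."""
        rng = np.random.default_rng(seed)
        n_vars = len(ppfs)
        samples = np.empty((n_samples, n_vars))
        for j, ppf in enumerate(ppfs):
            # one stratified uniform draw per interval, then random pairing
            u = (np.arange(n_samples) + rng.uniform(size=n_samples)) / n_samples
            samples[:, j] = ppf(rng.permutation(u))
        return samples

    # two input distributions sampled simultaneously
    s = latin_hypercube(100, [stats.norm(10.0, 2.0).ppf,
                              stats.lognorm(s=0.5).ppf])
    ```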

  16. Remote sensing of earth terrain

    NASA Technical Reports Server (NTRS)

    Kong, J. A.

    1988-01-01

    Two monographs and 85 journal and conference papers on remote sensing of earth terrain have been published, sponsored by NASA Contract NAG5-270. A multivariate K-distribution is proposed to model the statistics of fully polarimetric data from earth terrain with polarizations HH, HV, VH, and VV. In this approach, correlated polarizations of radar signals, as characterized by a covariance matrix, are treated as the sum of N n-dimensional random vectors; N obeys the negative binomial distribution with a parameter alpha and mean N̄. Subsequently, an n-dimensional K-distribution, with either zero or non-zero mean, is developed in the limit of infinite N̄ or illuminated area. The probability density function (PDF) of the K-distributed vector normalized by its Euclidean norm is independent of the parameter alpha and is the same as that derived from a zero-mean Gaussian-distributed random vector. The above model is well supported by experimental data provided by MIT Lincoln Laboratory and the Jet Propulsion Laboratory in the form of polarimetric measurements.
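
    The compound construction can be sketched directly: draw N from a negative binomial with shape alpha and mean N̄, then sum N Gaussian vectors. The per-summand covariance scaling below is an illustrative normalization, and the toy covariance stands in for a measured polarimetric covariance matrix.

    ```python
    import numpy as np

    def k_distributed_samples(n_samples, cov, alpha, n_bar, seed=0):
        """Draw vectors as sums of N zero-mean Gaussian vectors, with N from
        a negative binomial of shape alpha and mean n_bar (sketch of the
        multivariate K-distribution construction)."""
        rng = np.random.default_rng(seed)
        d = cov.shape[0]
        p = alpha / (alpha + n_bar)            # mean = alpha*(1-p)/p = n_bar
        counts = rng.negative_binomial(alpha, p, size=n_samples)
        out = np.zeros((n_samples, d))
        for i, n in enumerate(counts):
            if n > 0:
                out[i] = rng.multivariate_normal(np.zeros(d), cov / n_bar,
                                                 size=n).sum(axis=0)
        return out

    # toy covariance of the four polarimetric channels (HH, HV, VH, VV)
    cov = 0.5 * np.eye(4) + 0.5 * np.ones((4, 4))
    x = k_distributed_samples(5000, cov, alpha=2.0, n_bar=50.0)
    print(np.cov(x.T).round(2))   # approaches cov as n_bar grows
    ```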

  17. Some Recent Developments on Complex Multivariate Distributions

    ERIC Educational Resources Information Center

    Krishnaiah, P. R.

    1976-01-01

    In this paper, the author gives a review of the literature on complex multivariate distributions. Some new results on these distributions are also given. Finally, the author discusses the applications of the complex multivariate distributions in the area of the inference on multiple time series. (Author)

  18. Multivariate normal maximum likelihood with both ordinal and continuous variables, and data missing at random.

    PubMed

    Pritikin, Joshua N; Brick, Timothy R; Neale, Michael C

    2018-04-01

    A novel method for the maximum likelihood estimation of structural equation models (SEM) with both ordinal and continuous indicators is introduced using a flexible multivariate probit model for the ordinal indicators. A full information approach ensures unbiased estimates for data missing at random. Exceeding the capability of prior methods, up to 13 ordinal variables can be included before integration time increases beyond 1 s per row. The method relies on the axiom of conditional probability to split apart the distribution of continuous and ordinal variables. Due to the symmetry of the axiom, two similar methods are available. A simulation study provides evidence that the two similar approaches offer equal accuracy. A further simulation is used to develop a heuristic to automatically select the most computationally efficient approach. Joint ordinal continuous SEM is implemented in OpenMx, free and open-source software.

  19. Trend Detection and Bivariate Frequency Analysis for Nonstationary Rainfall Data

    NASA Astrophysics Data System (ADS)

    Joo, K.; Kim, H.; Shin, J. Y.; Heo, J. H.

    2017-12-01

    Multivariate frequency analysis has been developed for hydro-meteorological data such as rainfall, flood, and drought. In particular, copulas have been used as a useful tool for multivariate probability modeling because they place no restriction on the choice of marginal distributions. Time-series rainfall data can be characterized as rainfall events using the inter-event time definition (IETD), and each rainfall event has a rainfall depth and a rainfall duration. In addition, nonstationarity in rainfall events has been studied recently due to climate change, and trend detection for rainfall events is important to determine whether the data are nonstationary. Using the rainfall depth and duration of each rainfall event, trend detection and nonstationary bivariate frequency analysis are performed in this study. Hourly data recorded over 30 years at 62 Korea Meteorological Association (KMA) stations are used, and the suitability of a nonstationary copula for rainfall events is examined by goodness-of-fit tests.

  20. Climate Change Impact Assessment in Pacific North West Using Copula based Coupling of Temperature and Precipitation variables

    NASA Astrophysics Data System (ADS)

    Qin, Y.; Rana, A.; Moradkhani, H.

    2014-12-01

    The multiple downscaled-scenario products allow us to better assess the uncertainty of the changes/variations of precipitation and temperature in the current and future periods. Joint probability distribution functions (PDFs) of both climatic variables may help us better understand their interdependence, and thus in turn help in assessing the future with confidence. Using the joint distribution of temperature and precipitation is also of significant importance in hydrological applications and climate change studies. In the present study, we have used a multi-model statistically downscaled-scenario ensemble of precipitation and temperature variables from two different statistically downscaled climate datasets. The datasets used are 10 Global Climate Model (GCM) downscaled products from the CMIP5 daily dataset, namely those from the Bias Correction and Spatial Downscaling (BCSD) technique generated at Portland State University and from the Multivariate Adaptive Constructed Analogs (MACA) technique generated at the University of Idaho, leading to two ensemble time series from 20 GCM products. Thereafter, the ensemble PDFs of both precipitation and temperature are evaluated for summer, winter, and yearly periods for all 10 sub-basins across the Columbia River Basin (CRB). Eventually, a copula is applied to establish the joint distribution of the two variables, enabling users to model the joint behavior of the variables with any level of correlation and dependency. Moreover, the probabilistic distribution helps remove the limitations on the marginal distributions of the variables in question. The joint distribution is then used to estimate the change trends of the joint precipitation and temperature in the current and future periods, along with estimation of the probabilities of the given change. Results indicate varied change trends of the joint distribution at summer, winter, and yearly time scales in all 10 sub-basins. Probabilities of changes, as estimated from the joint precipitation and temperature, will provide useful information and insights for hydrological and climate change predictions.

  1. Application of Monte Carlo Method for Evaluation of Uncertainties of ITS-90 by Standard Platinum Resistance Thermometer

    NASA Astrophysics Data System (ADS)

    Palenčár, Rudolf; Sopkuliak, Peter; Palenčár, Jakub; Ďuriš, Stanislav; Suroviak, Emil; Halaj, Martin

    2017-06-01

    Evaluation of uncertainties of temperature measurement by a standard platinum resistance thermometer calibrated at the defining fixed points according to ITS-90 is a problem that can be solved in different ways. The paper presents a procedure based on the propagation of distributions using the Monte Carlo method. The procedure employs generation of pseudo-random numbers for the input variables of resistances at the defining fixed points, supposing a multivariate Gaussian distribution for the input quantities. This allows taking into account the correlations among resistances at the defining fixed points. The assumption of a Gaussian probability density function is acceptable with respect to the several sources of uncertainty in the resistances. In the case of uncorrelated resistances at the defining fixed points, the method is applicable to any probability density function. Validation of the law of propagation of uncertainty using the Monte Carlo method is presented on the example of specific data for a 25 Ω standard platinum resistance thermometer in the temperature range from 0 to 660 °C. Using this example, we demonstrate the suitability of the method by validation of its results.
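
    The propagation-of-distributions step reduces to drawing correlated multivariate-Gaussian inputs and pushing them through the measurement model. The sketch below uses a placeholder ratio model rather than the ITS-90 deviation functions, and the numerical values are illustrative.

    ```python
    import numpy as np

    def propagate(f, mean, cov, n_trials=200000, seed=0):
        """Propagation of distributions by Monte Carlo: draw correlated
        multivariate-Gaussian inputs and push them through the model f."""
        rng = np.random.default_rng(seed)
        x = rng.multivariate_normal(mean, cov, size=n_trials)
        y = f(x)
        return y.mean(), y.std(ddof=1)   # estimate and standard uncertainty

    # toy model: a resistance ratio from two correlated resistance inputs
    mean = np.array([25.5600, 25.0000])      # ohm
    cov = np.array([[1.0e-8, 6.0e-9],
                    [6.0e-9, 1.0e-8]])       # correlated calibration uncertainties
    w_mean, w_unc = propagate(lambda x: x[:, 0] / x[:, 1], mean, cov)
    print(w_mean, w_unc)
    ```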

  2. A new multivariate zero-adjusted Poisson model with applications to biomedicine.

    PubMed

    Liu, Yin; Tian, Guo-Liang; Tang, Man-Lai; Yuen, Kam Chuen

    2018-05-25

    Recently, although advances have been made in modeling multivariate count data, existing models still have several limitations: (i) the multivariate Poisson log-normal model (Aitchison and Ho) cannot be used to fit multivariate count data with excess zero-vectors; (ii) the multivariate zero-inflated Poisson (ZIP) distribution (Li et al., 1999) cannot be used to model zero-truncated/deflated count data and is difficult to apply to high-dimensional cases; (iii) the Type I multivariate zero-adjusted Poisson (ZAP) distribution (Tian et al., 2017) can only model multivariate count data with a special correlation structure, in which the correlations between random components are all positive or all negative. In this paper, we first introduce a new multivariate ZAP distribution, based on a multivariate Poisson distribution, which allows correlations between components with a more flexible dependency structure; that is, some of the correlation coefficients can be positive while others are negative. We then develop its important distributional properties and provide efficient statistical inference methods for the multivariate ZAP model with or without covariates. Two real data examples in biomedicine are used to illustrate the proposed methods. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. A Review of Multivariate Distributions for Count Data Derived from the Poisson Distribution.

    PubMed

    Inouye, David; Yang, Eunho; Allen, Genevera; Ravikumar, Pradeep

    2017-01-01

    The Poisson distribution has been widely studied and used for modeling univariate count-valued data. Multivariate generalizations of the Poisson distribution that permit dependencies, however, have been far less popular. Yet, real-world high-dimensional count-valued data found in word counts, genomics, and crime statistics, for example, exhibit rich dependencies, and motivate the need for multivariate distributions that can appropriately model this data. We review multivariate distributions derived from the univariate Poisson, categorizing these models into three main classes: 1) where the marginal distributions are Poisson, 2) where the joint distribution is a mixture of independent multivariate Poisson distributions, and 3) where the node-conditional distributions are derived from the Poisson. We discuss the development of multiple instances of these classes and compare the models in terms of interpretability and theory. Then, we empirically compare multiple models from each class on three real-world datasets that have varying data characteristics from different domains, namely traffic accident data, biological next generation sequencing data, and text data. These empirical experiments develop intuition about the comparative advantages and disadvantages of each class of multivariate distribution that was derived from the Poisson. Finally, we suggest new research directions as explored in the subsequent discussion section.
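
    Class 1 (Poisson marginals) has a particularly simple classical instance, the common-shock bivariate Poisson: adding a shared Poisson variable to two independent ones yields Poisson marginals with covariance equal to the shared rate (hence only non-negative dependence). A quick numerical check:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # X1 = Y1 + Y0 and X2 = Y2 + Y0 with independent Poisson Y's, so each X_i
    # is Poisson(lam_i + lam0) and Cov(X1, X2) = lam0 >= 0.
    lam0, lam1, lam2 = 2.0, 3.0, 1.0
    y0 = rng.poisson(lam0, 100000)
    x1 = rng.poisson(lam1, 100000) + y0
    x2 = rng.poisson(lam2, 100000) + y0
    print(x1.mean(), lam1 + lam0)        # Poisson marginal mean
    print(np.cov(x1, x2)[0, 1], lam0)    # covariance equals the shared rate
    ```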

  4. Structural brain connectivity and cognitive ability differences: A multivariate distance matrix regression analysis.

    PubMed

    Ponsoda, Vicente; Martínez, Kenia; Pineda-Pardo, José A; Abad, Francisco J; Olea, Julio; Román, Francisco J; Barbey, Aron K; Colom, Roberto

    2017-02-01

    Neuroimaging research involves analyses of huge amounts of biological data that might or might not be related to cognition. This relationship is usually approached using univariate methods, and, therefore, correction methods are mandatory for reducing false positives. Nevertheless, the probability of false negatives is also increased. Multivariate frameworks have been proposed to help alleviate this balance. Here we apply multivariate distance matrix regression for the simultaneous analysis of biological and cognitive data, namely, structural connections among 82 brain regions and several latent factors estimating cognitive performance. We tested whether cognitive differences predict distances among individuals regarding their connectivity pattern. Beginning with 3,321 connections among regions, the 36 edges best predicted by the individuals' cognitive scores were selected. Cognitive scores were related to connectivity distances in both the full (3,321) and reduced (36) connectivity patterns. The selected edges connect regions distributed across the entire brain, and the network defined by these edges supports high-order cognitive processes such as (a) (fluid) executive control, (b) (crystallized) recognition, learning, and language processing, and (c) visuospatial processing. This multivariate study suggests that a widespread but limited number of regions in the human brain supports high-level cognitive ability differences. Hum Brain Mapp 38:803-816, 2017. © 2016 Wiley Periodicals, Inc.
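
    The core of multivariate distance matrix regression can be sketched as the McArdle-Anderson pseudo-F computed from a Gower-centred distance matrix and a predictor hat matrix; significance is then typically assessed by permuting rows of the predictors. The data shapes below are illustrative, not those of the study.

    ```python
    import numpy as np

    def mdmr_pseudo_f(D, X):
        """Multivariate distance matrix regression: regress a subject-by-subject
        distance matrix D on predictors X (n x p) and return the
        McArdle-Anderson pseudo-F statistic."""
        n = D.shape[0]
        X = np.column_stack([np.ones(n), X])
        J = np.eye(n) - np.ones((n, n)) / n
        G = -0.5 * J @ (D ** 2) @ J                 # Gower-centred inner products
        H = X @ np.linalg.solve(X.T @ X, X.T)       # hat matrix
        p = X.shape[1] - 1
        num = np.trace(H @ G) / p
        den = np.trace((np.eye(n) - H) @ G) / (n - p - 1)
        return num / den

    rng = np.random.default_rng(0)
    conn = rng.normal(size=(40, 300))               # 40 subjects, 300 edges
    D = np.linalg.norm(conn[:, None] - conn[None, :], axis=2)  # pairwise distances
    scores = rng.normal(size=(40, 3))               # cognitive factor scores
    print(mdmr_pseudo_f(D, scores))
    ```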

  5. Multivariate moment closure techniques for stochastic kinetic models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lakatos, Eszter, E-mail: e.lakatos13@imperial.ac.uk; Ale, Angelique; Kirk, Paul D. W.

    2015-09-07

    Stochastic effects dominate many chemical and biochemical processes. Their analysis, however, can be computationally prohibitively expensive and a range of approximation schemes have been proposed to lighten the computational burden. These, notably the increasingly popular linear noise approximation and the more general moment expansion methods, perform well for many dynamical regimes, especially linear systems. At higher levels of nonlinearity, it comes to an interplay between the nonlinearities and the stochastic dynamics, which is much harder to capture correctly by such approximations to the true stochastic processes. Moment-closure approaches promise to address this problem by capturing higher-order terms of the temporally evolving probability distribution. Here, we develop a set of multivariate moment-closures that allows us to describe the stochastic dynamics of nonlinear systems. Multivariate closure captures the way that correlations between different molecular species, induced by the reaction dynamics, interact with stochastic effects. We use multivariate Gaussian, gamma, and lognormal closure and illustrate their use in the context of two models that have proved challenging to the previous attempts at approximating stochastic dynamics: oscillations in p53 and Hes1. In addition, we consider a larger system, Erk-mediated mitogen-activated protein kinases signalling, where conventional stochastic simulation approaches incur unacceptably high computational costs.

  6. A Robust Bayesian Approach for Structural Equation Models with Missing Data

    ERIC Educational Resources Information Center

    Lee, Sik-Yum; Xia, Ye-Mao

    2008-01-01

    In this paper, normal/independent distributions, including but not limited to the multivariate t distribution, the multivariate contaminated distribution, and the multivariate slash distribution, are used to develop a robust Bayesian approach for analyzing structural equation models with complete or missing data. In the context of a nonlinear…

  7. A simple rapid approach using coupled multivariate statistical methods, GIS and trajectory models to delineate areas of common oil spill risk

    NASA Astrophysics Data System (ADS)

    Guillen, George; Rainey, Gail; Morin, Michelle

    2004-04-01

    Currently, the Minerals Management Service uses the Oil Spill Risk Analysis model (OSRAM) to predict the movement of potential oil spills greater than 1000 bbl originating from offshore oil and gas facilities. OSRAM generates oil spill trajectories using meteorological and hydrological data input from either actual physical measurements or estimates generated from other hydrological models. OSRAM and many other models produce output matrices of average, maximum and minimum contact probabilities to specific landfall or target segments (columns) from oil spills at specific points (rows). Analysts and managers are often interested in identifying geographic areas or groups of facilities that pose similar risks to specific targets or groups of targets if a spill occurred. Unfortunately, due to the potentially large matrix generated by many spill models, this question is difficult to answer without the use of data reduction and visualization methods. In our study we utilized a multivariate statistical method called cluster analysis to group areas of similar risk based on the potential distribution of landfall target trajectory probabilities. We also utilized ArcView™ GIS to display spill launch point groupings. The combination of GIS and multivariate statistical techniques in the post-processing of trajectory model output is a powerful tool for identifying and delineating areas of similar risk from multiple spill sources. We strongly encourage modelers and statistical and GIS software programmers to closely collaborate to produce a more seamless integration of these technologies and approaches to analyzing data. They are complementary methods that strengthen the overall assessment of spill risks.

  8. Scalable Joint Models for Reliable Uncertainty-Aware Event Prediction.

    PubMed

    Soleimani, Hossein; Hensman, James; Saria, Suchi

    2017-08-21

    Missing data and noisy observations pose significant challenges for reliably predicting events from irregularly sampled multivariate time series (longitudinal) data. Imputation methods, which are typically used for completing the data prior to event prediction, lack a principled mechanism to account for the uncertainty due to missingness. Alternatively, state-of-the-art joint modeling techniques can be used for jointly modeling the longitudinal and event data and compute event probabilities conditioned on the longitudinal observations. These approaches, however, make strong parametric assumptions and do not easily scale to multivariate signals with many observations. Our proposed approach consists of several key innovations. First, we develop a flexible and scalable joint model based upon sparse multiple-output Gaussian processes. Unlike state-of-the-art joint models, the proposed model can explain highly challenging structure including non-Gaussian noise while scaling to large data. Second, we derive an optimal policy for predicting events using the distribution of the event occurrence estimated by the joint model. The derived policy trades-off the cost of a delayed detection versus incorrect assessments and abstains from making decisions when the estimated event probability does not satisfy the derived confidence criteria. Experiments on a large dataset show that the proposed framework significantly outperforms state-of-the-art techniques in event prediction.

  9. A New Multivariate Approach in Generating Ensemble Meteorological Forcings for Hydrological Forecasting

    NASA Astrophysics Data System (ADS)

    Khajehei, Sepideh; Moradkhani, Hamid

    2015-04-01

    Producing reliable and accurate hydrologic ensemble forecasts is subject to various sources of uncertainty, including meteorological forcing, initial conditions, model structure, and model parameters. Producing reliable and skillful precipitation ensemble forecasts is one approach to reducing the total uncertainty in hydrological applications. Currently, numerical weather prediction (NWP) models are developing ensemble forecasts for various temporal ranges. It has been shown that raw products from NWP models are biased in mean and spread. Given the above, there is a need for methods that can generate reliable ensemble forecasts for hydrological applications. One common technique is to apply statistical procedures that generate an ensemble forecast from NWP-generated single-value forecasts. The procedure is based on the bivariate probability distribution between the observation and the single-value precipitation forecast. However, one of the assumptions of the current method is the fitting of Gaussian distributions to the marginal distributions of the observed and modeled climate variables. Here, we describe and evaluate a Bayesian approach based on copula functions to develop an ensemble precipitation forecast from the conditional distribution of single-value precipitation forecasts. Copula functions express a multivariate joint distribution in terms of its univariate marginal distributions, and are presented as an alternative procedure for capturing the uncertainties related to meteorological forcing. Copulas are capable of modeling the joint distribution of two variables with any level of correlation and dependency. This study is conducted over a sub-basin in the Columbia River Basin in the USA using monthly precipitation forecasts from the Climate Forecast System (CFS) at 0.5x0.5 deg. spatial resolution to reproduce the observations. The verification is conducted on a different period, and the superiority of the procedure is assessed by comparison with the Ensemble Pre-Processor approach currently used by the National Weather Service River Forecast Centers in the USA.
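
    Under a bivariate Gaussian copula, generating an ensemble from the conditional distribution of the observation given a single-value forecast takes only a few lines. This is a sketch of the general idea, with assumed gamma marginals and a copula correlation `rho` that would in practice be fitted to historical forecast-observation pairs; it is not the authors' exact Bayesian procedure.

    ```python
    import numpy as np
    from scipy import stats

    def copula_ensemble(s_value, f_obs, f_fcst, rho, n_members=50, seed=0):
        """Generate an ensemble from the conditional distribution of the
        observation given a single-value forecast under a bivariate
        Gaussian copula (illustrative sketch)."""
        rng = np.random.default_rng(seed)
        z_s = stats.norm.ppf(f_fcst.cdf(s_value))   # forecast on Gaussian scale
        # conditional Gaussian: Z_obs | Z_fcst = z_s ~ N(rho*z_s, 1 - rho^2)
        z_o = rho * z_s + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n_members)
        return f_obs.ppf(stats.norm.cdf(z_o))       # back to physical units

    # gamma marginals are a common choice for monthly precipitation
    f_obs = stats.gamma(a=2.0, scale=40.0)
    f_fcst = stats.gamma(a=2.5, scale=30.0)
    members = copula_ensemble(s_value=90.0, f_obs=f_obs, f_fcst=f_fcst, rho=0.7)
    print(members.mean(), members.std())
    ```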

  10. Optimal allocation of testing resources for statistical simulations

    NASA Astrophysics Data System (ADS)

    Quintana, Carolina; Millwater, Harry R.; Singh, Gulshan; Golden, Patrick

    2015-07-01

    Statistical estimates from simulation involve uncertainty caused by the variability in the input random variables due to limited data. Allocating resources to obtain more experimental data of the input variables to better characterize their probability distributions can reduce the variance of statistical estimates. The methodology proposed determines the optimal number of additional experiments required to minimize the variance of the output moments given single or multiple constraints. The method uses multivariate t-distribution and Wishart distribution to generate realizations of the population mean and covariance of the input variables, respectively, given an amount of available data. This method handles independent and correlated random variables. A particle swarm method is used for the optimization. The optimal number of additional experiments per variable depends on the number and variance of the initial data, the influence of the variable in the output function and the cost of each additional experiment. The methodology is demonstrated using a fretting fatigue example.
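
    Generating the parameter realizations described above might look as follows; the degrees-of-freedom and scale conventions are assumptions for the sketch, not necessarily the authors' exact choices.

    ```python
    import numpy as np
    from scipy import stats

    def parameter_realizations(data, n_draws=1000, seed=0):
        """Generate realizations of the population covariance (Wishart) and
        mean (multivariate t) reflecting the limited available data
        (sketch; conventions assumed)."""
        rng = np.random.default_rng(seed)
        n, d = data.shape
        xbar = data.mean(axis=0)
        S = np.cov(data.T)
        # Wishart with mean equal to the sample covariance S
        covs = stats.wishart(df=n - 1, scale=S / (n - 1)).rvs(n_draws,
                                                              random_state=rng)
        # multivariate t for the uncertain population mean
        means = stats.multivariate_t(loc=xbar, shape=S / n,
                                     df=n - 1).rvs(n_draws, random_state=rng)
        return means, covs

    data = np.random.default_rng(1).normal(size=(12, 3))  # only 12 experiments
    means, covs = parameter_realizations(data)
    ```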

  11. An Alternative Method for Computing Mean and Covariance Matrix of Some Multivariate Distributions

    ERIC Educational Resources Information Center

    Radhakrishnan, R.; Choudhury, Askar

    2009-01-01

    Computing the mean and covariance matrix of some multivariate distributions, in particular, multivariate normal distribution and Wishart distribution are considered in this article. It involves a matrix transformation of the normal random vector into a random vector whose components are independent normal random variables, and then integrating…

  12. Multivariate Statistical Modelling of Drought and Heat Wave Events

    NASA Astrophysics Data System (ADS)

    Manning, Colin; Widmann, Martin; Vrac, Mathieu; Maraun, Douglas; Bevaqua, Emanuele

    2016-04-01

    Compound extreme events are a combination of two or more contributing events which in themselves may not be extreme but through their joint occurrence produce an extreme impact. Compound events are noted in the latest IPCC report as an important type of extreme event that has been given little attention so far. As part of the CE:LLO project (Compound Events: muLtivariate statisticaL mOdelling) we are developing a multivariate statistical model to gain an understanding of the dependence structure of certain compound events. One focus of this project is on the interaction between drought and heat wave events. Soil moisture has both a local and non-local effect on the occurrence of heat waves, where it strongly controls the latent heat flux affecting the transfer of sensible heat to the atmosphere. These processes can create a feedback whereby a heat wave may be amplified or suppressed by the soil moisture preconditioning, and, vice versa, the heat wave may in turn have an effect on soil conditions. An aim of this project is to capture this dependence in order to correctly describe the joint probabilities of these conditions and the resulting probability of their compound impact. We will show an application of Pair Copula Constructions (PCCs) to study the aforementioned compound event. PCCs allow in theory for the formulation of multivariate dependence structures in any dimension, where the PCC is a decomposition of a multivariate distribution into a product of bivariate components modelled using copulas. A copula is a multivariate distribution function which allows one to model the dependence structure of given variables separately from the marginal behaviour. We first look at the structure of soil moisture drought over the whole of France using the SAFRAN dataset between 1959 and 2009. Soil moisture is represented using the Standardised Precipitation Evapotranspiration Index (SPEI). Drought characteristics are computed at grid-point scale, where drought conditions are identified as those with an SPEI value below -1.0. We model the multivariate dependence structure of drought events defined by certain characteristics and compute return levels of these events. We initially find that drought characteristics such as duration, mean SPEI, and the maximum contiguous area to a grid point all have positive correlations, though the degree to which they are correlated can vary considerably spatially. A spatial representation of return levels may then provide insight into the areas most prone to drought conditions. As a next step, we analyse the dependence structure between soil moisture conditions preceding the onset of a heat wave and the heat wave itself.

  13. Joint pattern of seasonal hydrological droughts and floods alternation in China's Huai River Basin using the multivariate L-moments

    NASA Astrophysics Data System (ADS)

    Wu, ShaoFei; Zhang, Xiang; She, DunXian

    2017-06-01

    Under the current condition of climate change, droughts and floods occur more frequently, and events in which flooding occurs after a prolonged drought or a drought occurs after an extreme flood may have a more severe impact on natural systems and human lives. This challenges the traditional approach wherein droughts and floods are considered separately, which may largely underestimate the risk of the disasters. In our study, the sudden alternation of drought and flood events (ADFEs) between adjacent seasons is studied using multivariate L-moments theory and bivariate copula functions in the Huai River Basin (HRB) of China, with monthly streamflow data at 32 hydrological stations from 1956 to 2012. The dry and wet conditions are characterized by the standardized streamflow index (SSI) at a 3-month time scale. The results show that: (1) The summer streamflow makes the largest contribution to the annual streamflow, followed by the autumn streamflow and spring streamflow. (2) The entire study area can be divided into five homogeneous sub-regions using the multivariate regional homogeneity test. The generalized logistic distribution (GLO) and log-normal distribution (LN3) are acceptable as the optimal marginal distributions under most conditions, and the Frank copula is more appropriate for the spring-summer and summer-autumn SSI series. Continuous flood events dominate at most sites both in spring-summer and summer-autumn (with an average frequency of 13.78% and 17.06%, respectively), while continuous drought events come second (with an average frequency of 11.27% and 13.79%, respectively). Moreover, seasonal ADFEs most probably occurred near the mainstream of the HRB, and drought and flood events are more likely to occur in summer-autumn than in spring-summer.

  14. A Review of Multivariate Distributions for Count Data Derived from the Poisson Distribution

    PubMed Central

    Inouye, David; Yang, Eunho; Allen, Genevera; Ravikumar, Pradeep

    2017-01-01

    The Poisson distribution has been widely studied and used for modeling univariate count-valued data. Multivariate generalizations of the Poisson distribution that permit dependencies, however, have been far less popular. Yet, real-world high-dimensional count-valued data found in word counts, genomics, and crime statistics, for example, exhibit rich dependencies, and motivate the need for multivariate distributions that can appropriately model this data. We review multivariate distributions derived from the univariate Poisson, categorizing these models into three main classes: 1) where the marginal distributions are Poisson, 2) where the joint distribution is a mixture of independent multivariate Poisson distributions, and 3) where the node-conditional distributions are derived from the Poisson. We discuss the development of multiple instances of these classes and compare the models in terms of interpretability and theory. Then, we empirically compare multiple models from each class on three real-world datasets that have varying data characteristics from different domains, namely traffic accident data, biological next generation sequencing data, and text data. These empirical experiments develop intuition about the comparative advantages and disadvantages of each class of multivariate distribution that was derived from the Poisson. Finally, we suggest new research directions as explored in the subsequent discussion section. PMID:28983398

  15. Impact of distributions on the archetypes and prototypes in heterogeneous nanoparticle ensembles.

    PubMed

    Fernandez, Michael; Wilson, Hugh F; Barnard, Amanda S

    2017-01-05

    The magnitude and complexity of the structural and functional data available on nanomaterials requires data analytics, statistical analysis and information technology to drive discovery. We demonstrate that multivariate statistical analysis can recognise the sets of truly significant nanostructures and their most relevant properties in heterogeneous ensembles with different probability distributions. The prototypical and archetypal nanostructures of five virtual ensembles of Si quantum dots (SiQDs) with Boltzmann, frequency, normal, Poisson and random distributions are identified using clustering and archetypal analysis, where we find that their diversity is defined by size and shape, regardless of the type of distribution. At the convex hull of the SiQD ensembles, simple configuration archetypes can efficiently describe a large number of SiQDs, whereas more complex shapes are needed to represent the average ordering of the ensembles. This approach provides a route towards the characterisation of computationally intractable virtual nanomaterial spaces, which can convert big data into smart data, and significantly reduce the workload to simulate experimentally relevant virtual samples.

  16. Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design.

    PubMed

    Lu, Tsui-Shan; Longnecker, Matthew P; Zhou, Haibo

    2017-03-15

    An outcome-dependent sampling (ODS) scheme is a cost-effective sampling scheme in which one observes the exposure with a probability that depends on the outcome. Well-known such designs are the case-control design for binary responses, the case-cohort design for failure-time data, and the general ODS design for a continuous response. While substantial work has been carried out for the univariate response case, statistical inference and design for ODS with multivariate cases remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the multivariate-ODS design is semiparametric, with all the underlying distributions of covariates modeled nonparametrically using empirical likelihood methods. We show that the proposed estimator is consistent and establish its asymptotic normality. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the multivariate-ODS or the estimator from a simple random sample with the same sample size. The multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of the association of polychlorinated biphenyl exposure with hearing loss in children born into the Collaborative Perinatal Study. Copyright © 2016 John Wiley & Sons, Ltd.

  17. Resemblance profiles as clustering decision criteria: Estimating statistical power, error, and correspondence for a hypothesis test for multivariate structure.

    PubMed

    Kilborn, Joshua P; Jones, David L; Peebles, Ernst B; Naar, David F

    2017-04-01

    Clustering data continues to be a highly active area of data analysis, and resemblance profiles are being incorporated into ecological methodologies as a hypothesis testing-based approach to clustering multivariate data. However, these new clustering techniques have not been rigorously tested to determine the performance variability based on the algorithm's assumptions or any underlying data structures. Here, we use simulation studies to estimate the statistical error rates for the hypothesis test for multivariate structure based on dissimilarity profiles (DISPROF). We concurrently tested a widely used algorithm that employs the unweighted pair group method with arithmetic mean (UPGMA) to estimate the proficiency of clustering with DISPROF as a decision criterion. We simulated unstructured multivariate data from different probability distributions with increasing numbers of objects and descriptors, and grouped data with increasing overlap, overdispersion for ecological data, and correlation among descriptors within groups. Using simulated data, we measured the resolution and correspondence of clustering solutions achieved by DISPROF with UPGMA against the reference grouping partitions used to simulate the structured test datasets. Our results highlight the dynamic interactions between dataset dimensionality, group overlap, and the properties of the descriptors within a group (i.e., overdispersion or correlation structure) that are relevant to resemblance profiles as a clustering criterion for multivariate data. These methods are particularly useful for multivariate ecological datasets that benefit from distance-based statistical analyses. We propose guidelines for using DISPROF as a clustering decision tool that will help future users avoid potential pitfalls during the application of methods and the interpretation of results.
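
    The UPGMA step paired with DISPROF can be sketched with SciPy as follows; DISPROF itself is not part of SciPy, and the simulated community matrix and Bray-Curtis dissimilarity are illustrative assumptions.

      import numpy as np
      from scipy.spatial.distance import pdist
      from scipy.cluster.hierarchy import linkage, fcluster

      rng = np.random.default_rng(1)
      abundance = rng.poisson(5.0, size=(30, 12))          # 30 sites x 12 species

      d = pdist(abundance, metric="braycurtis")            # dissimilarity profile input
      tree = linkage(d, method="average")                  # UPGMA clustering
      labels = fcluster(tree, t=3, criterion="maxclust")   # cut tree into 3 groups
      print(labels)

    In the published workflow, the decision of where (and whether) to cut the tree would come from the DISPROF hypothesis test rather than a fixed group count.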

  18. Numerically stable algorithm for combining census and sample estimates with the multivariate composite estimator

    Treesearch

    R. L. Czaplewski

    2009-01-01

    The minimum variance multivariate composite estimator is a relatively simple sequential estimator for complex sampling designs (Czaplewski 2009). Such designs combine a probability sample of expensive field data with multiple censuses and/or samples of relatively inexpensive multi-sensor, multi-resolution remotely sensed data. Unfortunately, the multivariate composite...

  19. Comparative study of probability distribution distances to define a metric for the stability of multi-source biomedical research data.

    PubMed

    Sáez, Carlos; Robles, Montserrat; García-Gómez, Juan Miguel

    2013-01-01

    Research biobanks are often composed of data from multiple sources. In some cases, these different subsets of data may present dissimilarities among their probability density functions (PDFs) due to spatial shifts. This may lead to wrong hypotheses when the data are treated as a whole, and it diminishes the overall quality of the data. With the purpose of developing a generic and comparable metric to assess the stability of multi-source datasets, we have studied the applicability and behaviour of several PDF distances over shifts under different conditions (such as uni- and multivariate data, different types of variable, and multi-modality) which may appear in real biomedical data. Of the studied distances, we found the information-theoretic distances and the Earth Mover's Distance to be the most practical for most conditions. We discuss the properties and usefulness of each distance according to the possible requirements of a general stability metric.
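
    A minimal sketch of two of the studied distances applied to simulated source subsamples with a deliberate shift; the bin edges used for the information-theoretic distance are an illustrative choice.

      import numpy as np
      from scipy.spatial.distance import jensenshannon
      from scipy.stats import wasserstein_distance

      rng = np.random.default_rng(2)
      source_a = rng.normal(0.0, 1.0, 5000)
      source_b = rng.normal(0.5, 1.0, 5000)     # shifted subsample

      bins = np.linspace(-5.0, 5.0, 51)
      p, _ = np.histogram(source_a, bins=bins, density=True)
      q, _ = np.histogram(source_b, bins=bins, density=True)

      print("Jensen-Shannon distance:", jensenshannon(p, q))
      print("Earth Mover's distance :", wasserstein_distance(source_a, source_b))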

  20. A multistate dynamic site occupancy model for spatially aggregated sessile communities

    USGS Publications Warehouse

    Fukaya, Keiichi; Royle, J. Andrew; Okuda, Takehiro; Nakaoka, Masahiro; Noda, Takashi

    2017-01-01

    Estimation of transition probabilities of sessile communities seems easy in principle but may still be difficult in practice because resampling error (i.e. a failure to resample exactly the same location at fixed points) may cause significant estimation bias. Previous studies have developed novel analytical methods to correct for this estimation bias. However, they did not consider the local structure of community composition induced by the aggregated distribution of organisms that is typically observed in sessile assemblages and is very likely to affect observations. We developed a multistate dynamic site occupancy model to estimate transition probabilities that accounts for resampling errors associated with local community structure. The model applies a nonparametric multivariate kernel smoothing methodology to the latent occupancy component to estimate the local state composition near each observation point, which is assumed to determine the probability distribution of data conditional on the occurrence of resampling error. By using computer simulations, we confirmed that an observation process that depends on local community structure may bias inferences about transition probabilities. By applying the proposed model to a real data set of intertidal sessile communities, we also showed that estimates of transition probabilities and of the properties of community dynamics may differ considerably when spatial dependence is taken into account. Results suggest the importance of accounting for resampling error and local community structure in developing management plans that are based on Markovian models. Our approach provides a solution to this problem that is applicable to broad sessile communities. It can even accommodate an anisotropic spatial correlation of species composition, and may also serve as a basis for inferring complex nonlinear ecological dynamics.

  1. A model-based approach to wildland fire reconstruction using sediment charcoal records

    USGS Publications Warehouse

    Itter, Malcolm S.; Finley, Andrew O.; Hooten, Mevin B.; Higuera, Philip E.; Marlon, Jennifer R.; Kelly, Ryan; McLachlan, Jason S.

    2017-01-01

    Lake sediment charcoal records are used in paleoecological analyses to reconstruct fire history, including the identification of past wildland fires. One challenge of applying sediment charcoal records to infer fire history is the separation of charcoal associated with local fire occurrence and charcoal originating from regional fire activity. Despite a variety of methods to identify local fires from sediment charcoal records, an integrated statistical framework for fire reconstruction is lacking. We develop a Bayesian point process model to estimate the probability of fire associated with charcoal counts from individual-lake sediments and estimate mean fire return intervals. A multivariate extension of the model combines records from multiple lakes to reduce uncertainty in local fire identification and estimate a regional mean fire return interval. The univariate and multivariate models are applied to 13 lakes in the Yukon Flats region of Alaska. Both models resulted in similar mean fire return intervals (100–350 years) with reduced uncertainty under the multivariate model due to improved estimation of regional charcoal deposition. The point process model offers an integrated statistical framework for paleofire reconstruction and extends existing methods to infer regional fire history from multiple lake records with uncertainty following directly from posterior distributions.

  2. On measures of association among genetic variables

    PubMed Central

    Gianola, Daniel; Manfredi, Eduardo; Simianer, Henner

    2012-01-01

    Summary Systems involving many variables are important in population and quantitative genetics, for example, in multi-trait prediction of breeding values and in exploration of multi-locus associations. We studied departures of the joint distribution of sets of genetic variables from independence. New measures of association based on notions of statistical distance between distributions are presented. These are more general than correlations, which are pairwise measures, and lack a clear interpretation beyond the bivariate normal distribution. Our measures are based on logarithmic (Kullback-Leibler) and on relative ‘distances’ between distributions. Indexes of association are developed and illustrated for quantitative genetics settings in which the joint distribution of the variables is either multivariate normal or multivariate-t, and we show how the indexes can be used to study linkage disequilibrium in a two-locus system with multiple alleles and present applications to systems of correlated beta distributions. Two multivariate beta and multivariate beta-binomial processes are examined, and new distributions are introduced: the GMS-Sarmanov multivariate beta and its beta-binomial counterpart. PMID:22742500
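
    A minimal sketch of one building block for such indexes: the closed-form Kullback-Leibler divergence between two multivariate normal distributions, here comparing a correlated joint distribution against the independent product of its margins; the covariance matrix is an illustrative assumption.

      import numpy as np

      def kl_mvn(mu0, S0, mu1, S1):
          """KL( N(mu0, S0) || N(mu1, S1) ) in nats."""
          k = len(mu0)
          S1_inv = np.linalg.inv(S1)
          diff = mu1 - mu0
          return 0.5 * (np.trace(S1_inv @ S0) + diff @ S1_inv @ diff - k
                        + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

      S_joint = np.array([[1.0, 0.6],
                          [0.6, 1.0]])           # assumed correlated pair
      S_indep = np.diag(np.diag(S_joint))        # same margins, independence
      mu = np.zeros(2)
      # For bivariate normals this equals -0.5 * ln(1 - rho^2)
      print("departure from independence (nats):", kl_mvn(mu, S_joint, mu, S_indep))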

  3. Stochastic static fault slip inversion from geodetic data with non-negativity and bound constraints

    NASA Astrophysics Data System (ADS)

    Nocquet, J.-M.

    2018-07-01

    Although surface displacements observed by geodesy are linear combinations of slip at faults in an elastic medium, determining the spatial distribution of fault slip remains an ill-posed inverse problem. A widely used approach to circumvent this ill-posedness is to add regularization constraints in terms of smoothing and/or damping so that the linear system becomes invertible. However, the choice of regularization parameters is often arbitrary, and sometimes leads to significantly different results. Furthermore, the resolution analysis is usually empirical and cannot be made independently of the regularization. The stochastic approach to inverse problems provides a rigorous framework where the a priori information about the sought parameters is combined with the observations in order to derive posterior probabilities of the unknown parameters. Here, I investigate an approach where the prior probability density function (pdf) is a multivariate Gaussian function, with single truncation to impose positivity of slip or double truncation to impose positivity and upper bounds on slip for interseismic modelling. I show that the joint posterior pdf is similar to the linear untruncated Gaussian case and can be expressed as a truncated multivariate normal (TMVN) distribution. The TMVN form can then be used to obtain semi-analytical formulae for the single, 2-D or n-D marginal pdf. The semi-analytical formula involves the product of a Gaussian by an integral term that can be evaluated using recent developments in TMVN probability calculations. The posterior mean and covariance can also be efficiently derived. I show that the maximum posterior (MAP) can be obtained using a non-negative least-squares algorithm for the single truncated case or using the bounded-variable least-squares algorithm for the double truncated case. I show that the case of independent uniform priors can be approximated using TMVN. The numerical equivalence to Bayesian inversions using Markov chain Monte Carlo (MCMC) sampling is shown for a synthetic example and a real case of interseismic modelling in Central Peru. The TMVN method overcomes several limitations of the Bayesian approach using MCMC sampling. First, the need for computer power is largely reduced. Second, unlike the Bayesian MCMC-based approach, marginal pdfs, means, variances or covariances are obtained independently of one another. Third, the probability and cumulative density functions can be obtained with any density of points. Finally, determining the MAP is extremely fast.
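
    A minimal sketch of the MAP computation described above, omitting the prior covariance weighting: positivity handled by non-negative least squares and double truncation by bounded-variable least squares; the Green's function matrix, data and bounds are illustrative stand-ins.

      import numpy as np
      from scipy.optimize import nnls, lsq_linear

      rng = np.random.default_rng(3)
      G = rng.normal(size=(20, 5))                 # elastic Green's functions (assumed)
      slip_true = np.array([0.0, 1.2, 0.4, 0.0, 2.0])
      d = G @ slip_true + 0.05 * rng.normal(size=20)

      slip_pos, _ = nnls(G, d)                     # single truncation: slip >= 0
      res = lsq_linear(G, d, bounds=(0.0, 1.5))    # double truncation: 0 <= slip <= 1.5
      print("non-negative MAP:", np.round(slip_pos, 3))
      print("bounded MAP     :", np.round(res.x, 3))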

  4. Multivariate Normal Tissue Complication Probability Modeling of Heart Valve Dysfunction in Hodgkin Lymphoma Survivors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cella, Laura, E-mail: laura.cella@cnr.it; Department of Advanced Biomedical Sciences, Federico II University School of Medicine, Naples; Liuzzi, Raffaele

    Purpose: To establish a multivariate normal tissue complication probability (NTCP) model for radiation-induced asymptomatic heart valvular defects (RVD). Methods and Materials: Fifty-six patients treated with sequential chemoradiation therapy for Hodgkin lymphoma (HL) were retrospectively reviewed for RVD events. Clinical information along with whole heart, cardiac chambers, and lung dose distribution parameters was collected, and the correlations to RVD were analyzed by means of Spearman's rank correlation coefficient (Rs). For the selection of the model order and parameters for NTCP modeling, a multivariate logistic regression method using resampling techniques (bootstrapping) was applied. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC). Results: When we analyzed the whole heart, a 3-variable NTCP model including the maximum dose, whole heart volume, and lung volume was shown to be the optimal predictive model for RVD (Rs = 0.573, P<.001, AUC = 0.83). When we analyzed the cardiac chambers individually, for the left atrium and for the left ventricle, an NTCP model based on 3 variables including the percentage volume exceeding 30 Gy (V30), cardiac chamber volume, and lung volume was selected as the most predictive model (Rs = 0.539, P<.001, AUC = 0.83; and Rs = 0.557, P<.001, AUC = 0.82, respectively). The NTCP values increase as heart maximum dose or cardiac chambers V30 increase. They also increase with larger volumes of the heart or cardiac chambers and decrease when lung volume is larger. Conclusions: We propose logistic NTCP models for RVD considering not only heart irradiation dose but also the combined effects of lung and heart volumes. Our study establishes the statistical evidence of the indirect effect of lung size on radio-induced heart toxicity.
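
    A minimal sketch, not the study's code or data, of the modeling recipe described above: a 3-variable logistic NTCP model with a bootstrap estimate of the AUC; the simulated cohort and predictor scales are assumptions.

      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import roc_auc_score

      rng = np.random.default_rng(4)
      n = 56
      X = np.column_stack([rng.normal(30, 8, n),     # heart maximum dose (Gy), assumed
                           rng.normal(600, 90, n),   # heart volume (cm3), assumed
                           rng.normal(3.5, 0.6, n)]) # lung volume (L), assumed
      logit = -8.0 + 0.15 * X[:, 0] + 0.005 * X[:, 1] - 0.5 * X[:, 2]
      y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))   # simulated RVD events

      aucs = []
      for _ in range(200):                           # bootstrap resampling
          idx = rng.integers(0, n, n)
          if len(np.unique(y[idx])) < 2:
              continue                               # skip single-class resamples
          m = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
          aucs.append(roc_auc_score(y[idx], m.predict_proba(X[idx])[:, 1]))
      print(f"bootstrap AUC: {np.mean(aucs):.2f} +/- {np.std(aucs):.2f}")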

  5. A framework for conducting mechanistic based reliability assessments of components operating in complex systems

    NASA Astrophysics Data System (ADS)

    Wallace, Jon Michael

    2003-10-01

    Reliability prediction of components operating in complex systems has historically been conducted in a statistically isolated manner. Current physics-based, i.e. mechanistic, component reliability approaches focus more on component-specific attributes and mathematical algorithms and not enough on the influence of the system. The result is that significant error can be introduced into the component reliability assessment process. The objective of this study is the development of a framework that infuses the needs and influence of the system into the process of conducting mechanistic-based component reliability assessments. The formulated framework consists of six primary steps. The first three steps, identification, decomposition, and synthesis, are primarily qualitative in nature and employ system reliability and safety engineering principles to construct an appropriate starting point for the component reliability assessment. The following two steps are the most unique. They involve a step to efficiently characterize and quantify the system-driven local parameter space and a subsequent step using this information to guide the reduction of the component parameter space. The local statistical space quantification step is accomplished using two proposed multivariate probability models: Multi-Response First Order Second Moment and Taylor-Based Inverse Transformation. Where existing joint probability models require preliminary distribution and correlation information of the responses, these models combine statistical information of the input parameters with an efficient sampling of the response analyses to produce the multi-response joint probability distribution. Parameter space reduction is accomplished using Approximate Canonical Correlation Analysis (ACCA) employed as a multi-response screening technique. The novelty of this approach is that each individual local parameter and even subsets of parameters representing entire contributing analyses can now be rank ordered with respect to their contribution to not just one response, but the entire vector of component responses simultaneously. The final step of the framework is the actual probabilistic assessment of the component. Although the same multivariate probability tools employed in the characterization step can be used for the component probability assessment, variations of this final step are given to allow for the utilization of existing probabilistic methods such as response surface Monte Carlo and Fast Probability Integration. The overall framework developed in this study is implemented to assess the finite-element based reliability prediction of a gas turbine airfoil involving several failure responses. Results of this implementation are compared to results generated using the conventional 'isolated' approach as well as a validation approach conducted through large sample Monte Carlo simulations. The framework resulted in a considerable improvement to the accuracy of the part reliability assessment and an improved understanding of the component failure behavior. Considerable statistical complexity in the form of joint non-normal behavior was found and accounted for using the framework. Future applications of the framework elements are discussed.

  6. Multivariate Density Estimation and Remote Sensing

    NASA Technical Reports Server (NTRS)

    Scott, D. W.

    1983-01-01

    Current efforts to develop methods and computer algorithms to effectively represent multivariate data commonly encountered in remote sensing applications are described. While this may involve scatter diagrams, the emphasis is on multivariate representations of nonparametric probability density estimates. The density function provides a useful graphical tool for looking at data and a useful theoretical tool for classification. The approach is illustrated with a thunderstorm data analysis.

  7. Extracting chemical information from high-resolution Kβ X-ray emission spectroscopy

    NASA Astrophysics Data System (ADS)

    Limandri, S.; Robledo, J.; Tirao, G.

    2018-06-01

    High-resolution X-ray emission spectroscopy allows studying the chemical environment of a wide variety of materials. Chemical information can be obtained by fitting the X-ray spectra and observing the behavior of certain spectral features. Spectral changes can also be quantified by means of statistical parameters calculated by considering the spectrum as a probability distribution. Another possibility is to perform multivariate statistical analysis, such as principal component analysis. In this work, the performance of these procedures for extracting chemical information from X-ray emission spectra of mixtures of Mn2+ and Mn4+ oxides is studied. A detailed analysis of the parameters obtained, as well as the associated uncertainties, is shown. The methodologies are also applied to Mn oxidation state characterization of the double perovskite oxides Ba1+xLa1-xMnSbO6 (with 0 ≤ x ≤ 0.7). The results show that statistical parameters and multivariate analysis are the most suitable for the analysis of this kind of spectra.
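
    A minimal sketch of the multivariate route: principal component analysis applied to a stack of synthetic two-component spectra standing in for the Mn2+/Mn4+ Kβ data; peak positions, widths and noise level are illustrative assumptions.

      import numpy as np
      from sklearn.decomposition import PCA

      rng = np.random.default_rng(5)
      energy = np.linspace(0.0, 1.0, 200)
      ref_a = np.exp(-0.5 * ((energy - 0.45) / 0.03) ** 2)   # Mn2+-like peak (assumed)
      ref_b = np.exp(-0.5 * ((energy - 0.55) / 0.03) ** 2)   # Mn4+-like peak (assumed)

      fracs = rng.uniform(0.0, 1.0, 40)                      # mixing fractions
      spectra = np.outer(fracs, ref_a) + np.outer(1.0 - fracs, ref_b)
      spectra += 0.01 * rng.normal(size=spectra.shape)       # measurement noise

      pca = PCA(n_components=3).fit(spectra)
      print("explained variance ratios:", np.round(pca.explained_variance_ratio_, 3))
      # For a two-component chemical system, the leading score should track the
      # oxidation-state fraction.
      scores = pca.transform(spectra)[:, 0]
      print("corr(PC1 score, fraction):", np.corrcoef(scores, fracs)[0, 1].round(3))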

  8. Multi-variate joint PDF for non-Gaussianities: exact formulation and generic approximations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Verde, Licia; Jimenez, Raul; Alvarez-Gaume, Luis

    2013-06-01

    We provide an exact expression for the multi-variate joint probability distribution function of non-Gaussian fields primordially arising from local transformations of a Gaussian field. This kind of non-Gaussianity is generated in many models of inflation. We apply our expression to the non-Gaussianity estimation from Cosmic Microwave Background maps and the halo mass function where we obtain analytical expressions. We also provide analytic approximations and their range of validity. For the Cosmic Microwave Background we give a fast way to compute the PDF which is valid up to more than 7σ for fNL values (both true and sampled) not ruled out by current observations, which consists of expressing the PDF as a combination of the bispectrum and trispectrum of the temperature maps. The resulting expression is valid for any kind of non-Gaussianity and is not limited to the local type. The above results may serve as the basis for a fully Bayesian analysis of the non-Gaussianity parameter.
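
    A minimal sketch of the generating mechanism the abstract starts from, a local transformation of a Gaussian field, using the standard quadratic local model; the field size and the fNL amplitude are illustrative.

      import numpy as np

      rng = np.random.default_rng(6)
      phi = rng.normal(0.0, 1.0, size=1_000_000)   # Gaussian field values
      fnl = 0.05                                   # illustrative amplitude

      Phi = phi + fnl * (phi**2 - 1.0)             # local-type non-Gaussian field

      skew = np.mean((Phi - Phi.mean())**3) / Phi.std()**3
      print("induced skewness:", round(skew, 3))   # ~ 6*fNL at leading order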

  9. Bayesian inference on risk differences: an application to multivariate meta-analysis of adverse events in clinical trials.

    PubMed

    Chen, Yong; Luo, Sheng; Chu, Haitao; Wei, Peng

    2013-05-01

    Multivariate meta-analysis is useful for combining evidence from independent studies that involve several comparisons among groups based on a single outcome. For binary outcomes, the commonly used statistical models for multivariate meta-analysis are multivariate generalized linear mixed effects models which assume risks, after some transformation, follow a multivariate normal distribution with possible correlations. In this article, we consider an alternative model for multivariate meta-analysis where the risks are modeled by the multivariate beta distribution proposed by Sarmanov (1966). This model has several attractive features compared with the conventional multivariate generalized linear mixed effects models, including a simple likelihood function, no need to specify a link function, and a closed-form expression for the distribution functions of study-specific risk differences. We investigate the finite sample performance of this model by simulation studies and illustrate its use with an application to multivariate meta-analysis of adverse events of tricyclic antidepressant treatment in clinical trials.

  10. A multivariate geostatistical methodology to delineate areas of potential interest for future sedimentary gold exploration.

    PubMed

    Goovaerts, P; Albuquerque, Teresa; Antunes, Margarida

    2016-11-01

    This paper describes a multivariate geostatistical methodology to delineate areas of potential interest for future sedimentary gold exploration, with an application to an abandoned sedimentary gold mining region in Portugal. The main challenge was the existence of only a dozen gold measurements confined to the grounds of the old gold mines, which precluded the application of traditional interpolation techniques, such as cokriging. The analysis could, however, capitalize on 376 stream sediment samples that were analyzed for twenty two elements. Gold (Au) was first predicted at all 376 locations using linear regression (R² = 0.798) and four metals (Fe, As, Sn and W), which are known to be mostly associated with the local gold's paragenesis. One hundred realizations of the spatial distribution of gold content were generated using sequential indicator simulation and a soft indicator coding of regression estimates, to supplement the hard indicator coding of gold measurements. Each simulated map then underwent a local cluster analysis to identify significant aggregates of low or high values. The one hundred classified maps were processed to derive the most likely classification of each simulated node and the associated probability of occurrence. Examining the distribution of the hot-spots and cold-spots reveals a clear enrichment in Au along the Erges River downstream from the old sedimentary mineralization.
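
    A minimal sketch of the first step of the workflow: ordinary least-squares regression of Au on the four pathfinder elements; the element concentrations are simulated, not the paper's data.

      import numpy as np

      rng = np.random.default_rng(7)
      n = 376
      X = rng.lognormal(0.0, 0.5, size=(n, 4))           # Fe, As, Sn, W (assumed)
      au = X @ np.array([0.4, 0.8, 0.3, 0.5]) + 0.2 * rng.normal(size=n)

      A = np.column_stack([np.ones(n), X])               # add intercept column
      coef, *_ = np.linalg.lstsq(A, au, rcond=None)
      au_hat = A @ coef
      r2 = 1.0 - np.sum((au - au_hat) ** 2) / np.sum((au - au.mean()) ** 2)
      print("R^2 =", round(r2, 3))                       # the paper reports 0.798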

  11. Stability metrics for multi-source biomedical data based on simplicial projections from probability distribution distances.

    PubMed

    Sáez, Carlos; Robles, Montserrat; García-Gómez, Juan M

    2017-02-01

    Biomedical data may be composed of individuals generated from distinct, meaningful sources. Due to possible contextual biases in the processes that generate data, there may exist an undesirable and unexpected variability among the probability distribution functions (PDFs) of the source subsamples, which, when uncontrolled, may lead to inaccurate or unreproducible research results. Classical statistical methods may have difficulty uncovering such variability when dealing with multi-modal, multi-type, multivariate data. This work proposes two metrics for the analysis of stability among multiple data sources, robust to the aforementioned conditions, and defined in the context of data quality assessment. Specifically, a global probabilistic deviation metric and a source probabilistic outlyingness metric are proposed. The first provides a bounded degree of the global multi-source variability, designed as an estimator equivalent to the notion of normalized standard deviation of PDFs. The second provides a bounded degree of the dissimilarity of each source to a latent central distribution. The metrics are based on the projection of a simplex geometrical structure constructed from the Jensen-Shannon distances among the sources' PDFs. The metrics have been evaluated and have demonstrated their correct behaviour on a simulated benchmark and with real multi-source biomedical data using the UCI Heart Disease data set. Biomedical data quality assessment based on the proposed stability metrics may improve the efficiency and effectiveness of biomedical data exploitation and research.

  12. Distributions and fate of chlorinated pesticides, biomarkers and polycyclic aromatic hydrocarbons in sediments along a contamination gradient from a point-source in San Francisco Bay, California

    USGS Publications Warehouse

    Pereira, W.E.; Hostettler, F.D.; Rapp, J.B.

    1996-01-01

    The distribution and fate of chlorinated pesticides, biomarkers, and polycyclic aromatic hydrocarbons (PAHs) in surficial sediments along a contamination gradient in the Lauritzen Canal and Richmond Harbor in San Francisco Bay was investigated. Compounds were identified and quantified using gas chromatography-ion trap mass spectrometry. Biomarkers and PAHs were derived primarily from weathered petroleum. DDT was reductively dechlorinated under anoxic conditions to DDD and several minor degradation products, DDMU, DDMS, and DDNU. Under aerobic conditions, DDT was dehydrochlorinated to DDE and DBP. Aerobic degradation of DDT was diminished or inhibited in zones of high concentration, and increased significantly in zones of lower concentration. Other chlorinated pesticides identified in sediment included dieldrin and chlordane isomers. Multivariate analysis of the distributions of the DDTs suggested that there are probably two sources of DDD. In addition, DDE and DDMU are probably formed by similar mechanisms, i.e. dehydrochlorination. A steep concentration gradient existed from the Canal to the Outer Richmond Harbor, but higher levels of DDD than those found in the remainder of the Bay indicated that these contaminants are transported on particulates and colloidal organic matter from this source into San Francisco Bay. Chlorinated pesticides and PAHs may pose a potential problem to biota in San Francisco Bay.

  13. Probabilistic Modeling of the Renal Stone Formation Module

    NASA Technical Reports Server (NTRS)

    Best, Lauren M.; Myers, Jerry G.; Goodenow, Debra A.; McRae, Michael P.; Jackson, Travis C.

    2013-01-01

    The Integrated Medical Model (IMM) is a probabilistic tool used in mission planning decision making and medical systems risk assessments. The IMM project maintains a database of over 80 medical conditions that could occur during a spaceflight, documenting an incidence rate and end case scenarios for each. In some cases, where observational data are insufficient to adequately define the inflight medical risk, the IMM utilizes external probabilistic modules to model and estimate the event likelihoods. One such medical event of interest is an unpassed renal stone. Due to a high salt diet and high concentrations of calcium in the blood (due to bone depletion caused by unloading in the microgravity environment), astronauts are at a considerably elevated risk of developing renal calculi (nephrolithiasis) while in space. The lack of observed incidences of nephrolithiasis has led the HRP to initiate the development of the Renal Stone Formation Module (RSFM) to create a probabilistic simulator capable of estimating the likelihood of symptomatic renal stone presentation in astronauts on exploration missions. The model consists of two major parts. The first is the probabilistic component, which utilizes probability distributions to assess the range of urine electrolyte parameters and a multivariate regression to transform estimated crystal density and size distributions into the likelihood of the presentation of nephrolithiasis symptoms. The second is a deterministic physical and chemical model of renal stone growth in the kidney developed by Kassemi et al. The probabilistic component of the renal stone model couples the input probability distributions describing the urine chemistry, astronaut physiology, and system parameters with the physical and chemical outputs and inputs to the deterministic stone growth model. These two parts of the model are necessary to capture the uncertainty in the likelihood estimate. The model will be driven by Monte Carlo simulations, continuously randomly sampling the probability distributions of the electrolyte concentrations and system parameters that are inputs into the deterministic model. The total urine chemistry concentrations are used to determine the urine chemistry activity using the Joint Expert Speciation System (JESS), a biochemistry model. Information used from JESS is then fed into the deterministic growth model. Outputs from JESS and the deterministic model are passed back to the probabilistic model where a multivariate regression is used to assess the likelihood of a stone forming and the likelihood of a stone requiring clinical intervention. The parameters used to quantify these risks include: relative supersaturation (RS) of calcium oxalate, citrate/calcium ratio, crystal number density, total urine volume, pH, magnesium excretion, maximum stone width, and ureteral location. Methods and Validation: The RSFM is designed to perform a Monte Carlo simulation to generate probability distributions of clinically significant renal stones, as well as provide an associated uncertainty in the estimate. Initially, early versions will be used to test integration of the components and assess component validation and verification (V&V), with later versions used to address questions regarding design reference mission scenarios. Once integrated with the deterministic component, the credibility assessment of the integrated model will follow NASA STD 7009 requirements.

  14. Developing a Hierarchical Model for the Spatial Analysis of PM10 Pollution Extremes in the Mexico City Metropolitan Area.

    PubMed

    Aguirre-Salado, Alejandro Ivan; Vaquera-Huerta, Humberto; Aguirre-Salado, Carlos Arturo; Reyes-Mora, Silvia; Olvera-Cervantes, Ana Delia; Lancho-Romero, Guillermo Arturo; Soubervielle-Montalvo, Carlos

    2017-07-06

    We implemented a spatial model for analysing PM10 maxima across the Mexico City metropolitan area during the period 1995-2016. We assumed that these maxima follow a non-identical generalized extreme value (GEV) distribution and modeled the trend by introducing multivariate smoothing spline functions into the GEV probability distribution. A flexible, three-stage hierarchical Bayesian approach was developed to analyse the distribution of the PM10 maxima in space and time. We evaluated the statistical model's performance by using a simulation study. The results showed strong evidence of a positive correlation between the PM10 maxima and the longitude and latitude. The relationship between time and the PM10 maxima was negative, indicating a decreasing trend over time. Finally, a high risk of PM10 maxima presenting levels above 1000 μg/m³ (return period: 25 yr) was observed in the northwestern region of the study area.
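
    A minimal sketch of the univariate building block: fitting a GEV to simulated annual maxima at a single site and reading off the 25-year return level; note that SciPy's genextreme shape parameter has the opposite sign to the usual GEV convention.

      import numpy as np
      from scipy.stats import genextreme

      rng = np.random.default_rng(8)
      annual_maxima = genextreme.rvs(c=-0.1, loc=300.0, scale=80.0,
                                     size=22, random_state=rng)

      c, loc, scale = genextreme.fit(annual_maxima)
      level_25yr = genextreme.ppf(1.0 - 1.0 / 25.0, c, loc=loc, scale=scale)
      print(f"25-yr return level: {level_25yr:.0f} ug/m3")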

  15. Developing a Hierarchical Model for the Spatial Analysis of PM10 Pollution Extremes in the Mexico City Metropolitan Area

    PubMed Central

    Aguirre-Salado, Alejandro Ivan; Vaquera-Huerta, Humberto; Aguirre-Salado, Carlos Arturo; Reyes-Mora, Silvia; Olvera-Cervantes, Ana Delia; Lancho-Romero, Guillermo Arturo; Soubervielle-Montalvo, Carlos

    2017-01-01

    We implemented a spatial model for analysing PM10 maxima across the Mexico City metropolitan area during the period 1995–2016. We assumed that these maxima follow a non-identical generalized extreme value (GEV) distribution and modeled the trend by introducing multivariate smoothing spline functions into the probability GEV distribution. A flexible, three-stage hierarchical Bayesian approach was developed to analyse the distribution of the PM10 maxima in space and time. We evaluated the statistical model’s performance by using a simulation study. The results showed strong evidence of a positive correlation between the PM10 maxima and the longitude and latitude. The relationship between time and the PM10 maxima was negative, indicating a decreasing trend over time. Finally, a high risk of PM10 maxima presenting levels above 1000 μg/m3 (return period: 25 yr) was observed in the northwestern region of the study area. PMID:28684720

  16. A stochastic storm surge generator for the German North Sea and the multivariate statistical assessment of the simulation results

    NASA Astrophysics Data System (ADS)

    Wahl, Thomas; Jensen, Jürgen; Mudersbach, Christoph

    2010-05-01

    Storm surges along the German North Sea coastline have led to major damage in the past, and the risk of inundation is expected to increase in the course of ongoing climate change. Knowledge of the characteristics of possible storm surges is essential for performing integrated risk analyses, e.g. based on the source-pathway-receptor concept. The latter includes storm surge simulation/analyses (source), modelling of dike/dune breach scenarios (pathway) and the quantification of potential losses (receptor). In subproject 1b of the German joint research project XtremRisK (www.xtremrisk.de), a stochastic storm surge generator for the south-eastern North Sea area is developed. The input data for the multivariate model are high-resolution sea level observations from tide gauges during extreme events. Based on 25 parameters (19 sea level parameters and 6 time parameters), observed storm surge hydrographs consisting of three tides are parameterised. After fitting common parametric probability distributions and running a large number of Monte Carlo simulations, the final reconstruction leads to a set of 100,000 (default) synthetic storm surge events with a one-minute resolution. Such a data set can potentially serve as the basis for a large number of applications. For risk analyses, storm surges with peak water levels exceeding the design water levels are of special interest. The occurrence probabilities of the simulated extreme events are estimated based on multivariate statistics, considering the parameters "peak water level" and "fullness/intensity". In the past, most studies considered only the peak water levels during extreme events, which might not be the most important parameter in all cases. Here, a 2D Archimedean copula model is used to estimate the joint probabilities of the selected parameters, capturing the dependence structure separately from the margins. In coordination with subproject 1a, the results will be used as input for the XtremRisK subprojects 2 to 4. The project is funded by the German Federal Ministry of Education and Research (BMBF) (Project No. 03 F 0483 B).

  17. A Nonlinear Framework of Delayed Particle Smoothing Method for Vehicle Localization under Non-Gaussian Environment.

    PubMed

    Xiao, Zhu; Havyarimana, Vincent; Li, Tong; Wang, Dong

    2016-05-13

    In this paper, a novel nonlinear smoothing framework, the non-Gaussian delayed particle smoother (nGDPS), is proposed, which enables vehicle state estimation (VSE) with high accuracy while taking into account the non-Gaussianity of the measurement and process noises. Within the proposed method, the multivariate Student's t-distribution is adopted to compute the probability distribution function (PDF) of the process and measurement noises, which are assumed to be non-Gaussian distributed. A computation approach based on the Ensemble Kalman Filter (EnKF) is designed to cope with the mean and the covariance matrix of the proposal non-Gaussian distribution. A delayed Gibbs sampling algorithm, which incorporates smoothing of the sampled trajectories over a fixed delay, is proposed to deal with the sample degeneracy of particles. The performance is investigated using real-world data collected by low-cost on-board vehicle sensors. The comparison study based on the real-world experiments and the statistical analysis demonstrates that the proposed nGDPS yields a significant improvement in vehicle state accuracy and outperforms existing filtering and smoothing methods.
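
    A minimal sketch of the adopted noise model: sampling the multivariate Student's t as a Gaussian scale mixture with a chi-square mixing variable; the dimension, degrees of freedom and scale matrix are illustrative assumptions.

      import numpy as np

      rng = np.random.default_rng(9)
      nu, dim, n = 5.0, 3, 10_000                  # degrees of freedom (assumed)
      scale = np.array([[1.0, 0.3, 0.0],
                        [0.3, 1.0, 0.2],
                        [0.0, 0.2, 1.0]])          # assumed noise scale matrix

      L = np.linalg.cholesky(scale)
      z = rng.normal(size=(n, dim)) @ L.T          # correlated Gaussian draws
      w = rng.chisquare(nu, size=(n, 1)) / nu
      samples = z / np.sqrt(w)                     # heavy-tailed t samples

      x = samples[:, 0]
      excess_kurtosis = np.mean((x - x.mean())**4) / x.var()**2 - 3.0
      print("excess kurtosis (Gaussian would be ~0):", round(excess_kurtosis, 2))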

  18. Water shortage risk assessment considering large-scale regional transfers: a copula-based uncertainty case study in Lunan, China.

    PubMed

    Gao, Xueping; Liu, Yinzhu; Sun, Bowen

    2018-06-05

    The risk of water shortage caused by uncertainties, such as frequent drought, varied precipitation, multiple water resources, and different water demands, brings new challenges to water transfer projects. Uncertainties exist in both transferred water and local surface water; therefore, the relationship between them should be thoroughly studied to prevent water shortage. For more effective water management, an uncertainty-based water shortage risk assessment model (UWSRAM) is developed to study the combined effect of multiple water resources and analyze the shortage degree under uncertainty. The UWSRAM combines copula-based Monte Carlo stochastic simulation with a chance-constrained programming-stochastic multiobjective optimization model, using the Lunan water-receiving area in China as an example. Statistical copula functions are employed to estimate the joint probability of available transferred water and local surface water, and samples from this multivariate probability distribution are used as inputs for the optimization model. The approach reveals the distribution of water shortage, emphasizes the importance of improving and updating the management of transferred water and local surface water, and examines their combined influence on water shortage risk assessment. The possible available water and shortages can be calculated by applying the UWSRAM, along with the corresponding allocation measures under different water availability levels and violation probabilities. The UWSRAM is valuable for understanding the overall multi-source water availability and the degree of water shortage, adapting to the uncertainty surrounding water resources, establishing effective water resource planning policies for managers and achieving sustainable development.

  19. Parameter estimation of multivariate multiple regression model using bayesian with non-informative Jeffreys’ prior distribution

    NASA Astrophysics Data System (ADS)

    Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.

    2018-05-01

    The Bayesian method can be used to estimate the parameters of the multivariate multiple regression model. It involves two distributions: the prior and the posterior. The posterior distribution is influenced by the selection of the prior distribution. Jeffreys' prior distribution is a kind of non-informative prior distribution, used when no prior information about the parameters is available. The non-informative Jeffreys' prior distribution is combined with the sample information, resulting in the posterior distribution, which is then used to estimate the parameters. The purpose of this research is to estimate the parameters of the multivariate regression model using the Bayesian method with the non-informative Jeffreys' prior distribution. The estimates of β and Σ are obtained as the expected values of the respective marginal posterior distributions, which are multivariate normal and inverse Wishart. However, calculating these expected values involves integrals that are difficult to evaluate analytically. Therefore, random samples are generated according to the posterior distribution of each parameter using the Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.
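
    A minimal sketch of the sampler the abstract ends with: a Gibbs scheme for Y = XB + E under the Jeffreys prior, alternating an inverse-Wishart draw for Σ with a matrix-normal draw for B; the simulated data and chain settings are illustrative assumptions.

      import numpy as np
      from scipy.stats import invwishart

      rng = np.random.default_rng(10)
      n, p, d = 200, 3, 2                          # observations, predictors, responses
      X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
      B_true = np.array([[1.0, -0.5], [0.8, 0.2], [-0.3, 0.6]])
      Y = X @ B_true + rng.normal(0.0, 0.5, size=(n, d))

      XtX_inv = np.linalg.inv(X.T @ X)
      B_hat = XtX_inv @ X.T @ Y                    # OLS estimate = posterior centre
      Lu = np.linalg.cholesky(XtX_inv)

      B, draws = B_hat.copy(), []
      for it in range(2000):
          # Sigma | B, Y ~ inverse-Wishart(n, residual cross-product) under Jeffreys
          resid = Y - X @ B
          Sigma = invwishart.rvs(df=n, scale=resid.T @ resid, random_state=rng)
          # B | Sigma, Y ~ matrix normal(B_hat, (X'X)^{-1}, Sigma)
          Z = rng.normal(size=(p, d))
          B = B_hat + Lu @ Z @ np.linalg.cholesky(Sigma).T
          if it >= 500:                            # discard burn-in
              draws.append(B)
      print("posterior mean of B:\n", np.mean(draws, axis=0).round(2))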

  20. Eigenvalue statistics for the sum of two complex Wishart matrices

    NASA Astrophysics Data System (ADS)

    Kumar, Santosh

    2014-09-01

    The sum of independent Wishart matrices, taken from distributions with unequal covariance matrices, plays a crucial role in multivariate statistics, and has applications in the fields of quantitative finance and telecommunication. However, analytical results concerning the corresponding eigenvalue statistics have remained unavailable, even for the sum of two Wishart matrices. This can be attributed to the complicated and rotationally noninvariant nature of the matrix distribution that makes extracting the information about eigenvalues a nontrivial task. Using a generalization of the Harish-Chandra-Itzykson-Zuber integral, we find an exact solution to this problem for the complex Wishart case when one of the covariance matrices is proportional to the identity matrix, while the other is arbitrary. We derive exact and compact expressions for the joint probability density and marginal density of eigenvalues. The analytical results are compared with numerical simulations and we find perfect agreement.
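
    A minimal numerical sketch of the setting, not the paper's analytical solution: eigenvalue samples of the sum of two complex Wishart matrices, with one covariance proportional to the identity and the other arbitrary; matrix sizes are illustrative.

      import numpy as np

      rng = np.random.default_rng(11)
      dim, n1, n2, trials = 4, 8, 8, 2000

      def complex_wishart(cov, n):
          L = np.linalg.cholesky(cov)
          A = (rng.normal(size=(dim, n)) + 1j * rng.normal(size=(dim, n))) / np.sqrt(2.0)
          A = L @ A
          return A @ A.conj().T

      cov1 = 2.0 * np.eye(dim)                     # proportional to identity
      cov2 = np.diag([0.5, 1.0, 1.5, 2.5])         # arbitrary covariance

      eigs = np.concatenate([np.linalg.eigvalsh(complex_wishart(cov1, n1)
                                                + complex_wishart(cov2, n2))
                             for _ in range(trials)])
      # The histogram of `eigs` approximates the marginal eigenvalue density.
      print("mean eigenvalue:", eigs.mean().round(2))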

  1. A comparison of portfolio selection models via application on ISE 100 index data

    NASA Astrophysics Data System (ADS)

    Altun, Emrah; Tatlidil, Hüseyin

    2013-10-01

    The Markowitz model, a classical approach to the portfolio optimization problem, relies on two important assumptions: the expected returns are multivariate normally distributed and the investor is risk averse. However, this model has not been extensively used in finance. Empirical results show that it is very hard to solve large-scale portfolio optimization problems with the Mean-Variance (M-V) model. An alternative, the Mean Absolute Deviation (MAD) model proposed by Konno and Yamazaki [7], has been used to remove most of the difficulties of the Markowitz Mean-Variance model. The MAD model does not require the rates of return to be normally distributed and is based on linear programming. Another alternative portfolio model is the Mean-Lower Semi Absolute Deviation (M-LSAD) model, proposed by Speranza [3]. We compare these models to determine which gives the more appropriate solution for investors.
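
    A minimal sketch of the MAD model as a linear program, following the linearization idea of Konno and Yamazaki rather than their original code; the return scenarios and the target mean return are illustrative assumptions.

      import numpy as np
      from scipy.optimize import linprog

      rng = np.random.default_rng(12)
      T, m = 250, 5                                # return scenarios, assets
      R = rng.normal(0.0005, 0.01, size=(T, m))    # simulated daily returns (assumed)
      mu = R.mean(axis=0)
      A = R - mu                                   # centered scenario matrix
      rho = 0.8 * mu.max()                         # attainable target mean (assumed)

      # Variables z = [x_1..x_m, u_1..u_T]; minimize (1/T) * sum(u_t), where
      # u_t >= |sum_j (r_jt - mu_j) x_j| is linearized as two inequalities.
      c = np.concatenate([np.zeros(m), np.full(T, 1.0 / T)])
      I = np.eye(T)
      A_ub = np.vstack([np.hstack([A, -I]),        #  A x - u <= 0
                        np.hstack([-A, -I]),       # -A x - u <= 0
                        np.concatenate([-mu, np.zeros(T)])[None, :]])  # mean >= rho
      b_ub = np.concatenate([np.zeros(2 * T), [-rho]])
      A_eq = np.concatenate([np.ones(m), np.zeros(T)])[None, :]        # budget = 1
      res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                    bounds=[(0, None)] * (m + T))  # long-only weights
      print("weights:", np.round(res.x[:m], 3), " MAD:", round(res.fun, 6))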

  2. Multivariate stochastic simulation with subjective multivariate normal distributions

    Treesearch

    P. J. Ince; J. Buongiorno

    1991-01-01

    In many applications of Monte Carlo simulation in forestry or forest products, it may be known that some variables are correlated. However, for simplicity, in most simulations it has been assumed that random variables are independently distributed. This report describes an alternative Monte Carlo simulation technique for subjectively assessed multivariate normal...
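
    A minimal sketch of the core of such a simulation: generating correlated variates from a subjectively assessed multivariate normal via the Cholesky factor of its covariance matrix; the means and covariance below are illustrative assumptions.

      import numpy as np

      rng = np.random.default_rng(13)
      mean = np.array([100.0, 50.0])               # e.g. price and cost levels (assumed)
      cov = np.array([[25.0, 12.0],
                      [12.0, 16.0]])               # subjectively assessed covariance

      L = np.linalg.cholesky(cov)
      z = rng.normal(size=(10_000, 2))
      draws = mean + z @ L.T                       # correlated normal variates

      print("sample corr:", np.corrcoef(draws.T)[0, 1].round(3),
            " target:", round(12.0 / np.sqrt(25.0 * 16.0), 3))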

  3. Usual Dietary Intakes: SAS Macros for Fitting Multivariate Measurement Error Models & Estimating Multivariate Usual Intake Distributions

    Cancer.gov

    The following SAS macros can be used to create a multivariate usual intake distribution for multiple dietary components that are consumed nearly every day or episodically. A SAS macro for performing balanced repeated replication (BRR) variance estimation is also included.

  4. A Compendium of Wind Statistics and Models for the NASA Space Shuttle and Other Aerospace Vehicle Programs

    NASA Technical Reports Server (NTRS)

    Smith, O. E.; Adelfang, S. I.

    1998-01-01

    The wind profile with all of its variations with respect to altitude has been, is now, and will continue to be important for aerospace vehicle design and operations. Wind profile databases and models are used for the vehicle ascent flight design for structural wind loading, flight control systems, performance analysis, and launch operations. This report presents the evolution of wind statistics and wind models from the empirical scalar wind profile model established for the Saturn Program through the development of the vector wind profile model used for the Space Shuttle design to the variations of this wind modeling concept for the X-33 program. Because wind is a vector quantity, the vector wind models use the rigorous mathematical probability properties of the multivariate normal probability distribution. When the vehicle ascent steering commands (ascent guidance) are wind biased to the wind profile measured on the day-of-launch, ascent structural wind loads are reduced and launch probability is increased. This wind load alleviation technique is recommended in the initial phase of vehicle development. The vehicle must fly through the largest load allowable versus altitude to achieve its mission. The Gumbel extreme value probability distribution is used to obtain the probability of exceeding (or not exceeding) the load allowable. The time conditional probability function is derived from the Gumbel bivariate extreme value distribution. This time conditional function is used for calculation of wind loads persistence increments using 3.5-hour Jimsphere wind pairs. These increments are used to protect the commit-to-launch decision. Other topics presented include the Shuttle load response to smoothed wind profiles, a new gust model, and advancements in wind profile measuring systems. From the lessons learned and knowledge gained from past vehicle programs, the development of future launch vehicles can be accelerated. However, new vehicle programs by their very nature will require specialized support for new databases and analyses for wind, atmospheric parameters (pressure, temperature, and density versus altitude), and weather. It is for this reason that project managers are encouraged to collaborate with natural environment specialists early in the conceptual design phase. Such action will give the lead time necessary to meet the natural environment design and operational requirements, and thus, reduce development costs.

  5. Back to Normal! Gaussianizing posterior distributions for cosmological probes

    NASA Astrophysics Data System (ADS)

    Schuhmann, Robert L.; Joachimi, Benjamin; Peiris, Hiranya V.

    2014-05-01

    We present a method to map multivariate non-Gaussian posterior probability densities into Gaussian ones via nonlinear Box-Cox transformations, and generalizations thereof. This is analogous to the search for normal parameters in the CMB, but can in principle be applied to any probability density that is continuous and unimodal. The search for the optimally Gaussianizing transformation amongst the Box-Cox family is performed via a maximum likelihood formalism. We can judge the quality of the found transformation a posteriori: qualitatively via statistical tests of Gaussianity, and more illustratively by how well it reproduces the credible regions. The method permits an analytical reconstruction of the posterior from a sample, e.g. a Markov chain, and simplifies the subsequent joint analysis with other experiments. Furthermore, it permits the characterization of a non-Gaussian posterior in a compact and efficient way. The expression for the non-Gaussian posterior can be employed to find analytic formulae for the Bayesian evidence, and consequently be used for model comparison.
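
    A minimal sketch of the one-dimensional ingredient: SciPy's maximum-likelihood Box-Cox transform pulling a skewed sample toward normality, judged by a normality test; the chi-square 'posterior' sample is an illustrative stand-in for an MCMC chain.

      import numpy as np
      from scipy.stats import boxcox, normaltest

      rng = np.random.default_rng(14)
      chain = rng.chisquare(df=3, size=5000)       # skewed, positive "posterior" sample

      transformed, lam = boxcox(chain)             # maximum-likelihood lambda
      print("lambda:", round(lam, 3))
      print("normality p-value before:", normaltest(chain).pvalue)
      print("normality p-value after :", normaltest(transformed).pvalue)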

  6. Prediction of Malaysian monthly GDP

    NASA Astrophysics Data System (ADS)

    Hin, Pooi Ah; Ching, Soo Huei; Yeing, Pan Wei

    2015-12-01

    The paper attempts to use a method based on the multivariate power-normal distribution to predict the Malaysian Gross Domestic Product (GDP) for the next month. Letting r(t) be the vector consisting of the month-t values of m selected macroeconomic variables and GDP, we model the month-(t+1) GDP as dependent on the present and l-1 past values r(t), r(t-1), ..., r(t-l+1) via a conditional distribution which is derived from a [(m+1)l+1]-dimensional power-normal distribution. The 100(α/2)% and 100(1-α/2)% points of the conditional distribution may be used to form an out-of-sample prediction interval. This interval, together with the mean of the conditional distribution, may be used to predict the month-(t+1) GDP. The mean absolute percentage error (MAPE), estimated coverage probability and average length of the prediction interval are used as the criteria for selecting the suitable lag value l-1 and the subset from a pool of 17 macroeconomic variables. It is found that the relatively better models are those with 2 ≤ l ≤ 3, involving one or two of the macroeconomic variables given by Market Indicative Yield, Oil Prices, Exchange Rate and Import Trade.

  7. Statistical analysis of multivariate atmospheric variables. [cloud cover

    NASA Technical Reports Server (NTRS)

    Tubbs, J. D.

    1979-01-01

    Topics covered include: (1) estimation in discrete multivariate distributions; (2) a procedure to predict cloud cover frequencies in the bivariate case; (3) a program to compute conditional bivariate normal parameters; (4) the transformation of nonnormal multivariate data to near-normality; (5) a test of fit for the extreme value distribution based upon the generalized minimum chi-square; (6) a test of fit for continuous distributions based upon the generalized minimum chi-square; (7) the effect of correlated observations on confidence sets based upon chi-square statistics; and (8) the generation of random variates from specified distributions.

  8. Multivariate meta-analysis of individual participant data helped externally validate the performance and implementation of a prediction model.

    PubMed

    Snell, Kym I E; Hua, Harry; Debray, Thomas P A; Ensor, Joie; Look, Maxime P; Moons, Karel G M; Riley, Richard D

    2016-01-01

    Our aim was to improve meta-analysis methods for summarizing a prediction model's performance when individual participant data are available from multiple studies for external validation. We suggest multivariate meta-analysis for jointly synthesizing calibration and discrimination performance, while accounting for their correlation. The approach estimates a prediction model's average performance, the heterogeneity in performance across populations, and the probability of "good" performance in new populations. This allows different implementation strategies (e.g., recalibration) to be compared. Application is made to a diagnostic model for deep vein thrombosis (DVT) and a prognostic model for breast cancer mortality. In both examples, multivariate meta-analysis reveals that calibration performance is excellent on average but highly heterogeneous across populations unless the model's intercept (baseline hazard) is recalibrated. For the cancer model, the probability of "good" performance (defined by C statistic ≥0.7 and calibration slope between 0.9 and 1.1) in a new population was 0.67 with recalibration but 0.22 without recalibration. For the DVT model, even with recalibration, there was only a 0.03 probability of "good" performance. Multivariate meta-analysis can be used to externally validate a prediction model's calibration and discrimination performance across multiple populations and to evaluate different implementation strategies. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.

  9. Detangling complex relationships in forensic data: principles and use of causal networks and their application to clinical forensic science.

    PubMed

    Lefèvre, Thomas; Lepresle, Aude; Chariot, Patrick

    2015-09-01

    The search for complex, nonlinear relationships and causality in data is hindered by the limited availability of suitable techniques in many domains, including forensic science. Linear multivariable techniques are useful but present some shortcomings. In the past decade, Bayesian approaches have been introduced in forensic science. To date, authors have mainly focused on providing an alternative to classical techniques for quantifying effects and dealing with uncertainty. Causal networks, including Bayesian networks, can help detangle complex relationships in data. A Bayesian network estimates the joint probability distribution of the data and graphically displays the dependencies between variables and the circulation of information between them. In this study, we illustrate the value of utilizing Bayesian networks for dealing with complex data through an application in clinical forensic science. Evaluating the functional impairment of assault survivors is a complex task for which few determinants are known. As routinely estimated in France, the duration of this impairment can be quantified by days of 'Total Incapacity to Work' ('Incapacité totale de travail,' ITT). In this study, we used a Bayesian network approach to identify the injury type, victim category and time to evaluation as the main determinants of the 'Total Incapacity to Work' (TIW). We computed the conditional probabilities associated with the TIW node and its parents. We compared this approach with a multivariable analysis, and the results of the two techniques converged. Thus, Bayesian networks should be considered a reliable means to detangle complex relationships in data.

  10. On the Theory of Multivariate Elliptically Contoured Distributions and Their Applications.

    DTIC Science & Technology

    1982-05-01

    The theory of elliptically contoured distributions has been studied by several authors: Schoenberg (1938), Kelker (1970), Devlin, Gnanadesikan and Kettenring (1976)...theory of elliptically contoured distributions, J. Multivariate Analysis, 11, 368-385. Devlin, S. J., Gnanadesikan, R., and Kettenring, J. R. (1976

  11. Stochastic static fault slip inversion from geodetic data with non-negativity and bounds constraints

    NASA Astrophysics Data System (ADS)

    Nocquet, J.-M.

    2018-04-01

    Although the surface displacements observed by geodesy are linear combinations of slip at faults in an elastic medium, determining the spatial distribution of fault slip remains an ill-posed inverse problem. A widely used approach to circumvent the ill-posedness of the inversion is to add regularization constraints in terms of smoothing and/or damping so that the linear system becomes invertible. However, the choice of regularization parameters is often arbitrary, and sometimes leads to significantly different results. Furthermore, the resolution analysis is usually empirical and cannot be made independently of the regularization. The stochastic approach to inverse problems (Tarantola & Valette 1982; Tarantola 2005) provides a rigorous framework in which the a priori information about the searched parameters is combined with the observations in order to derive posterior probabilities of the unknown parameters. Here, I investigate an approach where the prior probability density function (pdf) is a multivariate Gaussian function, with single truncation to impose positivity of slip, or double truncation to impose positivity and upper bounds on slip for interseismic modeling. I show that the joint posterior pdf is similar to the linear untruncated Gaussian case and can be expressed as a Truncated Multi-Variate Normal (TMVN) distribution. The TMVN form can then be used to obtain semi-analytical formulas for the single, two-dimensional or n-dimensional marginal pdfs. The semi-analytical formula involves the product of a Gaussian with an integral term that can be evaluated using recent developments in TMVN probability calculations (e.g. Genz & Bretz 2009). Posterior mean and covariance can also be efficiently derived. I show that the Maximum A Posteriori (MAP) estimate can be obtained using a Non-Negative Least-Squares algorithm (Lawson & Hanson 1974) for the single truncated case, or using the Bounded-Variable Least-Squares algorithm (Stark & Parker 1995) for the double truncated case. I show that the case of independent uniform priors can be approximated using TMVN. The numerical equivalence to Bayesian inversions using Markov chain Monte Carlo (MCMC) sampling is shown for a synthetic example and a real case of interseismic modeling in Central Peru. The TMVN method overcomes several limitations of the Bayesian approach using MCMC sampling. First, the need for computing power is greatly reduced. Second, unlike the MCMC-based Bayesian approach, the marginal pdfs, mean, variance and covariance are obtained independently of one another. Third, the probability density and cumulative distribution functions can be obtained with any density of points. Finally, determining the MAP estimate is extremely fast.
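
    A minimal Python sketch of the MAP computation described above, under simplifying assumptions (unit data covariance, a toy random forward operator G in place of elastic Green's functions): the Gaussian prior is stacked with the data misfit, so that the single-truncated MAP is a non-negative least-squares solution and the double-truncated MAP a bounded-variable least-squares solution.

        import numpy as np
        from scipy.optimize import nnls, lsq_linear

        rng = np.random.default_rng(1)

        # Toy linear forward problem d = G m + noise with non-negative slip m.
        G = rng.normal(size=(40, 10))
        m_true = np.clip(rng.normal(1.0, 0.5, size=10), 0.0, None)
        d = G @ m_true + 0.05 * rng.normal(size=40)

        # Stack the data misfit and a Gaussian prior N(m0, sigma_m^2 I); the
        # truncated-Gaussian MAP is the NNLS solution of the stacked system.
        m0, sigma_m = np.full(10, 0.8), 0.5
        A = np.vstack([G, np.eye(10) / sigma_m])
        b = np.concatenate([d, m0 / sigma_m])
        m_map, _ = nnls(A, b)

        # Double truncation (0 <= m <= m_max) via bounded-variable least squares.
        m_map_bounded = lsq_linear(A, b, bounds=(0.0, 1.5)).x
        print(np.round(m_map, 2), np.round(m_map_bounded, 2), sep="\n")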

  12. A multivariate geostatistical methodology to delineate areas of potential interest for future sedimentary gold exploration

    PubMed Central

    Goovaerts, P.; Albuquerque, Teresa; Antunes, Margarida

    2015-01-01

    This paper describes a multivariate geostatistical methodology to delineate areas of potential interest for future sedimentary gold exploration, with an application to an abandoned sedimentary gold mining region in Portugal. The main challenge was the existence of only a dozen gold measurements confined to the grounds of the old gold mines, which precluded the application of traditional interpolation techniques, such as cokriging. The analysis could, however, capitalize on 376 stream sediment samples that were analyzed for twenty-two elements. Gold (Au) was first predicted at all 376 locations using linear regression (R2 = 0.798) on four metals (Fe, As, Sn and W), which are known to be mostly associated with the local gold paragenesis. One hundred realizations of the spatial distribution of gold content were generated using sequential indicator simulation and a soft indicator coding of regression estimates, to supplement the hard indicator coding of gold measurements. Each simulated map then underwent a local cluster analysis to identify significant aggregates of low or high values. The one hundred classified maps were processed to derive the most likely classification of each simulated node and the associated probability of occurrence. Examining the distribution of the hot spots and cold spots reveals a clear enrichment in Au along the Erges River downstream from the old sedimentary mineralization. PMID:27777638
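
    The regression step described above can be sketched as follows in Python; the synthetic element concentrations below merely stand in for the 376 stream-sediment samples, so the coefficients and R2 are illustrative.

        import numpy as np

        rng = np.random.default_rng(2)

        # Synthetic stand-in for the stream-sediment data: Au predicted from
        # four pathfinder elements (Fe, As, Sn, W), as in the regression step.
        X = rng.lognormal(size=(376, 4))
        au = X @ np.array([0.2, 0.5, 0.1, 0.3]) + rng.normal(0, 0.1, 376)

        # Ordinary least squares with an intercept column.
        A = np.column_stack([np.ones(len(X)), X])
        coef, *_ = np.linalg.lstsq(A, au, rcond=None)
        pred = A @ coef
        r2 = 1 - np.sum((au - pred) ** 2) / np.sum((au - au.mean()) ** 2)
        print(f"R^2 = {r2:.3f}")   # the paper reports R2 = 0.798 on its data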

  13. Sensor failure and multivariable control for airbreathing propulsion systems. Ph.D. Thesis - Dec. 1979 Final Report

    NASA Technical Reports Server (NTRS)

    Behbehani, K.

    1980-01-01

    A new sensor/actuator failure analysis technique for turbofan jet engines was developed. Three phases of failure analysis, namely detection, isolation, and accommodation, are considered. Failure detection and isolation techniques are developed by utilizing the concept of Generalized Likelihood Ratio (GLR) tests. These techniques are applicable to both time-varying and time-invariant systems. Three GLR detectors are developed for: (1) hard-over sensor failure; (2) hard-over actuator failure; and (3) brief disturbances in the actuators. The probability distribution of the GLR detectors and the detectability of sensor/actuator failures are established. Failure type is determined by the maximum of the GLR detectors. Failure accommodation is accomplished by extending the Multivariable Nyquist Array (MNA) control design techniques to nonsquare system designs. The performance and effectiveness of the failure analysis technique are studied by applying the technique to a turbofan jet engine, namely the Quiet Clean Short Haul Experimental Engine (QCSEE). Single and multiple sensor/actuator failures in the QCSEE are simulated and analyzed, and the effects of model degradation are studied.
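
    A sketch of the simplest GLR detector in the family described above, for a hard-over (bias) failure appearing as a mean jump in Gaussian innovations; the window length, threshold, and noise level are illustrative choices, not values from the thesis.

        import numpy as np
        from scipy.stats import chi2

        rng = np.random.default_rng(3)

        # Innovations from a sensor model: zero-mean Gaussian when healthy,
        # biased after a hard-over failure at t = 60 (synthetic example).
        sigma = 1.0
        r = rng.normal(0, sigma, 100)
        r[60:] += 2.5

        # GLR for a mean jump in a sliding window; under H0 the statistic
        # (twice the log likelihood ratio) is chi-square with 1 dof.
        N = 20
        threshold = chi2.ppf(0.999, df=1)
        for t in range(N, len(r)):
            window = r[t - N:t]
            glr = window.sum() ** 2 / (N * sigma ** 2)
            if glr > threshold:
                print(f"failure declared at t = {t} (GLR = {glr:.1f})")
                break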

  14. Buried landmine detection using multivariate normal clustering

    NASA Astrophysics Data System (ADS)

    Duston, Brian M.

    2001-10-01

    A Bayesian classification algorithm is presented for discriminating buried land mines from buried and surface clutter in Ground Penetrating Radar (GPR) signals. This algorithm is based on multivariate normal (MVN) clustering, where feature vectors are used to identify populations (clusters) of mines and clutter objects. The features are extracted from two-dimensional images created from ground penetrating radar scans. MVN clustering is used to determine the number of clusters in the data and to create probability density models for target and clutter populations, producing the MVN clustering classifier (MVNCC). The Bayesian Information Criterion (BIC) is used to evaluate each model to determine the number of clusters in the data. An extension of the MVNCC allows the model to adapt to local clutter distributions by treating each of the MVN cluster components as a Poisson process and adaptively estimating the intensity parameters. The algorithm is developed using data collected by the Mine Hunter/Killer Close-In Detector (MH/K CID) at prepared mine lanes. The Mine Hunter/Killer is a prototype mine detecting and neutralizing vehicle developed for the U.S. Army to clear roads of anti-tank mines.
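
    The model-selection step (fit MVN mixtures of increasing order and keep the one with the lowest BIC) can be sketched with scikit-learn; the two synthetic clusters below stand in for extracted GPR features.

        import numpy as np
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(4)

        # Synthetic 2-D feature vectors: a "mine" cluster and a "clutter" cluster.
        X = np.vstack([rng.normal([0, 0], 0.5, size=(200, 2)),
                       rng.normal([3, 2], 0.8, size=(300, 2))])

        # Fit MVN mixtures with 1..5 components; choose the order by BIC.
        models = [GaussianMixture(n_components=k, random_state=0).fit(X)
                  for k in range(1, 6)]
        bics = [m.bic(X) for m in models]
        best = models[int(np.argmin(bics))]
        print("BIC-selected number of clusters:", best.n_components)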

  15. Univariate and multivariate skewness and kurtosis for measuring nonnormality: Prevalence, influence and estimation.

    PubMed

    Cain, Meghan K; Zhang, Zhiyong; Yuan, Ke-Hai

    2017-10-01

    Nonnormality of univariate data has been extensively examined previously (Blanca et al., Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 9(2), 78-84, 2013; Micceri, Psychological Bulletin, 105(1), 156, 1989). However, less is known about the potential nonnormality of multivariate data, although multivariate analysis is commonly used in psychological and educational research. Using univariate and multivariate skewness and kurtosis as measures of nonnormality, this study examined 1,567 univariate distributions and 254 multivariate distributions collected from authors of articles published in Psychological Science and the American Education Research Journal. We found that 74% of univariate distributions and 68% of multivariate distributions deviated from normal distributions. In a simulation study using typical values of skewness and kurtosis that we collected, we found that the resulting Type I error rates were 17% in a t-test and 30% in a factor analysis under some conditions. Hence, we argue that it is time to routinely report skewness and kurtosis along with other summary statistics such as means and variances. To facilitate the future reporting of skewness and kurtosis, we provide a tutorial on how to compute univariate and multivariate skewness and kurtosis using SAS, SPSS, R and a newly developed Web application.
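
    For reference, Mardia's multivariate skewness and kurtosis, the standard measures behind studies of this kind, can be computed directly in Python (a minimal implementation, without significance tests):

        import numpy as np

        def mardia(X):
            """Mardia's multivariate skewness b1p and kurtosis b2p."""
            n, p = X.shape
            Xc = X - X.mean(axis=0)
            S_inv = np.linalg.inv(np.cov(X, rowvar=False, bias=True))
            D = Xc @ S_inv @ Xc.T          # n x n Mahalanobis inner products
            b1p = (D ** 3).sum() / n ** 2
            b2p = (np.diag(D) ** 2).mean() # equals p(p+2) in expectation under normality
            return b1p, b2p

        rng = np.random.default_rng(5)
        X = rng.normal(size=(500, 3))
        print(mardia(X))                   # b2p should be near 3*(3+2) = 15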

  16. Multivariate Bayesian modeling of known and unknown causes of events--an application to biosurveillance.

    PubMed

    Shen, Yanna; Cooper, Gregory F

    2012-09-01

    This paper investigates Bayesian modeling of known and unknown causes of events in the context of disease-outbreak detection. We introduce a multivariate Bayesian approach that models multiple evidential features of every person in the population. This approach models and detects (1) known diseases (e.g., influenza and anthrax) by using informative prior probabilities and (2) unknown diseases (e.g., a new, highly contagious respiratory virus that has never been seen before) by using relatively non-informative prior probabilities. We report the results of simulation experiments which support that this modeling method can improve the detection of new disease outbreaks in a population. A contribution of this paper is that it introduces a multivariate Bayesian approach for jointly modeling both known and unknown causes of events. Such modeling has general applicability in domains where the space of known causes is incomplete.

  17. Multinomial Logistic Regression Predicted Probability Map To Visualize The Influence Of Socio-Economic Factors On Breast Cancer Occurrence in Southern Karnataka

    NASA Astrophysics Data System (ADS)

    Madhu, B.; Ashok, N. C.; Balasubramanian, S.

    2014-11-01

    Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using breast cancer occurrence data from 2007-2011. Independent socio-economic variables related to breast cancer occurrence, such as age, education, occupation, parity, type of family, health insurance coverage, residential locality and socioeconomic status, were obtained for each case. The models were developed as follows: (i) spatial visualization of the urban-rural distribution of breast cancer cases obtained from the Bharat Hospital and Institute of Oncology; (ii) socio-economic risk factors describing the breast cancer occurrences were compiled for each case; these data were then analysed using multinomial logistic regression analysis in the SPSS statistical software, relations between the occurrence of breast cancer across socio-economic status and the influence of other socio-economic variables were evaluated, and multinomial logistic regression models were constructed; (iii) the model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model was entered into a geographic information system, and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.
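
    The modeling core (multinomial logistic regression followed by per-class predicted probabilities that can be mapped in a GIS) can be sketched with scikit-learn rather than SPSS; the covariates and outcome below are synthetic placeholders.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(6)

        # Hypothetical stand-ins for the socio-economic covariates (age,
        # parity, insurance, ...) and a 3-level outcome; all values synthetic.
        X = rng.normal(size=(500, 4))
        y = rng.integers(0, 3, size=500)

        # Multinomial logistic regression (the lbfgs solver fits the
        # multinomial model by default); predict_proba gives the per-class
        # probability surface that would be mapped in a GIS.
        clf = LogisticRegression(max_iter=1000).fit(X, y)
        print(clf.predict_proba(X[:3]).round(3))   # rows sum to 1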

  18. A multivariate copula-based framework for dealing with hazard scenarios and failure probabilities

    NASA Astrophysics Data System (ADS)

    Salvadori, G.; Durante, F.; De Michele, C.; Bernardi, M.; Petrella, L.

    2016-05-01

    This paper is of a methodological nature, and deals with the foundations of Risk Assessment. Several international guidelines have recently recommended selecting appropriate/relevant Hazard Scenarios in order to tame the consequences of (extreme) natural phenomena. In particular, the scenarios should be multivariate, i.e., they should take into account the fact that several variables, generally not independent, may be of interest. In this work, it is shown how a Hazard Scenario can be identified in terms of (i) a specific geometry and (ii) a suitable probability level. Several scenarios, as well as a Structural approach, are presented, and due comparisons are carried out. In addition, it is shown how the Hazard Scenario approach illustrated here is well suited to cope with the notion of Failure Probability, a tool traditionally used for design and risk assessment in engineering practice. All the results outlined throughout the work are based on Copula Theory, which turns out to be a fundamental theoretical apparatus for multivariate risk assessment: formulas for the calculation of the probability of Hazard Scenarios in the general multidimensional case (d≥2) are derived, and useful analytical relationships among the probabilities of occurrence of Hazard Scenarios are presented. In addition, the Extreme Value and Archimedean special cases are dealt with, relationships between dependence ordering and scenario levels are studied, and a counter-example concerning Tail Dependence is shown. Suitable indications for the practical application of the techniques outlined in the work are given, and two case studies illustrate the procedures discussed in the paper.
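
    As a concrete instance of the copula formulas referred to above, the probabilities of the "OR" and "AND" Hazard Scenarios for two variables follow directly from the copula value at the design levels. The Gumbel copula below belongs to both the Extreme Value and Archimedean families treated in the paper; the parameter values are illustrative.

        import numpy as np

        def gumbel_copula(u, v, theta):
            """Gumbel copula C(u, v) with dependence parameter theta >= 1."""
            return np.exp(-(((-np.log(u)) ** theta
                             + (-np.log(v)) ** theta) ** (1 / theta)))

        # Marginal non-exceedance probabilities at the design levels (illustrative).
        u, v, theta = 0.99, 0.98, 2.0
        C = gumbel_copula(u, v, theta)

        p_or = 1 - C               # "OR": at least one variable exceeds its level
        p_and = 1 - u - v + C      # "AND": both variables exceed their levels
        print(f"P(OR) = {p_or:.4f}, P(AND) = {p_and:.4f}")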

  19. A Multivariate Statistical Model Describing the Compound Nature of Soil Moisture Drought

    NASA Astrophysics Data System (ADS)

    Manning, Colin; Widmann, Martin; Bevacqua, Emanuele; Maraun, Douglas; Van Loon, Anne; Vrac, Mathieu

    2017-04-01

    Soil moisture in Europe acts to partition incoming energy into sensible and latent heat fluxes, thereby exerting a large influence on temperature variability. Soil moisture is predominantly controlled by precipitation and evapotranspiration. When these meteorological variables are accumulated over different timescales, their joint multivariate distribution and dependence structure can be used to provide information on soil moisture. We therefore consider soil moisture drought as a compound event of meteorological drought (deficits of precipitation) and heat waves, or more specifically, periods of high Potential Evapotranspiration (PET). We present here a statistical model of soil moisture based on Pair Copula Constructions (PCCs) that can describe the dependence among soil moisture and its contributing meteorological variables. The model is designed in such a way that it can account for concurrences of meteorological drought and heat waves and describe the dependence between these conditions at a local level. The model is composed of four variables: daily soil moisture (h); short-term and long-term accumulated precipitation variables (Y1 and Y2), which account for the propagation of meteorological drought to soil moisture drought; and accumulated PET (Y3), calculated using the Penman-Monteith equation, which can represent the effect of a heat wave on soil conditions. Copulas are multivariate distribution functions that allow one to model the dependence structure of given variables separately from their marginal behaviour. PCCs then allow, in theory, for the formulation of a multivariate distribution of any dimension, where the multivariate distribution is decomposed into a product of marginal probability density functions and two-dimensional copulas, some of which are conditional. We apply the PCC here in such a way that it provides estimates of h, and their uncertainty, through the conditional density f(h | y1, y2, y3). Applying the model to various Fluxnet sites across Europe, we find the model has good skill and can particularly capture periods of low soil moisture well. We illustrate the relevance of the dependence structure of the Y variables to soil moisture and show how the model may be generalised to provide information on soil moisture at a widespread scale where few observations of soil moisture exist. We then present results from a validation study of a selection of EURO-CORDEX climate models, in which we demonstrate the skill of these models in representing these dependencies and so offer insight into the skill of their representation of soil moisture.

  20. Using Copula Distributions to Support More Accurate Imaging-Based Diagnostic Classifiers for Neuropsychiatric Disorders

    PubMed Central

    Bansal, Ravi; Hao, Xuejun; Liu, Jun; Peterson, Bradley S.

    2014-01-01

    Many investigators have tried to apply machine learning techniques to magnetic resonance images (MRIs) of the brain in order to diagnose neuropsychiatric disorders. Usually the number of brain imaging measures (such as measures of cortical thickness and measures of local surface morphology) derived from the MRIs (i.e., their dimensionality) has been large (e.g. >10) relative to the number of participants who provide the MRI data (<100). Sparse data in a high dimensional space increases the variability of the classification rules that machine learning algorithms generate, thereby limiting the validity, reproducibility, and generalizability of those classifiers. The accuracy and stability of the classifiers can improve significantly if the multivariate distributions of the imaging measures can be estimated accurately. To accurately estimate the multivariate distributions using sparse data, we propose to estimate first the univariate distributions of imaging data and then combine them using a Copula to generate more accurate estimates of their multivariate distributions. We then sample the estimated Copula distributions to generate dense sets of imaging measures and use those measures to train classifiers. We hypothesize that the dense sets of brain imaging measures will generate classifiers that are stable to variations in brain imaging measures, thereby improving the reproducibility, validity, and generalizability of diagnostic classification algorithms in imaging datasets from clinical populations. In our experiments, we used both computer-generated and real-world brain imaging datasets to assess the accuracy of multivariate Copula distributions in estimating the corresponding multivariate distributions of real-world imaging data. Our experiments showed that diagnostic classifiers generated using imaging measures sampled from the Copula were significantly more accurate and more reproducible than were the classifiers generated using either the real-world imaging measures or their multivariate Gaussian distributions. Thus, our findings demonstrate that estimated multivariate Copula distributions can generate dense sets of brain imaging measures that can in turn be used to train classifiers, and those classifiers are significantly more accurate and more reproducible than are those generated using real-world imaging measures alone. PMID:25093634
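
    A minimal sketch of the Copula-based densification idea, using a Gaussian copula (one plausible choice; the paper's specific estimator may differ): estimate the marginals empirically, estimate the dependence on the normal-scores scale, then sample densely and map back through the empirical quantiles.

        import numpy as np
        from scipy.stats import norm

        rng = np.random.default_rng(7)

        # Sparse, correlated "imaging measures" with non-Gaussian marginals
        # (synthetic stand-ins for cortical thickness / surface morphology).
        n = 80
        z = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], size=n)
        X = np.column_stack([np.exp(z[:, 0]), z[:, 1] ** 3])

        # 1. Transform each margin to normal scores (rank -> uniform -> Gaussian).
        ranks = X.argsort(axis=0).argsort(axis=0) + 1
        Z = norm.ppf(ranks / (n + 1))

        # 2. Estimate the copula correlation; sample densely from the copula.
        R = np.corrcoef(Z, rowvar=False)
        U_new = norm.cdf(rng.multivariate_normal(np.zeros(2), R, size=5000))

        # 3. Map back through the empirical marginal quantiles.
        X_dense = np.column_stack([np.quantile(X[:, j], U_new[:, j])
                                   for j in range(2)])
        print(X_dense.shape)   # dense synthetic measures for classifier training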

  1. A preliminary analysis of quantifying computer security vulnerability data in "the wild"

    NASA Astrophysics Data System (ADS)

    Farris, Katheryn A.; McNamara, Sean R.; Goldstein, Adam; Cybenko, George

    2016-05-01

    A system of computers, networks and software has some level of vulnerability exposure that puts it at risk to criminal hackers. Presently, most vulnerability research uses data from software vendors, and the National Vulnerability Database (NVD). We propose an alternative path forward through grounding our analysis in data from the operational information security community, i.e. vulnerability data from "the wild". In this paper, we propose a vulnerability data parsing algorithm and an in-depth univariate and multivariate analysis of the vulnerability arrival and deletion process (also referred to as the vulnerability birth-death process). We find that vulnerability arrivals are best characterized by the log-normal distribution and vulnerability deletions are best characterized by the exponential distribution. These distributions can serve as prior probabilities for future Bayesian analysis. We also find that over 22% of the deleted vulnerability data have a rate of zero, and that the arrival vulnerability data is always greater than zero. Finally, we quantify and visualize the dependencies between vulnerability arrivals and deletions through a bivariate scatterplot and statistical observations.
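
    The distributional claims above translate into a short fitting exercise in Python (synthetic counts in place of the operational feed data):

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(8)

        # Synthetic daily counts standing in for vulnerability arrivals/deletions.
        arrivals = rng.lognormal(mean=2.0, sigma=0.6, size=365)
        deletions = rng.exponential(scale=5.0, size=365)

        # Maximum-likelihood fits of the two candidate distributions.
        shape, loc, scale = stats.lognorm.fit(arrivals, floc=0)
        loc_e, scale_e = stats.expon.fit(deletions, floc=0)

        # Goodness of fit via Kolmogorov-Smirnov.
        print(stats.kstest(arrivals, "lognorm", args=(shape, loc, scale)))
        print(stats.kstest(deletions, "expon", args=(loc_e, scale_e)))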

  2. Monograph on the use of the multivariate Gram Charlier series Type A

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hatayodom, T.; Heydt, G.

    1978-01-01

    The Gram-Charlier series is an infinite series expansion for a probability density function (pdf) in which the terms of the series are Hermite polynomials. There are several Gram-Charlier series; the best known is Type A. The Gram-Charlier series, Type A (GCA), exists for both univariate and multivariate random variables. This monograph introduces the multivariate GCA and illustrates its use through several examples. A brief bibliography and discussion of Hermite polynomials is also included. 9 figures, 2 tables.
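
    For the univariate case, the Type A series truncated after the skewness and kurtosis corrections can be written down directly; a minimal Python sketch using probabilists' Hermite polynomials (the multivariate series in the monograph generalizes these to Hermite tensors):

        import numpy as np
        from numpy.polynomial.hermite_e import hermeval

        def gram_charlier_a(x, skew, ex_kurt):
            """Univariate Gram-Charlier Type A density, first two corrections."""
            phi = np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)
            he3 = hermeval(x, [0, 0, 0, 1])       # He3(x) = x^3 - 3x
            he4 = hermeval(x, [0, 0, 0, 0, 1])    # He4(x) = x^4 - 6x^2 + 3
            return phi * (1 + skew / 6 * he3 + ex_kurt / 24 * he4)

        x = np.linspace(-4, 4, 9)
        print(np.round(gram_charlier_a(x, skew=0.5, ex_kurt=0.8), 4))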

  3. Non-Gaussian probabilistic MEG source localisation based on kernel density estimation

    PubMed Central

    Mohseni, Hamid R.; Kringelbach, Morten L.; Woolrich, Mark W.; Baker, Adam; Aziz, Tipu Z.; Probert-Smith, Penny

    2014-01-01

    There is strong evidence to suggest that data recorded from magnetoencephalography (MEG) follows a non-Gaussian distribution. However, existing standard methods for source localisation model the data using only second order statistics, and therefore use the inherent assumption of a Gaussian distribution. In this paper, we present a new general method for non-Gaussian source estimation of stationary signals for localising brain activity from MEG data. By providing a Bayesian formulation for MEG source localisation, we show that the source probability density function (pdf), which is not necessarily Gaussian, can be estimated using multivariate kernel density estimators. In the case of Gaussian data, the solution of the method is equivalent to that of widely used linearly constrained minimum variance (LCMV) beamformer. The method is also extended to handle data with highly correlated sources using the marginal distribution of the estimated joint distribution, which, in the case of Gaussian measurements, corresponds to the null-beamformer. The proposed non-Gaussian source localisation approach is shown to give better spatial estimates than the LCMV beamformer, both in simulations incorporating non-Gaussian signals, and in real MEG measurements of auditory and visual evoked responses, where the highly correlated sources are known to be difficult to estimate. PMID:24055702
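
    The central ingredient, estimating a possibly non-Gaussian pdf with a Gaussian-kernel density estimator, is nearly a one-liner with scipy; the bimodal samples below are synthetic stand-ins for source amplitudes.

        import numpy as np
        from scipy.stats import gaussian_kde

        rng = np.random.default_rng(9)

        # Non-Gaussian (bimodal) samples standing in for MEG source data.
        samples = np.concatenate([rng.normal(-2, 0.5, 400),
                                  rng.normal(2, 0.5, 600)])

        # Kernel density estimate; gaussian_kde also accepts d-dimensional
        # data as a (d, n) array for the multivariate case.
        kde = gaussian_kde(samples)
        grid = np.linspace(-4, 4, 9)
        print(np.round(kde(grid), 3))   # estimated pdf, no Gaussian assumption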

  4. A Nonlinear Framework of Delayed Particle Smoothing Method for Vehicle Localization under Non-Gaussian Environment

    PubMed Central

    Xiao, Zhu; Havyarimana, Vincent; Li, Tong; Wang, Dong

    2016-01-01

    In this paper, a novel nonlinear smoothing framework, the non-Gaussian delayed particle smoother (nGDPS), is proposed, which enables vehicle state estimation (VSE) with high accuracy while taking into account the non-Gaussianity of the measurement and process noises. Within the proposed method, the multivariate Student’s t-distribution is adopted to compute the probability distribution function (PDF) of the process and measurement noises, which are assumed to be non-Gaussian. A computation approach based on the Ensemble Kalman Filter (EnKF) is designed to cope with the mean and the covariance matrix of the proposal non-Gaussian distribution. A delayed Gibbs sampling algorithm, which incorporates smoothing of the sampled trajectories over a fixed delay, is proposed to deal with the sample degeneracy of particles. The performance is investigated on real-world data collected by low-cost on-board vehicle sensors. The comparison study based on the real-world experiments and the statistical analysis demonstrates that the proposed nGDPS yields a significant improvement in vehicle state accuracy and outperforms the existing filtering and smoothing methods. PMID:27187405
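
    Sampling the multivariate Student's t-distribution used for the noise models reduces to a Gaussian draw rescaled by a chi-square variable; a minimal sketch:

        import numpy as np

        def rmvt(mean, scale, df, size, rng):
            """Multivariate Student's t draws: X = mean + Z / sqrt(g / df),
            with Z ~ N(0, scale) and g ~ chi-square(df)."""
            z = rng.multivariate_normal(np.zeros(len(mean)), scale, size=size)
            g = rng.chisquare(df, size=size)
            return mean + z / np.sqrt(g / df)[:, None]

        rng = np.random.default_rng(10)
        noise = rmvt(np.zeros(2), np.eye(2), df=3, size=5, rng=rng)
        print(noise)   # heavy-tailed process/measurement noise draws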

  5. Stochastic coalescence in finite systems: an algorithm for the numerical solution of the multivariate master equation.

    NASA Astrophysics Data System (ADS)

    Alfonso, Lester; Zamora, Jose; Cruz, Pedro

    2015-04-01

    The stochastic approach to coagulation considers the coalescence process occurring in a system of a finite number of particles enclosed in a finite volume. Within this approach, the full description of the system can be obtained from the solution of the multivariate master equation, which models the evolution of the probability distribution of the state vector for the number of particles of a given mass. Unfortunately, due to its complexity, only limited results have been obtained for certain types of kernels and monodisperse initial conditions. In this work, a novel numerical algorithm for the solution of the multivariate master equation for stochastic coalescence that works for any type of kernel and initial condition is introduced. The performance of the method was checked by comparing the numerically calculated particle mass spectrum with analytical solutions obtained for the constant and sum kernels, with excellent correspondence between the analytical and numerical solutions. In order to increase the speed of the algorithm, software parallelization techniques using the OpenMP standard were employed, along with an implementation that takes advantage of new accelerator technologies. Simulation results show an important speedup of the parallelized algorithms. This study was funded by a grant from Consejo Nacional de Ciencia y Tecnologia de Mexico SEP-CONACYT CB-131879. The authors also thank LUFAC® Computacion SA de CV for CPU time and all the support provided.

  6. HLA-matched sibling bone marrow transplantation for β-thalassemia major

    PubMed Central

    Sabloff, Mitchell; Chandy, Mammen; Wang, Zhiwei; Logan, Brent R.; Ghavamzadeh, Ardeshir; Li, Chi-Kong; Irfan, Syed Mohammad; Bredeson, Christopher N.; Cowan, Morton J.; Gale, Robert Peter; Hale, Gregory A.; Horan, John; Hongeng, Suradej; Eapen, Mary

    2011-01-01

    We describe outcomes after human leukocyte antigen-matched sibling bone marrow transplantation (BMT) for 179 patients with β-thalassemia major. The median age at transplantation was 7 years and the median follow-up was 6 years. The distribution of Pesaro risk class I, II, and III categories was 2%, 42%, and 36%, respectively. The day 30 cumulative incidence of neutrophil recovery and day 100 platelet recovery were 90% and 86%, respectively. Seventeen patients had graft failure, which was fatal in 11. Six of 9 patients with graft failure are alive after a second transplantation. The day 100 probability of acute graft-versus-host disease and the 5-year probability of chronic graft-versus-host disease were 38% and 13%, respectively. The 5-year probabilities of overall- and disease-free survival were 91% and 88%, respectively, for patients with Pesaro risk class II, and 64% and 62%, respectively, for Pesaro risk class III. In multivariate analysis, mortality risks were higher in patients 7 years of age and older and those with hepatomegaly before BMT. The leading causes of death were interstitial pneumonitis (n = 7), hemorrhage (n = 8), and veno-occlusive disease (n = 6). Proceeding to BMT in children younger than 7 years before development of end-organ damage, particularly in the liver, should improve results after BMT for β-thalassemia major. PMID:21119108

  7. Impact of prostate weight on probability of positive surgical margins in patients with low-risk prostate cancer after robotic-assisted laparoscopic radical prostatectomy.

    PubMed

    Marchetti, Pablo E; Shikanov, Sergey; Razmaria, Aria A; Zagaja, Gregory P; Shalhav, Arieh L

    2011-03-01

    To evaluate the impact of prostate weight (PW) on probability of positive surgical margin (PSM) in patients undergoing robotic-assisted radical prostatectomy (RARP) for low-risk prostate cancer. The cohort consisted of 690 men with low-risk prostate cancer (clinical stage T1c, prostate-specific antigen <10 ng/mL, biopsy Gleason score ≤6) who underwent RARP with bilateral nerve-sparing at our institution by 1 of 2 surgeons from 2003 to 2009. PW was obtained from the pathologic specimen. The association between probability of PSM and PW was assessed with univariate and multivariate logistic regression analysis. A PSM was identified in 105 patients (15.2%). Patients with PSM had significantly higher prostate-specific antigen (P = .04), smaller prostates (P = .0001), higher Gleason score (P = .004), and higher pathologic stage (P < .0001). After logistic regression, we found a significant inverse relation between PSM and PW (OR 0.97; 95% confidence interval [CI] 0.96, 0.99; P = .0003) in univariate analysis. This remained significant in the multivariate model (OR 0.98; 95% CI 0.96, 0.99; P = .006) adjusting for age, body mass index, surgeon experience, pathologic Gleason score, and pathologic stage. In this multivariate model, the predicted probabilities of PSM for 25-, 50-, 100-, and 150-g prostates were 22% (95% CI 16%, 30%), 13% (95% CI 11%, 16%), 5% (95% CI 1%, 8%), and 1% (95% CI 0%, 3%), respectively. Lower PW is independently associated with higher probability of PSM in low-risk patients undergoing RARP with bilateral nerve-sparing.

  8. Probability of detecting atrazine/desethyl-atrazine and elevated concentrations of nitrate in ground water in Colorado

    USGS Publications Warehouse

    Rupert, Michael G.

    2003-01-01

    Draft Federal regulations may require that each State develop a State Pesticide Management Plan for the herbicides atrazine, alachlor, metolachlor, and simazine. Maps were developed that the State of Colorado could use to predict the probability of detecting atrazine and desethyl-atrazine (a breakdown product of atrazine) in ground water in Colorado. These maps can be incorporated into the State Pesticide Management Plan and can help provide a sound hydrogeologic basis for atrazine management in Colorado. Maps showing the probability of detecting elevated nitrite plus nitrate as nitrogen (nitrate) concentrations in ground water in Colorado also were developed because nitrate is a contaminant of concern in many areas of Colorado. Maps showing the probability of detecting atrazine and(or) desethyl-atrazine (atrazine/DEA) at or greater than concentrations of 0.1 microgram per liter and nitrate concentrations in ground water greater than 5 milligrams per liter were developed as follows: (1) Ground-water quality data were overlaid with anthropogenic and hydrogeologic data using a geographic information system to produce a data set in which each well had corresponding data on atrazine use, fertilizer use, geology, hydrogeomorphic regions, land cover, precipitation, soils, and well construction. These data then were downloaded to a statistical software package for analysis by logistic regression. (2) Relations were observed between ground-water quality and the percentage of land-cover categories within circular regions (buffers) around wells. Several buffer sizes were evaluated; the buffer size that provided the strongest relation was selected for use in the logistic regression models. (3) Relations between concentrations of atrazine/DEA and nitrate in ground water and atrazine use, fertilizer use, geology, hydrogeomorphic regions, land cover, precipitation, soils, and well-construction data were evaluated, and several preliminary multivariate models with various combinations of independent variables were constructed. (4) The multivariate models that best predicted the presence of atrazine/DEA and elevated concentrations of nitrate in ground water were selected. (5) The accuracy of the multivariate models was confirmed by validating the models with an independent set of ground-water quality data. (6) The multivariate models were entered into a geographic information system and the probability maps were constructed.

  9. [Catastrophic health expenditures in Mexico: magnitude, distribution and determinants].

    PubMed

    Sesma-Vázquez, Sergio; Pérez-Rico, Raymundo; Sosa-Manzano, Carlos Lino; Gómez-Dantés, Octavio

    2005-01-01

    To describe the magnitude, distribution, and determinants of catastrophic health expenditures in Mexico. The information source was the National Performance Assessment Survey, and the methodology was the one developed by the World Health Organization for assessing fair financing. Households with catastrophic expenditures were defined as those with health expenditures exceeding 30% of their ability to pay. Multivariate analyses using logistic and linear regression were performed to identify the determinants of catastrophic expenditures. A total of 3.8% of households incurred catastrophic health expenditures. There were huge differences by state. Uninsured, poor, and rural households showed a higher impoverishment risk. Sixty percent of the catastrophic expenditures were attributable to outpatient care and medication. A 10% increase in insured households could result in a 9.6% decrease in catastrophic expenditures. Disability, adults 60 years of age and older, and pregnancy increased the probability of catastrophic expenditures. Insuring older adults, pregnant women, and persons with disabilities could reduce catastrophic health expenditures in Mexico.

  10. An Efficient Numerical Approach for Nonlinear Fokker-Planck equations

    NASA Astrophysics Data System (ADS)

    Otten, Dustin; Vedula, Prakash

    2009-03-01

    Fokker-Planck equations that are nonlinear with respect to their probability densities, which occur in many nonequilibrium systems relevant to mean-field interaction models, plasmas, and classical fermions and bosons, can be challenging to solve numerically. To address some underlying challenges in obtaining numerical solutions, we propose a quadrature-based moment method for efficient and accurate determination of transient (and stationary) solutions of nonlinear Fokker-Planck equations. In this approach the distribution function is represented as a collection of Dirac delta functions with corresponding quadrature weights and locations, which are in turn determined from constraints based on the evolution of generalized moments. Properties of the distribution function can be obtained by solving transport equations for the quadrature weights and locations. We will apply this computational approach to study a wide range of problems, including the Desai-Zwanzig model (for nonlinear muscular contraction) and multivariate nonlinear Fokker-Planck equations describing classical fermions and bosons, and will also demonstrate good agreement with results obtained from Monte Carlo and other standard numerical methods.

  11. Approximate Uncertainty Modeling in Risk Analysis with Vine Copulas

    PubMed Central

    Bedford, Tim; Daneshkhah, Alireza

    2015-01-01

    Many applications of risk analysis require us to jointly model multiple uncertain quantities. Bayesian networks and copulas are two common approaches to modeling joint uncertainties with probability distributions. This article focuses on new methodologies for copulas by developing the work of Cooke, Bedford, Kurowicka, and others on vines as a way of constructing higher dimensional distributions that do not suffer from some of the restrictions of alternatives such as the multivariate Gaussian copula. The article provides a fundamental approximation result, demonstrating that we can approximate any density as closely as we like using vines. It further operationalizes this result by showing how minimum information copulas can be used to provide parametric classes of copulas that have such good levels of approximation. We extend previous approaches using vines by considering nonconstant conditional dependencies, which are particularly relevant in financial risk modeling. We discuss how such models may be quantified, in terms of expert judgment or by fitting data, and illustrate the approach by modeling two financial data sets. PMID:26332240

  12. Predicting the Distribution of Vibrio spp. in the Chesapeake Bay: A Vibrio cholerae Case Study

    PubMed Central

    Magny, Guillaume Constantin de; Long, Wen; Brown, Christopher W.; Hood, Raleigh R.; Huq, Anwar; Murtugudde, Raghu; Colwell, Rita R.

    2010-01-01

    Vibrio cholerae, the causative agent of cholera, is a naturally occurring inhabitant of the Chesapeake Bay and serves as a predictor for other clinically important vibrios, including Vibrio parahaemolyticus and Vibrio vulnificus. A system was constructed to predict the likelihood of the presence of V. cholerae in surface waters of the Chesapeake Bay, with the goal to provide forecasts of the occurrence of this and related pathogenic Vibrio spp. Prediction was achieved by driving an available multivariate empirical habitat model estimating the probability of V. cholerae within a range of temperatures and salinities in the Bay, with hydrodynamically generated predictions of ambient temperature and salinity. The experimental predictions provided both an improved understanding of the in situ variability of V. cholerae, including identification of potential hotspots of occurrence, and usefulness as an early warning system. With further development of the system, prediction of the probability of the occurrence of related pathogenic vibrios in the Chesapeake Bay, notably V. parahaemolyticus and V. vulnificus, will be possible, as well as its transport to any geographical location where sufficient relevant data are available. PMID:20145974

  13. Models of multidimensional discrete distribution of probabilities of random variables in information systems

    NASA Astrophysics Data System (ADS)

    Gromov, Yu Yu; Minin, Yu V.; Ivanova, O. G.; Morozova, O. N.

    2018-03-01

    Multidimensional discrete probability distributions of independent random variables were derived. Their one-dimensional distributions are widely used in probability theory. Generating functions of these multidimensional distributions were also derived.

  14. Calculating the individual probability of successful ocriplasmin treatment in eyes with VMT syndrome: a multivariable prediction model from the EXPORT study.

    PubMed

    Paul, Christoph; Heun, Christine; Müller, Hans-Helge; Hoerauf, Hans; Feltgen, Nicolas; Wachtlin, Joachim; Kaymak, Hakan; Mennel, Stefan; Koss, Michael Janusz; Fauser, Sascha; Maier, Mathias M; Schumann, Ricarda G; Mueller, Simone; Chang, Petrus; Schmitz-Valckenberg, Steffen; Kazerounian, Sara; Szurman, Peter; Lommatzsch, Albrecht; Bertelmann, Thomas

    2017-10-31

    To evaluate predictive factors for the treatment success of ocriplasmin and to use these factors to generate a multivariate model to calculate the individual probability of successful treatment. Data were collected in a retrospective, multicentre cohort study. Patients with vitreomacular traction (VMT) syndrome without a full-thickness macular hole (FTMH) were included if they received an intravitreal injection (IVI) of ocriplasmin. Five factors (age, gender, lens status, presence of epiretinal membrane (ERM) formation and horizontal diameter of VMT) were assessed for their association with VMT resolution. A multivariable logistic regression model was employed to further analyse these factors and calculate the individual probability of successful treatment. 167 eyes of 167 patients were included. Univariate analysis revealed a significant correlation with VMT resolution for all analysed factors: age (years) (OR 0.9208; 95% CI 0.8845 to 0.9586; p<0.0001), gender (male) (OR 0.480; 95% CI 0.241 to 0.957; p=0.0371), lens status (phakic) (OR 2.042; 95% CI 1.054 to 3.958; p=0.0344), ERM formation (present) (OR 0.384; 95% CI 0.179 to 0.821; p=0.0136) and horizontal VMT diameter (µm) (OR 0.99812; 95% CI 0.99684 to 0.99941; p=0.0042). A significant multivariable logistic regression model was established with age and VMT diameter. Known predictive factors for VMT resolution after ocriplasmin IVI were confirmed in our study. We were able to combine them into a formula, ultimately allowing the calculation of an individual probability of treatment success with ocriplasmin in patients with VMT syndrome without FTMH.
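
    The resulting calculator has the usual logistic form. The sketch below uses the logs of the univariate ORs reported above as slopes and an arbitrary intercept, since the fitted multivariable coefficients are not given in the abstract; it shows the shape of the computation, not the published formula.

        import numpy as np

        # Slopes from the abstract's univariate ORs; intercept b0 is an
        # invented placeholder, NOT a value from the paper.
        b0 = 6.0
        b_age = np.log(0.9208)      # per year of age
        b_diam = np.log(0.99812)    # per micron of horizontal VMT diameter

        def p_success(age, diameter):
            """Illustrative individual probability of VMT resolution."""
            z = b0 + b_age * age + b_diam * diameter
            return 1.0 / (1.0 + np.exp(-z))

        print(f"{p_success(65, 400):.2f} vs {p_success(80, 800):.2f}")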

  15. Predicting Rib Fracture Risk With Whole-Body Finite Element Models: Development and Preliminary Evaluation of a Probabilistic Analytical Framework

    PubMed Central

    Forman, Jason L.; Kent, Richard W.; Mroz, Krystoffer; Pipkorn, Bengt; Bostrom, Ola; Segui-Gomez, Maria

    2012-01-01

    This study sought to develop a strain-based probabilistic method to predict rib fracture risk with whole-body finite element (FE) models, and to describe a method to combine the results with collision exposure information to predict injury risk and potential intervention effectiveness in the field. An age-adjusted ultimate strain distribution was used to estimate local rib fracture probabilities within an FE model. These local probabilities were combined to predict injury risk and severity within the whole ribcage. The ultimate strain distribution was developed from a literature dataset of 133 tests. Frontal collision simulations were performed with the THUMS (Total HUman Model for Safety) model at four levels of delta-V and with two restraints: a standard 3-point belt and a progressive 3.5–7 kN force-limited, pretensioned (FL+PT) belt. The results of three simulations (29 km/h standard, 48 km/h standard, and 48 km/h FL+PT) were compared to matched cadaver sled tests. The numbers of fractures predicted for the comparison cases were consistent with those observed experimentally. Combining these results with field exposure information (ΔV, NASS-CDS 1992–2002) suggests an 8.9% probability of incurring AIS3+ rib fractures for a 60-year-old restrained by a standard belt in a tow-away frontal collision with this vehicle and occupant configuration, compared to 4.6% for the FL+PT belt. This is the first study to describe a probabilistic framework to predict rib fracture risk based on strains observed in human-body FE models. Using this analytical framework, future efforts may incorporate additional subject or collision factors for multivariable probabilistic injury prediction. PMID:23169122
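
    The probabilistic combination step can be sketched compactly: per-rib fracture probabilities from an ultimate-strain distribution, combined under an independence assumption (the strain values and distribution parameters below are invented for illustration):

        import numpy as np
        from scipy.stats import lognorm

        rng = np.random.default_rng(11)

        # Hypothetical peak strains from an FE simulation, one per rib, and a
        # lognormal ultimate-strain distribution (illustrative parameters).
        peak_strain = rng.uniform(0.005, 0.025, size=24)
        ult = lognorm(s=0.25, scale=0.02)

        # Local fracture probability per rib, then P(at least one fracture)
        # assuming independence between ribs.
        p = ult.cdf(peak_strain)
        p_any = 1 - np.prod(1 - p)
        print(f"P(>= 1 rib fracture) = {p_any:.3f}")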

  16. Statistical tests for whether a given set of independent, identically distributed draws comes from a specified probability density.

    PubMed

    Tygert, Mark

    2010-09-21

    We discuss several tests for determining whether a given set of independent and identically distributed (i.i.d.) draws does not come from a specified probability density function. The most commonly used are Kolmogorov-Smirnov tests, particularly Kuiper's variant, which focus on discrepancies between the cumulative distribution function for the specified probability density and the empirical cumulative distribution function for the given set of i.i.d. draws. Unfortunately, variations in the probability density function often get smoothed over in the cumulative distribution function, making it difficult to detect discrepancies in regions where the probability density is small in comparison with its values in surrounding regions. We discuss tests without this deficiency, complementing the classical methods. The tests of the present paper are based on the plain fact that it is unlikely to draw a random number whose probability is small, provided that the draw is taken from the same distribution used in calculating the probability (thus, if we draw a random number whose probability is small, then we can be confident that we did not draw the number from the same distribution used in calculating the probability).
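
    The contrast drawn above can be sketched numerically: a classical Kolmogorov-Smirnov test next to a density-based statistic (here, the smallest density among the draws, calibrated by simulation), which is more sensitive to draws landing where the hypothesized density is small.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(12)

        # Draws from a contaminated normal; test H0: standard normal.
        x = np.concatenate([rng.normal(0, 1, 95), rng.normal(0, 4, 5)])

        # Classical route: Kolmogorov-Smirnov against the hypothesized cdf.
        print(stats.kstest(x, "norm"))

        # Density-based route (in the spirit of the paper): smallest density
        # among the draws, with its null distribution simulated under H0.
        t_obs = stats.norm.pdf(x).min()
        t_null = np.array([stats.norm.pdf(rng.normal(0, 1, x.size)).min()
                           for _ in range(2000)])
        print("density-test p-value ~", (t_null <= t_obs).mean())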

  17. Generating Virtual Patients by Multivariate and Discrete Re-Sampling Techniques.

    PubMed

    Teutonico, D; Musuamba, F; Maas, H J; Facius, A; Yang, S; Danhof, M; Della Pasqua, O

    2015-10-01

    Clinical Trial Simulations (CTS) are a valuable tool for decision-making during drug development. However, to obtain realistic simulation scenarios, the patients included in the CTS must be representative of the target population. This is particularly important when covariate effects exist that may affect the outcome of a trial. The objective of our investigation was to evaluate and compare CTS results using re-sampling from a population pool and multivariate distributions to simulate patient covariates. COPD was selected as the paradigm disease for the purposes of our analysis, FEV1 was used as the response measure, and the effects of a hypothetical intervention were evaluated in different populations in order to assess the predictive performance of the two methods. Our results show that the multivariate distribution method produces realistic covariate correlations, comparable to the real population. Moreover, it allows simulation of patient characteristics beyond the limits of the inclusion and exclusion criteria in historical protocols. Both methods, discrete re-sampling and multivariate distributions, generate realistic pools of virtual patients. However, the use of a multivariate distribution enables more flexible simulation scenarios, since it is not necessarily bound to the existing covariate combinations in the available clinical data sets.
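
    The two covariate-generation strategies compared in the study can be sketched side by side (three invented covariates standing in for the COPD population):

        import numpy as np

        rng = np.random.default_rng(14)

        # Observed covariates (age, FEV1, weight) for 100 hypothetical patients.
        obs = np.column_stack([rng.normal(65, 8, 100),
                               rng.normal(1.4, 0.3, 100),
                               rng.normal(75, 12, 100)])

        # Method 1: discrete re-sampling (bootstrap rows of the observed pool).
        virtual_rs = obs[rng.integers(0, len(obs), size=1000)]

        # Method 2: draw from a fitted multivariate normal, which preserves
        # covariate correlations but can extrapolate beyond observed combinations.
        mu, cov = obs.mean(axis=0), np.cov(obs, rowvar=False)
        virtual_mvn = rng.multivariate_normal(mu, cov, size=1000)
        print(virtual_rs.shape, virtual_mvn.shape)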

  18. Multivariate Models for Normal and Binary Responses in Intervention Studies

    ERIC Educational Resources Information Center

    Pituch, Keenan A.; Whittaker, Tiffany A.; Chang, Wanchen

    2016-01-01

    Use of multivariate analysis (e.g., multivariate analysis of variance) is common when normally distributed outcomes are collected in intervention research. However, when mixed responses--a set of normal and binary outcomes--are collected, standard multivariate analyses are no longer suitable. While mixed responses are often obtained in…

  19. Joint modeling and registration of cell populations in cohorts of high-dimensional flow cytometric data.

    PubMed

    Pyne, Saumyadipta; Lee, Sharon X; Wang, Kui; Irish, Jonathan; Tamayo, Pablo; Nazaire, Marc-Danie; Duong, Tarn; Ng, Shu-Kay; Hafler, David; Levy, Ronald; Nolan, Garry P; Mesirov, Jill; McLachlan, Geoffrey J

    2014-01-01

    In biomedical applications, an experimenter encounters different potential sources of variation in data such as individual samples, multiple experimental conditions, and multivariate responses of a panel of markers such as from a signaling network. In multiparametric cytometry, which is often used for analyzing patient samples, such issues are critical. While computational methods can identify cell populations in individual samples, without the ability to automatically match them across samples, it is difficult to compare and characterize the populations in typical experiments, such as those responding to various stimulations or distinctive of particular patients or time-points, especially when there are many samples. Joint Clustering and Matching (JCM) is a multi-level framework for simultaneous modeling and registration of populations across a cohort. JCM models every population with a robust multivariate probability distribution. Simultaneously, JCM fits a random-effects model to construct an overall batch template--used for registering populations across samples, and classifying new samples. By tackling systems-level variation, JCM supports practical biomedical applications involving large cohorts. Software for fitting the JCM models have been implemented in an R package EMMIX-JCM, available from http://www.maths.uq.edu.au/~gjm/mix_soft/EMMIX-JCM/.

  20. Macroscopically constrained Wang-Landau method for systems with multiple order parameters and its application to drawing complex phase diagrams

    NASA Astrophysics Data System (ADS)

    Chan, C. H.; Brown, G.; Rikvold, P. A.

    2017-05-01

    A generalized approach to Wang-Landau simulations, macroscopically constrained Wang-Landau, is proposed to simulate the density of states of a system with multiple macroscopic order parameters. The method breaks a multidimensional random-walk process in phase space into many separate, one-dimensional random-walk processes in well-defined subspaces. Each of these random walks is constrained to a different set of values of the macroscopic order parameters. When the multivariable density of states is obtained for one set of values of fieldlike model parameters, the density of states for any other values of these parameters can be obtained by a simple transformation of the total system energy. All thermodynamic quantities of the system can then be rapidly calculated at any point in the phase diagram. We demonstrate how to use the multivariable density of states to draw the phase diagram, as well as order-parameter probability distributions at specific phase points, for a model spin-crossover material: an antiferromagnetic Ising model with ferromagnetic long-range interactions. The fieldlike parameters in this model are an effective magnetic field and the strength of the long-range interaction.

  1. Pyelocaliceal Distribution of Kidney Stones Used as an Outcome Predictor in Percutaneous Nephrolithotomy After Being Evaluated with Preoperative and Postoperative CT Scan.

    PubMed

    Tirapegui, Federico Ignacio; González, Mariano Sebastian; González, Ignacio Pablo Tobía; Daels, Francisco P

    2015-06-01

    To identify kidney stone characteristics that determine either success or failure of percutaneous nephrolithotomy (PCNL) and to design a classification system to predict results according to these characteristics. One hundred thirty-eight patients were assessed with multislice abdominal and pelvic CT before and after PCNL. With regard to pyelocaliceal stone distribution, we classified our patients into two groups, called "no extra stone in middle calix" (NESMC) and "extra stone in middle calix" (ESMC), according to the difficulty in reaching the stones. We performed univariate and multivariate analyses, as well as a receiver operating characteristic (ROC) analysis of the proposed classification based on the predicted probabilities, to determine the diagnostic yield. Global residual lithiasis (RL) was 26.08%. The proportion of patients with RL according to classification was NESMC 11.5% and ESMC 59.5%. In the univariate logistic regression analysis of the distribution, number, total volumetry, side, type, and radio-opacity of stones, and the presence or absence of preoperative urinary tract infection, the variables related to RL were the distribution (OR 11.3; 95% confidence interval [95% CI] 4.7, 27.4), volumetry (OR 1.01; 95% CI 1.004, 1.014), and the presence of staghorn stones (OR 6.64; 95% CI 2.463, 17.905). In the multivariate analysis, distribution was statistically significant (OR 8.687; 95% CI 2.69, 28.06), whereas total volumetry and the presence of staghorn stones were not (OR 1; 95% CI 1.000, 1.000 and OR 2.7; 95% CI 0.35, 20.57, respectively). The ROC analysis showed an area under the curve of 0.77. In our experience, the distribution of kidney stones is the most important predictor of RL after PCNL. The results also suggest that the presence of stones in the middle calix has a direct impact on the stone-free rate. We put forward a simple and reproducible classification, easy to apply and useful for estimating the chances of success of the procedure using preoperative CT scans.

  2. Asymptotic Distribution of the Likelihood Ratio Test Statistic for Sphericity of Complex Multivariate Normal Distribution.

    DTIC Science & Technology

    1981-08-01

    RATIO TEST STATISTIC FOR SPHERICITY OF COMPLEX MULTIVARIATE NORMAL DISTRIBUTION* C. Fang, P. R. Krishnaiah, B. N. Nagarsenker** August 1981 Technical...and their applications in time series, the reader is referred to Krishnaiah (1976). Motivated by the applications in the area of inference on multiple...for practical purposes. Here, we note that Krishnaiah, Lee and Chang (1976) approximated the null distribution of certain power of the likeli

  3. A framework for multivariate data-based at-site flood frequency analysis: Essentiality of the conjugal application of parametric and nonparametric approaches

    NASA Astrophysics Data System (ADS)

    Vittal, H.; Singh, Jitendra; Kumar, Pankaj; Karmakar, Subhankar

    2015-06-01

    In watershed management, flood frequency analysis (FFA) is performed to quantify the risk of flooding at different spatial locations and also to provide guidelines for determining the design periods of flood control structures. Traditional FFA was extensively performed under a univariate scenario for both at-site and regional estimation of return periods. However, due to the inherent mutual dependence of the flood variables or characteristics [i.e., peak flow (P), flood volume (V) and flood duration (D), which are random in nature], the analysis has been further extended to the multivariate scenario, with some restrictive assumptions. To overcome the assumption of the same family of marginal density functions for all flood variables, the concept of the copula has been introduced. Although the advancement from univariate to multivariate analyses drew formidable attention from the FFA research community, the basic limitation was that the analyses were performed using only parametric families of distributions. The aim of the current study is to emphasize the importance of nonparametric approaches in the field of multivariate FFA; however, a nonparametric distribution may not always be a good fit or capable of replacing well-implemented multivariate parametric and multivariate copula-based applications. Nevertheless, the potential for obtaining the best fit using nonparametric distributions might be improved because such distributions reproduce the sample's characteristics, resulting in more accurate estimations of the multivariate return period. Hence, the current study shows the importance of combining the multivariate nonparametric approach with multivariate parametric and copula-based approaches, resulting in a comprehensive framework for complete at-site FFA. Although the proposed framework is designed for at-site FFA, the approach can also be applied to regional FFA because regional estimations ideally include at-site estimations. The framework is based on the following steps: (i) comprehensive trend analysis to assess nonstationarity in the observed data; (ii) selection of the best-fit univariate marginal distribution, from a comprehensive set of parametric and nonparametric distributions, for the flood variables; (iii) multivariate frequency analyses with parametric, copula-based and nonparametric approaches; and (iv) estimation of joint and various conditional return periods. The proposed framework for frequency analysis is demonstrated using 110 years of observed data from the Allegheny River at Salamanca, New York, USA. The results show that for both the univariate and multivariate cases, the nonparametric Gaussian kernel provides the best estimate. Further, we perform FFA for twenty major rivers over the continental USA, which shows that for seven rivers all the flood variables followed the nonparametric Gaussian kernel, whereas for the other rivers parametric distributions provided the best fit for either one or two flood variables. Thus, the summary of results shows that the nonparametric method cannot substitute for the parametric and copula-based approaches, but should be considered during any at-site FFA to provide the broadest choice for the best estimation of flood return periods.
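
    Step (ii) of the framework, comparing a parametric candidate with the nonparametric Gaussian kernel for a flood variable, can be sketched as follows (synthetic peaks in place of the Allegheny record; the KS distance serves as a simple goodness-of-fit score):

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(13)

        # Synthetic annual peak flows (stand-in for a 110-year record).
        peaks = rng.gumbel(loc=1000, scale=300, size=110)

        # Parametric candidate (Gumbel, a common flood-frequency choice)...
        loc, scale = stats.gumbel_r.fit(peaks)
        ks_par = stats.kstest(peaks, "gumbel_r", args=(loc, scale)).statistic

        # ...versus the nonparametric Gaussian-kernel estimate of the cdf.
        kde = stats.gaussian_kde(peaks)
        grid = np.sort(peaks)
        cdf_kde = np.array([kde.integrate_box_1d(-np.inf, g) for g in grid])
        ecdf = np.arange(1, peaks.size + 1) / peaks.size
        ks_kde = np.abs(cdf_kde - ecdf).max()
        print(f"KS (Gumbel) = {ks_par:.3f}, KS (Gaussian kernel) = {ks_kde:.3f}")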

  4. Stereotactic probability and variability of speech arrest and anomia sites during stimulation mapping of the language dominant hemisphere.

    PubMed

    Chang, Edward F; Breshears, Jonathan D; Raygor, Kunal P; Lau, Darryl; Molinaro, Annette M; Berger, Mitchel S

    2017-01-01

    OBJECTIVE Functional mapping using direct cortical stimulation is the gold standard for the prevention of postoperative morbidity during resective surgery in dominant-hemisphere perisylvian regions. Its role is necessitated by the significant interindividual variability that has been observed for essential language sites. The aim in this study was to determine the statistical probability distribution of eliciting aphasic errors for any given stereotactically based cortical position in a patient cohort and to quantify the variability at each cortical site. METHODS Patients undergoing awake craniotomy for dominant-hemisphere primary brain tumor resection between 1999 and 2014 at the authors' institution were included in this study, which included counting and picture-naming tasks during dense speech mapping via cortical stimulation. Positive and negative stimulation sites were collected using an intraoperative frameless stereotactic neuronavigation system and were converted to Montreal Neurological Institute coordinates. Data were iteratively resampled to create mean and standard deviation probability maps for speech arrest and anomia. Patients were divided into groups with a "classic" or an "atypical" location of speech function, based on the resultant probability maps. Patient and clinical factors were then assessed for their association with an atypical location of speech sites by univariate and multivariate analysis. RESULTS Across 102 patients undergoing speech mapping, the overall probabilities of speech arrest and anomia were 0.51 and 0.33, respectively. Speech arrest was most likely to occur with stimulation of the posterior inferior frontal gyrus (maximum probability from individual bin = 0.025), and variance was highest in the dorsal premotor cortex and the posterior superior temporal gyrus. In contrast, stimulation within the posterior perisylvian cortex resulted in the maximum mean probability of anomia (maximum probability = 0.012), with large variance in the regions surrounding the posterior superior temporal gyrus, including the posterior middle temporal, angular, and supramarginal gyri. Patients with atypical speech localization were far more likely to have tumors in canonical Broca's or Wernicke's areas (OR 7.21, 95% CI 1.67-31.09, p < 0.01) or to have multilobar tumors (OR 12.58, 95% CI 2.22-71.42, p < 0.01) than were patients with classic speech localization. CONCLUSIONS This study provides statistical probability distribution maps for aphasic errors during cortical stimulation mapping in a patient cohort. Thus, the authors provide an expected probability of inducing speech arrest and anomia from specific 10-mm² cortical bins in an individual patient. In addition, they highlight key regions of interindividual mapping variability that should be considered preoperatively. They believe these results will aid surgeons in their preoperative planning of eloquent cortex resection.

  5. Numerical analysis of the accuracy of bivariate quantile distributions utilizing copulas compared to the GUM supplement 2 for oil pressure balance uncertainties

    NASA Astrophysics Data System (ADS)

    Ramnath, Vishal

    2017-11-01

    In the field of pressure metrology the effective area is Ae = A0 (1 + λP), where A0 is the zero-pressure area and λ is the distortion coefficient, and the conventional practice is to construct univariate probability density functions (PDFs) for A0 and λ. As a result, analytical generalized non-Gaussian bivariate joint PDFs have not featured prominently in pressure metrology. Recently, quantile functions based on the extended lambda distribution have been successfully utilized for summarizing univariate arbitrary PDFs of gas pressure balances. Motivated by this development, we investigate the feasibility and utility of extending and applying quantile functions to systems which naturally exhibit bivariate PDFs. Our approach is to utilize the GUM Supplement 1 methodology to generate Monte Carlo based multivariate uncertainty data for an oil based pressure balance laboratory standard that is used to generate known high pressures, which are in turn cross-floated against a pressure balance transfer standard in order to deduce the transfer standard's effective area. We then numerically analyse the uncertainty data by formulating and constructing an approximate bivariate quantile distribution that directly couples A0 and λ, in order to compare and contrast its accuracy with an exact GUM Supplement 2 based uncertainty quantification analysis.
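
    A minimal GUM-Supplement-1-style sketch of the Monte Carlo step is given below; the estimates, uncertainties, and correlation for A0 and λ are illustrative placeholders, not the paper's laboratory values.

    ```python
    # Monte Carlo propagation sketch for the effective area model
    # Ae = A0 * (1 + lambda * P); all numbers below are illustrative only.
    import numpy as np

    rng = np.random.default_rng(0)
    M = 200_000  # number of Monte Carlo trials

    # Assumed (hypothetical) estimates and standard uncertainties, with a
    # negative correlation between A0 and lambda, as can arise in cross-floats.
    A0_hat, u_A0 = 9.8e-5, 2.0e-9        # zero-pressure area / m^2
    lam_hat, u_lam = 4.5e-6, 5.0e-7      # distortion coefficient / MPa^-1
    rho = -0.8                           # correlation(A0, lambda)

    cov = [[u_A0**2,            rho * u_A0 * u_lam],
           [rho * u_A0 * u_lam, u_lam**2          ]]
    A0, lam = rng.multivariate_normal([A0_hat, lam_hat], cov, size=M).T

    P = 100.0  # working pressure / MPa
    Ae = A0 * (1 + lam * P)

    print(f"Ae = {Ae.mean():.6e} m^2, u(Ae) = {Ae.std(ddof=1):.2e} m^2")
    # The joint (A0, lam) draws could likewise feed a bivariate quantile fit.
    ```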

  6. Analysis of vector wind change with respect to time for Cape Kennedy, Florida

    NASA Technical Reports Server (NTRS)

    Adelfang, S. I.

    1978-01-01

    Multivariate analysis was used to determine the joint distribution of the four variables represented by the components of the wind vector at an initial time and after a specified elapsed time, which is hypothesized to be quadrivariate normal; the fourteen statistics of this distribution, calculated from 15 years of twice-daily rawinsonde data, are presented by monthly reference periods for altitudes from 0 to 27 km. The hypotheses that the wind component change with respect to time is univariate normal, that the joint distribution of wind component changes is bivariate normal, and that the modulus of vector wind change is Rayleigh distributed are tested by comparison with observed distributions. Statistics of the conditional bivariate normal distributions of the vector wind at a future time, given the vector wind at an initial time, are derived. Wind changes over time periods from 1 to 5 hours, calculated from Jimsphere data, are also presented. Extension of the theoretical prediction (based on rawinsonde data) of the standard deviation of wind component change to time periods of 1 to 5 hours falls (with a few exceptions) within the 95th-percentile confidence band of the population estimate obtained from the Jimsphere sample data. The joint distributions of wind change components, conditional wind components, and 1 km vector wind shear change components are illustrated by probability ellipses at the 95th-percentile level.
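
    The conditional bivariate normal distribution mentioned above follows from standard multivariate normal conditioning; the sketch below works it out numerically for hypothetical means and covariances (in m/s).

    ```python
    # Sketch: conditional distribution of the future wind vector (u2, v2) given
    # the initial wind (u1, v1) under a quadrivariate normal model.
    import numpy as np

    # Hypothetical 4-variable mean and covariance: order (u1, v1, u2, v2), m/s.
    mu = np.array([10.0, 2.0, 10.0, 2.0])
    Sigma = np.array([[64.0,  8.0, 48.0,  6.0],
                      [ 8.0, 36.0,  6.0, 27.0],
                      [48.0,  6.0, 64.0,  8.0],
                      [ 6.0, 27.0,  8.0, 36.0]])

    S11, S12 = Sigma[:2, :2], Sigma[:2, 2:]
    S21, S22 = Sigma[2:, :2], Sigma[2:, 2:]

    x1 = np.array([18.0, -4.0])  # observed initial wind (u1, v1)

    # Standard MVN conditioning: mu_2|1 = mu2 + S21 S11^-1 (x1 - mu1),
    #                            Sigma_2|1 = S22 - S21 S11^-1 S12
    mu_cond = mu[2:] + S21 @ np.linalg.solve(S11, x1 - mu[:2])
    Sigma_cond = S22 - S21 @ np.linalg.solve(S11, S12)

    print("conditional mean:", mu_cond)
    print("conditional covariance:\n", Sigma_cond)
    # 95% probability ellipses follow from the eigendecomposition of Sigma_cond.
    ```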

  7. Investigation of Dielectric Breakdown Characteristics for Double-break Vacuum Interrupter and Dielectric Breakdown Probability Distribution in Vacuum Interrupter

    NASA Astrophysics Data System (ADS)

    Shioiri, Tetsu; Asari, Naoki; Sato, Junichi; Sasage, Kosuke; Yokokura, Kunio; Homma, Mitsutaka; Suzuki, Katsumi

    To investigate the reliability of vacuum insulation equipment, a study was carried out to clarify breakdown probability distributions in a vacuum gap. Further, a double-break vacuum circuit breaker was investigated for its breakdown probability distribution. The test results show that the breakdown probability distribution of the vacuum gap can be represented by a Weibull distribution with a location parameter, which gives the voltage that permits a zero breakdown probability. The location parameter obtained from the Weibull plot depends on the electrode area. The shape parameter obtained from the Weibull plot of the vacuum gap was 10-14, and is constant irrespective of the non-uniform field factor. The breakdown probability distribution after no-load switching can also be represented by a Weibull distribution with a location parameter. The shape parameter after no-load switching was 6-8.5, and is constant irrespective of gap length; this indicates that the scatter of the breakdown voltage is increased by no-load switching. If the vacuum circuit breaker uses a double break, the breakdown probability at low voltage becomes lower than the single-break probability. Although the potential distribution is a concern in the double-break vacuum circuit breaker, its insulation reliability is better than that of the single-break vacuum interrupter even if the bias of the vacuum interrupters' voltage sharing is taken into account.
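
    A brief sketch of the three-parameter Weibull model with a location parameter is shown below; the shape, location, and scale values are illustrative, not the measured ones.

    ```python
    # Sketch: breakdown probability from a three-parameter Weibull model,
    # F(V) = 1 - exp(-((V - V0)/eta)^beta) for V >= V0, where the location
    # parameter V0 is the voltage with zero breakdown probability.
    # All parameter values below are illustrative, not from the paper.
    from scipy.stats import weibull_min

    beta, V0, eta = 12.0, 40.0, 25.0   # shape, location (kV), scale (kV)
    dist = weibull_min(c=beta, loc=V0, scale=eta)

    for V in (45.0, 55.0, 65.0):
        print(f"P(breakdown at {V:.0f} kV) = {dist.cdf(V):.4f}")

    # Fitting to measured breakdown voltages would use, e.g.:
    # beta_hat, V0_hat, eta_hat = weibull_min.fit(data)  # data: array of kV values
    ```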

  8. Head and facial anthropometry of mixed-race US Army male soldiers for military design and sizing: a pilot study.

    PubMed

    Yokota, Miyo

    2005-05-01

    In the United States, the biologically admixed population is increasing. Such demographic changes may affect the distribution of anthropometric characteristics, which are incorporated into the design of equipment and clothing for the US Army and other large organizations. The purpose of this study was to examine multivariate craniofacial anthropometric distributions between biologically admixed male populations and single racial groups of Black and White males. Multivariate statistical results suggested that nose breadth and lip length were different between Blacks and Whites. Such differences may be considered for adjustments to respirators and chemical-biological protective masks. However, based on this pilot study, multivariate anthropometric distributions of admixed individuals were within the distributions of single racial groups. Based on the sample reported, sizing and designing for the admixed groups are not necessary if anthropometric distributions of single racial groups comprising admixed groups are known.

  9. Using a Betabinomial distribution to estimate the prevalence of adherence to physical activity guidelines among children and youth.

    PubMed

    Garriguet, Didier

    2016-04-01

    Estimates of the prevalence of adherence to physical activity guidelines in the population are generally obtained by averaging each individual's probability of adherence, based on the number of days people meet the guidelines and the number of days they are assessed. Given the number of active and inactive days (days assessed minus days active), the conditional probability of meeting the guidelines used in the past is a Beta(1 + active days, 1 + inactive days) distribution, assuming the probability p of a day being active is bounded by 0 and 1 and averages 50%. A change in the assumption about the distribution of p is required to better match the discrete nature of the data and to better assess the probability of adherence when the percentage of active days in the population differs from 50%. Using accelerometry data from the Canadian Health Measures Survey, the probability of adherence to physical activity guidelines is estimated using the conditional probability, given the number of active and inactive days, distributed as Betabinomial(n, α + active days, β + inactive days), assuming that p is randomly distributed as Beta(α, β), where the parameters α and β are estimated by maximum likelihood. The resulting Betabinomial distribution is discrete. For children aged 6 or older, the probability of meeting physical activity guidelines 7 out of 7 days is similar to published estimates. For pre-schoolers, the Betabinomial distribution yields higher estimates of adherence to the guidelines than the Beta distribution, in line with the probability of being active on any given day. In estimating the probability of adherence to physical activity guidelines, the Betabinomial distribution has several advantages over the previously used Beta distribution: it is a discrete distribution and it maximizes the richness of accelerometer data.
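
    The Betabinomial calculation can be sketched with scipy's betabinom; the α and β values and the 5-of-7 example below are placeholders rather than the survey's maximum-likelihood estimates.

    ```python
    # Sketch of the adherence calculation with a Betabinomial model; scipy's
    # betabinom implements the required discrete distribution.
    from scipy.stats import betabinom

    alpha, beta = 2.0, 1.5     # assumed Beta(alpha, beta) prior on p
    n_assessed, n_active = 7, 5

    # Posterior-updated Betabinomial for a new 7-day window, given the child
    # was active on 5 of 7 assessed days:
    post = betabinom(n=7, a=alpha + n_active, b=beta + (n_assessed - n_active))

    print("P(active 7 of 7 days) =", round(post.pmf(7), 4))
    print("P(active >= 5 days)   =", round(1 - post.cdf(4), 4))
    ```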

  10. The choice of prior distribution for a covariance matrix in multivariate meta-analysis: a simulation study.

    PubMed

    Hurtado Rúa, Sandra M; Mazumdar, Madhu; Strawderman, Robert L

    2015-12-30

    Bayesian meta-analysis is an increasingly important component of clinical research, with multivariate meta-analysis a promising tool for studies with multiple endpoints. Model assumptions, including the choice of priors, are crucial aspects of multivariate Bayesian meta-analysis (MBMA) models. In a given model, two different prior distributions can lead to different inferences about a particular parameter. A simulation study was performed in which the impact of families of prior distributions for the covariance matrix of a multivariate normal random effects MBMA model was analyzed. Inferences about effect sizes were not particularly sensitive to prior choice, but the related covariance estimates were. A few families of prior distributions with small relative biases, tight mean squared errors, and close to nominal coverage for the effect size estimates were identified. Our results demonstrate the need for sensitivity analysis and suggest some guidelines for choosing prior distributions in this class of problems. The MBMA models proposed here are illustrated in a small meta-analysis example from the periodontal field and a medium meta-analysis from the study of stroke. Copyright © 2015 John Wiley & Sons, Ltd.
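
    As a rough sketch of this kind of sensitivity check, the snippet below draws bivariate covariance matrices from inverse-Wishart priors with different degrees of freedom (one common prior family for covariance matrices, used here purely illustratively) and compares the implied spread of the between-endpoint correlation.

    ```python
    # Prior-sensitivity sketch for a random-effects covariance matrix:
    # different inverse-Wishart degrees of freedom imply very different
    # prior spreads for the between-endpoint correlation.
    import numpy as np
    from scipy.stats import invwishart

    scale = np.eye(2)  # bivariate random effects (two endpoints)

    for df in (3, 5, 20):  # vaguer (3) to more informative (20)
        draws = invwishart.rvs(df=df, scale=scale, size=5000, random_state=7)
        corr = draws[:, 0, 1] / np.sqrt(draws[:, 0, 0] * draws[:, 1, 1])
        print(f"df={df:2d}: implied correlation 2.5-97.5% = "
              f"({np.percentile(corr, 2.5):+.2f}, {np.percentile(corr, 97.5):+.2f})")
    ```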

  11. The Structure of the Young Star Cluster NGC 6231. II. Structure, Formation, and Fate

    NASA Astrophysics Data System (ADS)

    Kuhn, Michael A.; Getman, Konstantin V.; Feigelson, Eric D.; Sills, Alison; Gromadzki, Mariusz; Medina, Nicolás; Borissova, Jordanka; Kurtev, Radostin

    2017-12-01

    The young cluster NGC 6231 (stellar ages ˜2-7 Myr) is observed shortly after star formation activity has ceased. Using the catalog of 2148 probable cluster members obtained from Chandra, VVV, and optical surveys (Paper I), we examine the cluster’s spatial structure and dynamical state. The spatial distribution of stars is remarkably well fit by an isothermal sphere with moderate elongation, while other commonly used models like Plummer spheres, multivariate normal distributions, or power-law models are poor fits. The cluster has a core radius of 1.2 ± 0.1 pc and a central density of ˜200 stars pc⁻³. The distribution of stars is mildly mass segregated. However, there is no radial stratification of the stars by age. Although most of the stars belong to a single cluster, a small subcluster of stars is found superimposed on the main cluster, and there are clumpy non-isotropic distributions of stars outside ˜4 core radii. When the size, mass, and age of NGC 6231 are compared to other young star clusters and subclusters in nearby active star-forming regions, it lies at the high-mass end of the distribution but along the same trend line. This could result from similar formation processes, possibly hierarchical cluster assembly. We argue that NGC 6231 has expanded from its initial size but that it remains gravitationally bound.

  12. TENSOR DECOMPOSITIONS AND SPARSE LOG-LINEAR MODELS

    PubMed Central

    Johndrow, James E.; Bhattacharya, Anirban; Dunson, David B.

    2017-01-01

    Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. We derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions. PMID:29332971
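
    A small constructive sketch of the latent-structure view described above: a k-class latent model induces a nonnegative rank-k (PARAFAC-style) factorization of the joint pmf tensor. The dimensions, weights, and class profiles below are arbitrary.

    ```python
    # Constructive sketch: a latent class model with k classes yields a
    # nonnegative rank-k factorization of the joint pmf tensor for three
    # categorical variables.
    import numpy as np

    rng = np.random.default_rng(8)
    k, levels = 2, (3, 4, 2)                 # 2 latent classes; 3 variables

    w = np.array([0.6, 0.4])                 # latent class weights
    theta = [rng.dirichlet(np.ones(d), size=k) for d in levels]  # per-class pmfs

    # Joint pmf tensor: P(x1,x2,x3) = sum_h w_h * prod_j theta_j[h, x_j]
    P = np.einsum("h,ha,hb,hc->abc", w, *theta)
    print("sums to one:", np.isclose(P.sum(), 1.0))
    print("nonnegative rank <=", k, "; tensor shape:", P.shape)
    ```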

  13. Global Pyrogeography: the Current and Future Distribution of Wildfire

    PubMed Central

    Krawchuk, Meg A.; Moritz, Max A.; Parisien, Marc-André; Van Dorn, Jeff; Hayhoe, Katharine

    2009-01-01

    Climate change is expected to alter the geographic distribution of wildfire, a complex abiotic process that responds to a variety of spatial and environmental gradients. How future climate change may alter global wildfire activity, however, is still largely unknown. As a first step to quantifying potential change in global wildfire, we present a multivariate quantification of environmental drivers for the observed, current distribution of vegetation fires using statistical models of the relationship between fire activity and resources to burn, climate conditions, human influence, and lightning flash rates at a coarse spatiotemporal resolution (100 km, over one decade). We then demonstrate how these statistical models can be used to project future changes in global fire patterns, highlighting regional hotspots of change in fire probabilities under future climate conditions as simulated by a global climate model. Based on current conditions, our results illustrate how the availability of resources to burn and climate conditions conducive to combustion jointly determine why some parts of the world are fire-prone and others are fire-free. In contrast to any expectation that global warming should necessarily result in more fire, we find that regional increases in fire probabilities may be counter-balanced by decreases at other locations, due to the interplay of temperature and precipitation variables. Despite this net balance, our models predict substantial invasion and retreat of fire across large portions of the globe. These changes could have important effects on terrestrial ecosystems since alteration in fire activity may occur quite rapidly, generating ever more complex environmental challenges for species dispersing and adjusting to new climate conditions. Our findings highlight the potential for widespread impacts of climate change on wildfire, suggesting severely altered fire regimes and the need for more explicit inclusion of fire in research on global vegetation-climate change dynamics and conservation planning. PMID:19352494

  14. Species distribution modelling for plant communities: Stacked single species or multivariate modelling approaches?

    Treesearch

    Emilie B. Henderson; Janet L. Ohmann; Matthew J. Gregory; Heather M. Roberts; Harold S.J. Zald

    2014-01-01

    Landscape management and conservation planning require maps of vegetation composition and structure over large regions. Species distribution models (SDMs) are often used for individual species, but projects mapping multiple species are rarer. We compare maps of plant community composition assembled by stacking results from many SDMs with multivariate maps constructed...

  15. Distributions of Characteristic Roots in Multivariate Analysis

    DTIC Science & Technology

    1976-07-01

    …studied by various authors, have been briefly discussed. Such distributional properties of four test criteria, and of a few less important ones which are functions of the characteristic roots, have further been discussed in view of the power comparisons made in connection with tests of three multivariate hypotheses. In addition, the one-sample case has also been considered in terms of distributional aspects of the characteristic roots and criteria for tests of two hypotheses on the…

  16. Simulating Multivariate Nonnormal Data Using an Iterative Algorithm

    ERIC Educational Resources Information Center

    Ruscio, John; Kaczetow, Walter

    2008-01-01

    Simulating multivariate nonnormal data with specified correlation matrices is difficult. One especially popular method is Vale and Maurelli's (1983) extension of Fleishman's (1978) polynomial transformation technique to multivariate applications. This requires the specification of distributional moments and the calculation of an intermediate…
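
    The abstract is truncated, but the Fleishman polynomial step it refers to can be sketched as follows, with illustrative target moments; the Vale-Maurelli adjustment of the intermediate normal correlation is only noted in a comment.

    ```python
    # Sketch of the Fleishman (1978) polynomial step used by Vale and Maurelli:
    # find a, b, c, d so that X = a + bZ + cZ^2 + dZ^3 (Z standard normal) has
    # zero mean, unit variance and target skewness/excess kurtosis, then apply
    # the transform to each column of correlated normals.
    import numpy as np
    from scipy.optimize import fsolve

    def fleishman_eqs(coef, skew, exkurt):
        b, c, d = coef
        var = b**2 + 6*b*d + 2*c**2 + 15*d**2 - 1
        sk = 2*c*(b**2 + 24*b*d + 105*d**2 + 2) - skew
        ku = 24*(b*d + c**2*(1 + b**2 + 28*b*d)
                 + d**2*(12 + 48*b*d + 141*c**2 + 225*d**2)) - exkurt
        return var, sk, ku

    skew, exkurt = 1.0, 1.5                       # target moments (illustrative)
    b, c, d = fsolve(fleishman_eqs, x0=(1.0, 0.1, 0.0), args=(skew, exkurt))
    a = -c                                        # makes the mean zero

    rng = np.random.default_rng(0)
    Z = rng.multivariate_normal([0, 0], [[1, .6], [.6, 1]], size=100_000)
    X = a + b*Z + c*Z**2 + d*Z**3                 # columnwise nonnormal data

    print("third central moments (~skewness, var~1):",
          ((X - X.mean(0))**3).mean(0).round(2))
    # Note: Vale and Maurelli additionally adjust the *intermediate* normal
    # correlation so that the post-transform correlation hits its target.
    ```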

  17. Probability Distributome: A Web Computational Infrastructure for Exploring the Properties, Interrelations, and Applications of Probability Distributions.

    PubMed

    Dinov, Ivo D; Siegrist, Kyle; Pearl, Dennis K; Kalinin, Alexandr; Christou, Nicolas

    2016-06-01

    Probability distributions are useful for modeling, simulation, analysis, and inference on varieties of natural processes and physical phenomena. There are uncountably many probability distributions. However, a few dozen families of distributions are commonly defined and are frequently used in practice for problem solving, experimental applications, and theoretical studies. In this paper, we present a new computational and graphical infrastructure, the Distributome , which facilitates the discovery, exploration and application of diverse spectra of probability distributions. The extensible Distributome infrastructure provides interfaces for (human and machine) traversal, search, and navigation of all common probability distributions. It also enables distribution modeling, applications, investigation of inter-distribution relations, as well as their analytical representations and computational utilization. The entire Distributome framework is designed and implemented as an open-source, community-built, and Internet-accessible infrastructure. It is portable, extensible and compatible with HTML5 and Web2.0 standards (http://Distributome.org). We demonstrate two types of applications of the probability Distributome resources: computational research and science education. The Distributome tools may be employed to address five complementary computational modeling applications (simulation, data-analysis and inference, model-fitting, examination of the analytical, mathematical and computational properties of specific probability distributions, and exploration of the inter-distributional relations). Many high school and college science, technology, engineering and mathematics (STEM) courses may be enriched by the use of modern pedagogical approaches and technology-enhanced methods. The Distributome resources provide enhancements for blended STEM education by improving student motivation, augmenting the classical curriculum with interactive webapps, and overhauling the learning assessment protocols.

  18. Robust multivariate nonparametric tests for detection of two-sample location shift in clinical trials

    PubMed Central

    Jiang, Xuejun; Guo, Xu; Zhang, Ning; Wang, Bo

    2018-01-01

    This article presents and investigates the performance of a series of robust multivariate nonparametric tests for the detection of location shift between two multivariate samples in randomized controlled trials. The tests are built upon robust estimators of distribution locations (medians, Hodges-Lehmann estimators, and an extended U statistic) with both unscaled and scaled versions. The nonparametric tests are robust to outliers and do not assume that the two samples are drawn from multivariate normal distributions. Bootstrap and permutation approaches are introduced for determining the p-values of the proposed test statistics. Simulation studies are conducted and numerical results are reported to examine the performance of the proposed statistical tests. The numerical results demonstrate that the robust multivariate nonparametric tests constructed from the Hodges-Lehmann estimators are more efficient than those based on medians and the extended U statistic. The permutation approach can provide a more stringent control of Type I error and is generally more powerful than the bootstrap procedure. The proposed robust nonparametric tests are applied to detect multivariate distributional difference between the intervention and control groups in the Thai Healthy Choices study and examine the intervention effect of a four-session motivational interviewing-based intervention developed in the study to reduce risk behaviors among youth living with HIV. PMID:29672555
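
    A minimal sketch of one such test, assuming componentwise Hodges-Lehmann estimators and a permutation null (a simplification of the paper's battery of tests, on synthetic data):

    ```python
    # Permutation test for multivariate location shift built on componentwise
    # Hodges-Lehmann estimators (median of pairwise Walsh averages).
    import numpy as np

    def hodges_lehmann(x):
        """Componentwise median of all pairwise averages of the rows of x."""
        i, j = np.triu_indices(len(x))
        return np.median((x[i] + x[j]) / 2.0, axis=0)

    def hl_shift_stat(x, y):
        return np.linalg.norm(hodges_lehmann(x) - hodges_lehmann(y))

    rng = np.random.default_rng(3)
    x = rng.standard_t(df=3, size=(40, 2))          # heavy-tailed control arm
    y = rng.standard_t(df=3, size=(40, 2)) + 0.7    # shifted intervention arm

    obs = hl_shift_stat(x, y)
    pooled = np.vstack([x, y])
    perm = []
    for _ in range(2000):                            # permutation null
        idx = rng.permutation(len(pooled))
        perm.append(hl_shift_stat(pooled[idx[:40]], pooled[idx[40:]]))
    p_value = (1 + np.sum(np.array(perm) >= obs)) / (1 + len(perm))
    print(f"permutation p-value: {p_value:.4f}")
    ```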

  19. Random Partition Distribution Indexed by Pairwise Information

    PubMed Central

    Dahl, David B.; Day, Ryan; Tsai, Jerry W.

    2017-01-01

    We propose a random partition distribution indexed by pairwise similarity information such that partitions compatible with the similarities are given more probability. The use of pairwise similarities, in the form of distances, is common in some clustering algorithms (e.g., hierarchical clustering), but we show how to use this type of information to define a prior partition distribution for flexible Bayesian modeling. A defining feature of the distribution is that it allocates probability among partitions within a given number of subsets, but it does not shift probability among sets of partitions with different numbers of subsets. Our distribution places more probability on partitions that group similar items yet keeps the total probability of partitions with a given number of subsets constant. The distribution of the number of subsets (and its moments) is available in closed-form and is not a function of the similarities. Our formulation has an explicit probability mass function (with a tractable normalizing constant) so the full suite of MCMC methods may be used for posterior inference. We compare our distribution with several existing partition distributions, showing that our formulation has attractive properties. We provide three demonstrations to highlight the features and relative performance of our distribution. PMID:29276318

  1. Post-processing of multi-model ensemble river discharge forecasts using censored EMOS

    NASA Astrophysics Data System (ADS)

    Hemri, Stephan; Lisniak, Dmytro; Klein, Bastian

    2014-05-01

    When forecasting water levels and river discharge, ensemble weather forecasts are used as meteorological input to hydrologic process models. As hydrologic models are imperfect and the input ensembles tend to be biased and underdispersed, the output ensemble forecasts for river runoff typically are biased and underdispersed, too. Thus, statistical post-processing is required in order to achieve calibrated and sharp predictions. Standard post-processing methods such as Ensemble Model Output Statistics (EMOS) that have their origins in meteorological forecasting are now increasingly being used in hydrologic applications. Here we consider two sub-catchments of River Rhine, for which the forecasting system of the Federal Institute of Hydrology (BfG) uses runoff data that are censored below predefined thresholds. To address this methodological challenge, we develop a censored EMOS method that is tailored to such data. The censored EMOS forecast distribution can be understood as a mixture of a point mass at the censoring threshold and a continuous part based on a truncated normal distribution. Parameter estimates of the censored EMOS model are obtained by minimizing the Continuous Ranked Probability Score (CRPS) over the training dataset. Model fitting on Box-Cox transformed data allows us to take account of the positive skewness of river discharge distributions. In order to achieve realistic forecast scenarios over an entire range of lead-times, there is a need for multivariate extensions. To this end, we smooth the marginal parameter estimates over lead-times. In order to obtain realistic scenarios of discharge evolution over time, the marginal distributions have to be linked with each other. To this end, the multivariate dependence structure can either be adopted from the raw ensemble like in Ensemble Copula Coupling (ECC), or be estimated from observations in a training period. The censored EMOS model has been applied to multi-model ensemble forecasts issued on a daily basis over a period of three years. For the two catchments considered, this resulted in well calibrated and sharp forecast distributions over all lead-times from 1 to 114 h. Training observations tended to be better indicators for the dependence structure than the raw ensemble.
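
    The censored forecast distribution described above (a point mass at the censoring threshold plus a continuous part above it) can be sketched as follows; the threshold and EMOS location/scale parameters are illustrative.

    ```python
    # Sketch of the censored-normal predictive distribution used in censored
    # EMOS, evaluated here on the (Box-Cox-transformed) runoff scale.
    import numpy as np
    from scipy.stats import norm

    c = 0.5                # censoring threshold (transformed runoff units)
    mu, sigma = 0.9, 0.6   # EMOS location/scale from the ensemble (illustrative)

    def censored_cdf(y):
        """CDF of Y = max(c, X), X ~ N(mu, sigma^2)."""
        y = np.asarray(y, dtype=float)
        return np.where(y < c, 0.0, norm.cdf((y - mu) / sigma))

    print("P(Y = c) (point mass):", norm.cdf((c - mu) / sigma).round(3))
    print("P(Y <= 1.2)          :", censored_cdf(1.2).round(3))

    # Random forecast scenarios: draw normal, then censor from below at c.
    rng = np.random.default_rng(1)
    scenarios = np.maximum(c, rng.normal(mu, sigma, size=10))
    ```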

  2. Aspects of posttraumatic stress disorder in long-term testicular cancer survivors: cross-sectional and longitudinal findings.

    PubMed

    Dahl, Alv A; Østby-Deglum, Marie; Oldenburg, Jan; Bremnes, Roy; Dahl, Olav; Klepp, Olbjørn; Wist, Erik; Fosså, Sophie D

    2016-10-01

    The purpose of this research is to study the prevalence of posttraumatic stress disorder (PTSD) and variables associated with PTSD in Norwegian long-term testicular cancer survivors (TCSs) both cross-sectionally and longitudinally. At a mean of 11 years after diagnosis, 1418 TCSs responded to a mailed questionnaire, and at a mean of 19 years after diagnosis, 1046 of them responded again to a modified questionnaire. Posttraumatic symptoms related to testicular cancer were self-rated with the Impact of Event Scale (IES) at the 11-year study only. An IES total score ≥35 defined Full PTSD, a score of 26-34 identified Partial PTSD, and the combination of Full and Partial PTSD defined Probable PTSD. At the 11-year study, 4.5 % had Full PTSD, 6.4 % had Partial PTSD, and 10.9 % had Probable PTSD. At both studies, socio-demographic variables, somatic health, anxiety/depression, chronic fatigue, and neurotoxic adverse effects were significantly associated with Probable PTSD in bivariate analyses. Probable anxiety disorder, poor self-rated health, and neurotoxicity remained significantly associated with Probable PTSD in multivariate analyses at the 11-year study. In bivariate analyses, Probable PTSD at the 11-year study was significantly associated with socio-demographic variables, somatic health, anxiety/depression, chronic fatigue, and neurotoxicity among participants of the 19-year study, but only probable anxiety disorder remained significant in multivariable analysis. In spite of an excellent prognosis, 10.9 % of long-term testicular cancer survivors had Probable PTSD at a mean of 11 years after diagnosis. Probable PTSD was significantly associated with a broad range of problems at that time and was predictive of considerable problems at a mean of 19 years postdiagnosis. Among long-term testicular cancer survivors, 10.9 % have Probable PTSD with many associated problems, and therefore health personnel should explore stress symptoms at follow-up since efficient treatments are available.

  3. A brief introduction to probability.

    PubMed

    Di Paola, Gioacchino; Bertani, Alessandro; De Monte, Lavinia; Tuzzolino, Fabio

    2018-02-01

    The theory of probability has been debated for centuries: back in the 1600s, French mathematicians used the rules of probability to place and win bets. Subsequently, the knowledge of probability has significantly evolved and is now an essential tool for statistics. In this paper, the basic theoretical principles of probability will be reviewed, with the aim of facilitating the comprehension of statistical inference. After a brief general introduction on probability, we will review the concept of the "probability distribution", which is a function providing the probabilities of occurrence of the different possible outcomes of a categorical or continuous variable. Specific attention will be focused on the normal distribution, which is the most relevant distribution applied in statistical analysis.

  4. A system to build distributed multivariate models and manage disparate data sharing policies: implementation in the scalable national network for effectiveness research.

    PubMed

    Meeker, Daniella; Jiang, Xiaoqian; Matheny, Michael E; Farcas, Claudiu; D'Arcy, Michel; Pearlman, Laura; Nookala, Lavanya; Day, Michele E; Kim, Katherine K; Kim, Hyeoneui; Boxwala, Aziz; El-Kareh, Robert; Kuo, Grace M; Resnic, Frederic S; Kesselman, Carl; Ohno-Machado, Lucila

    2015-11-01

    Centralized and federated models for sharing data in research networks currently exist. To build multivariate data analysis for centralized networks, transfer of patient-level data to a central computation resource is necessary. The authors implemented distributed multivariate models for federated networks in which patient-level data is kept at each site and data exchange policies are managed in a study-centric manner. The objective was to implement infrastructure that supports the functionality of some existing research networks (e.g., cohort discovery, workflow management, and estimation of multivariate analytic models on centralized data) while adding additional important new features, such as algorithms for distributed iterative multivariate models, a graphical interface for multivariate model specification, synchronous and asynchronous response to network queries, investigator-initiated studies, and study-based control of staff, protocols, and data sharing policies. Based on the requirements gathered from statisticians, administrators, and investigators from multiple institutions, the authors developed infrastructure and tools to support multisite comparative effectiveness studies using web services for multivariate statistical estimation in the SCANNER federated network. The authors implemented massively parallel (map-reduce) computation methods and a new policy management system to enable each study initiated by network participants to define the ways in which data may be processed, managed, queried, and shared. The authors illustrated the use of these systems among institutions with highly different policies and operating under different state laws. Federated research networks need not limit distributed query functionality to count queries, cohort discovery, or independently estimated analytic models. Multivariate analyses can be efficiently and securely conducted without patient-level data transport, allowing institutions with strict local data storage requirements to participate in sophisticated analyses based on federated research networks. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
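
    As an illustrative sketch (not the SCANNER implementation) of how an iterative multivariate model can be estimated without moving patient-level data, the snippet below aggregates only per-site gradients for a logistic regression; all data are simulated.

    ```python
    # Distributed estimation sketch: each site computes a local gradient on
    # its own rows; only the gradients travel to the coordinator.
    import numpy as np

    def site_gradient(X, y, beta):
        """Local gradient of the logistic log-likelihood; X, y never leave the site."""
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        return X.T @ (y - p)

    rng = np.random.default_rng(0)
    beta_true = np.array([0.5, -1.0, 2.0])
    sites = []
    for _ in range(3):                                # three federated sites
        X = np.column_stack([np.ones(500), rng.normal(size=(500, 2))])
        y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true)))
        sites.append((X, y))

    beta = np.zeros(3)
    for it in range(200):                             # coordinator loop
        g = sum(site_gradient(X, y, beta) for X, y in sites)  # aggregate only
        beta += 1e-3 * g                              # simple gradient ascent
    print("estimated coefficients:", beta.round(2))
    ```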

  5. LFSPMC: Linear feature selection program using the probability of misclassification

    NASA Technical Reports Server (NTRS)

    Guseman, L. F., Jr.; Marion, B. P.

    1975-01-01

    The computational procedure and associated computer program for a linear feature selection technique are presented. The technique assumes that: a finite number, m, of classes exists; each class is described by an n-dimensional multivariate normal density function of its measurement vectors; the mean vector and covariance matrix for each density function are known (or can be estimated); and the a priori probability for each class is known. The technique produces a single linear combination of the original measurements which minimizes the one-dimensional probability of misclassification defined by the transformed densities.
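
    A rough sketch of the underlying idea for two classes, using a simplified projection threshold rather than LFSPMC's exact one-dimensional Bayes rule; all class parameters are hypothetical.

    ```python
    # Choose a single linear combination b'x of the measurements that
    # (approximately) minimizes the one-dimensional probability of
    # misclassification of the projected normal class densities.
    import numpy as np
    from scipy.stats import norm
    from scipy.optimize import minimize

    # Hypothetical class parameters (n = 3 measurements, equal priors)
    mu1, mu2 = np.array([0.0, 0.0, 0.0]), np.array([1.0, 2.0, 0.5])
    S1 = np.diag([1.0, 2.0, 1.5])
    S2 = np.array([[2.0, 0.3, 0.0], [0.3, 1.0, 0.2], [0.0, 0.2, 1.0]])

    def misclassification(b):
        b = b / np.linalg.norm(b)                  # scale-invariant projection
        m1, s1 = b @ mu1, np.sqrt(b @ S1 @ b)
        m2, s2 = b @ mu2, np.sqrt(b @ S2 @ b)
        # Equal-z-score threshold between the projected class means; the full
        # Bayes rule would instead intersect the two univariate densities.
        t = (m1 * s2 + m2 * s1) / (s1 + s2)
        return 0.5 * (1 - norm.cdf((t - m1) / s1)) + 0.5 * norm.cdf((t - m2) / s2)

    res = minimize(misclassification, x0=np.ones(3), method="Nelder-Mead")
    print("best direction:", (res.x / np.linalg.norm(res.x)).round(3))
    print(f"estimated misclassification probability: {res.fun:.4f}")
    ```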

  6. Manual hierarchical clustering of regional geochemical data using a Bayesian finite mixture model

    USGS Publications Warehouse

    Ellefsen, Karl J.; Smith, David

    2016-01-01

    Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples.
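
    A simplified sketch of the recursive two-way partitioning is given below, substituting a maximum-likelihood EM fit (scikit-learn's GaussianMixture) for the paper's Bayesian finite mixture with Hamiltonian Monte Carlo; the data are synthetic.

    ```python
    # Recursive two-way partitioning: each call fits a two-component Gaussian
    # mixture and splits the samples, building a hierarchy of clusters.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def split(X, depth, max_depth=2, path="root"):
        print(f"{path}: {len(X)} samples")
        if depth >= max_depth or len(X) < 20:
            return
        labels = GaussianMixture(n_components=2, random_state=0).fit_predict(X)
        for k in (0, 1):
            split(X[labels == k], depth + 1, max_depth, f"{path}/{k}")

    rng = np.random.default_rng(5)
    # Hypothetical log-transformed element concentrations from two "provinces"
    X = np.vstack([rng.normal([0, 0, 0], 0.5, size=(300, 3)),
                   rng.normal([2, 1, -1], 0.7, size=(200, 3))])
    split(X, depth=0)
    ```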

  7. Association between sex, systemic iron variation and probability of Parkinson's disease.

    PubMed

    Mariani, S; Ventriglia, M; Simonelli, I; Bucossi, S; Siotto, M; Donno, S; Vernieri, F; Squitti, R

    2016-01-01

    Iron homeostasis appears altered in Parkinson's disease (PD). Recent genetic studies and meta-analyses have produced heterogeneous and inconclusive results. In order to verify the possible role of iron status in PD, we screened some of the main metal gene variants, evaluated their effects on systemic iron status, and checked for possible interactions with PD. In 92 PD patients and 112 healthy controls, we screened the D544E and R793H variants of the ceruloplasmin gene (CP), the P589S variant of the transferrin gene (TF), and the H63D and C282Y variants of the HFE gene, which encode the corresponding proteins. Furthermore, we analyzed serum concentrations of iron, copper and their related proteins. The genetic investigation revealed no significant differences in allelic and genotype distributions between patients and controls. Two different multivariable forward stepwise logistic models showed that, when the effect of sex is considered, an increased probability of having PD is associated with low iron concentration and Tf-saturation. This study provides new evidence of the involvement of iron metabolism in PD pathogenesis and reveals a biological effect of sex.

  8. Digital simulation of two-dimensional random fields with arbitrary power spectra and non-Gaussian probability distribution functions.

    PubMed

    Yura, Harold T; Hanson, Steen G

    2012-04-01

    Methods for simulation of two-dimensional signals with arbitrary power spectral densities and signal amplitude probability density functions are disclosed. The method relies on initially transforming a white noise sample set of random Gaussian distributed numbers into a corresponding set with the desired spectral distribution, after which this colored Gaussian probability distribution is transformed via an inverse transform into the desired probability distribution. In most cases the method provides satisfactory results and can thus be considered an engineering approach. Several illustrative examples with relevance for optics are given.
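
    A minimal sketch of the two-step construction, assuming an illustrative isotropic power spectrum and a gamma target marginal; neither choice comes from the paper.

    ```python
    # Two-step method: (1) spectrally color white Gaussian noise via the FFT
    # so it has a prescribed power spectral density, (2) map it pointwise
    # through the Gaussian CDF and a target inverse CDF.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    N = 256
    white = rng.standard_normal((N, N))

    # Target isotropic PSD ~ 1/(k^2 + k0^2) (illustrative choice, k0 = 0.01)
    kx = np.fft.fftfreq(N)[:, None]
    ky = np.fft.fftfreq(N)[None, :]
    psd = 1.0 / (kx**2 + ky**2 + 0.01**2)

    field = np.real(np.fft.ifft2(np.fft.fft2(white) * np.sqrt(psd)))
    field = (field - field.mean()) / field.std()      # colored Gaussian field

    # Memoryless transform to a non-Gaussian amplitude PDF (here: gamma)
    u = stats.norm.cdf(field)                          # uniform marginals
    gamma_field = stats.gamma(a=2.0, scale=1.0).ppf(u) # desired marginal PDF

    print("skewness of transformed field:", stats.skew(gamma_field.ravel()).round(2))
    # Note: the pointwise mapping perturbs the spectrum slightly; the abstract
    # characterizes the overall method as an engineering approximation.
    ```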

  9. The global impact distribution of Near-Earth objects

    NASA Astrophysics Data System (ADS)

    Rumpf, Clemens; Lewis, Hugh G.; Atkinson, Peter M.

    2016-02-01

    Asteroids that could collide with the Earth are listed on the publicly available Near-Earth object (NEO) hazard web sites maintained by the National Aeronautics and Space Administration (NASA) and the European Space Agency (ESA). The impact probability distributions of 69 potentially threatening NEOs from these lists, which produce 261 dynamically distinct impact instances, or Virtual Impactors (VIs), were calculated using the Asteroid Risk Mitigation and Optimization Research (ARMOR) tool in conjunction with OrbFit. ARMOR projected the impact probability of each VI onto the surface of the Earth as a spatial probability distribution. The projection considers orbit solution accuracy and the global impact probability. The method of ARMOR is introduced and the tool is validated against two asteroid-Earth collision cases, involving objects 2008 TC3 and 2014 AA. In the analysis, the natural distribution of impact corridors is contrasted against the impact probability distribution to evaluate the distributions' conformity with the uniform impact distribution assumption. The distribution of impact corridors is based on the NEO population and orbital mechanics. The analysis shows that the distribution of impact corridors matches the common assumption of uniform impact distribution, and the result extends the evidence base for the uniform assumption from qualitative analysis of historic impact events into the future in a quantitative way. This finding is confirmed in a parallel analysis of impact points belonging to a synthetic population of 10,006 VIs. Taking the impact probabilities into account introduced significant variation into the results, and the impact probability distribution consequently deviates markedly from uniformity. The concept of impact probabilities is a product of the asteroid observation and orbit determination technique and thus represents a man-made component that is largely disconnected from natural processes. It is important to consider impact probabilities because such information represents the best estimate of where an impact might occur.

  10. Adjuvant Chemotherapy Improves the Probability of Freedom From Recurrence in Patients With Resected Stage IB Lung Adenocarcinoma.

    PubMed

    Hung, Jung-Jyh; Wu, Yu-Chung; Chou, Teh-Ying; Jeng, Wen-Juei; Yeh, Yi-Chen; Hsu, Wen-Hu

    2016-04-01

    The benefit of adjuvant chemotherapy remains controversial for patients with stage IB non-small-cell lung cancer (NSCLC). This study investigated the effect of adjuvant chemotherapy and the predictors of benefit from adjuvant chemotherapy in patients with stage IB lung adenocarcinoma. A total of 243 patients with completely resected pathologic stage IB lung adenocarcinoma were included in the study. Predictors of the benefits of improved overall survival (OS) or probability of freedom from recurrence (FFR) from platinum-based adjuvant chemotherapy in patients with resected stage IB lung adenocarcinoma were investigated. Among the 243 patients, 70 (28.8%) had received platinum-based doublet adjuvant chemotherapy. A micropapillary/solid-predominant pattern (versus an acinar/papillary-predominant pattern) was a significantly worse prognostic factor for probability of FFR (p = 0.033). Although adjuvant chemotherapy (versus surgical intervention alone) was not a significant prognostic factor for OS (p = 0.303), it was a significant prognostic factor for a better probability of FFR (p = 0.029) on multivariate analysis. In propensity-score-matched pairs, there was no significant difference in OS between patients who received adjuvant chemotherapy and those who did not (p = 0.386). Patients who received adjuvant chemotherapy had a significantly better probability of FFR than those who did not (p = 0.043). For patients with a predominantly micropapillary/solid pattern, adjuvant chemotherapy (p = 0.033) was a significant prognostic factor for a better probability of FFR on multivariate analysis. Adjuvant chemotherapy is a favorable prognostic factor for the probability of FFR in patients with stage IB lung adenocarcinoma, particularly in those with a micropapillary/solid-predominant pattern. Copyright © 2016 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.

  11. A Self-Organizing Maps approach to assess the wave climate of the Adriatic Sea

    NASA Astrophysics Data System (ADS)

    Barbariol, Francesco; Marcello Falcieri, Francesco; Scotton, Carlotta; Benetazzo, Alvise; Bergamasco, Andrea; Bergamasco, Filippo; Bonaldo, Davide; Carniel, Sandro; Sclavo, Mauro

    2015-04-01

    The assessment of wave conditions at sea is valuable for many research fields in marine and atmospheric sciences and for human activities in the marine environment. To this end, in the last decades the observational network, which mostly relies on buoys, satellites and other probes on fixed platforms, has been integrated with numerical model outputs, which make it possible to compute the parameters of sea states (e.g. the significant wave height, the mean and peak wave periods, the mean and peak wave directions) over wider regions. Apart from the collection of wave parameters observed at specific sites or modeled on arbitrary domains, the data processing performed to infer the wave climate at those sites is a crucial step in providing high quality data and information to the community. In this context, several statistical techniques have been used to model the randomness of wave parameters. While univariate and bivariate probability distribution functions (pdf) are routinely used, multivariate pdfs that model the probability structure of more than two wave parameters are hardly ever managed. Recently, the Self-Organizing Maps (SOM) technique has been successfully applied to represent the multivariate random wave climate at sites around the Iberian peninsula and the South American continent. Indeed, the visualization properties offered by this technique allow the dependencies between the different parameters to be grasped by visual inspection. In this study, carried out in the frame of the Italian National Flagship Project "RITMARE", we take advantage of the SOM technique to assess the multivariate wave climate over the Adriatic Sea, a semi-enclosed basin in the north-eastern Mediterranean Sea, where winds from the North-East (called "Bora") and South-East (called "Sirocco") mainly blow, causing sea storms. By means of the SOM technique we can observe the multivariate character of the typical Bora and Sirocco wave features in the Adriatic Sea. To this end, we used both observed and modeled wave parameters. The "Acqua Alta" oceanographic tower in the northern Adriatic Sea (ISMAR-CNR) and the Italian Data Buoy Network (RON, managed by ISPRA) off the western Adriatic coasts furnished the wave parameters at specific sites of interest. Widespread wave parameters were obtained by means of a numerical SWAN wave model implemented over the whole Adriatic Sea at 6×6 km² resolution and forced by the high resolution COSMO-I7 atmospheric model for the period 2007-2013.
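
    A toy SOM implementation for three wave parameters is sketched below, trained on synthetic sea states rather than the RITMARE data or processing chain.

    ```python
    # Minimal self-organizing map for (Hs, Tm, direction) sea-state vectors;
    # each map unit ends up summarizing a typical multivariate sea state.
    import numpy as np

    rng = np.random.default_rng(2)
    # Hypothetical sea states: significant wave height (m), mean period (s),
    # mean direction (deg), z-scored before training.
    data = np.column_stack([rng.gamma(2.0, 0.8, 2000),
                            rng.normal(6.0, 1.5, 2000),
                            rng.uniform(0, 360, 2000)])
    X = (data - data.mean(0)) / data.std(0)

    rows, cols, dim = 6, 6, X.shape[1]
    grid = np.array([(i, j) for i in range(rows) for j in range(cols)], float)
    W = rng.normal(size=(rows * cols, dim))          # codebook vectors

    for t in range(20_000):                          # online training
        x = X[rng.integers(len(X))]
        bmu = np.argmin(((W - x) ** 2).sum(1))       # best-matching unit
        frac = t / 20_000
        lr, sigma = 0.5 * (1 - frac), 3.0 * (1 - frac) + 0.5
        h = np.exp(-((grid - grid[bmu]) ** 2).sum(1) / (2 * sigma**2))
        W += lr * h[:, None] * (x - W)               # pull neighborhood toward x

    print("unit 0 (de-standardized):", (W[0] * data.std(0) + data.mean(0)).round(2))
    ```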

  12. Nonadditive entropies yield probability distributions with biases not warranted by the data.

    PubMed

    Pressé, Steve; Ghosh, Kingshuk; Lee, Julian; Dill, Ken A

    2013-11-01

    Different quantities that go by the name of entropy are used in variational principles to infer probability distributions from limited data. Shore and Johnson showed that maximizing the Boltzmann-Gibbs form of the entropy ensures that probability distributions inferred satisfy the multiplication rule of probability for independent events in the absence of data coupling such events. Other types of entropies that violate the Shore and Johnson axioms, including nonadditive entropies such as the Tsallis entropy, violate this basic consistency requirement. Here we use the axiomatic framework of Shore and Johnson to show how such nonadditive entropy functions generate biases in probability distributions that are not warranted by the underlying data.
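
    The multiplication-rule property can be seen in the simplest maximum-entropy setting; the sketch below maximizes the Boltzmann-Gibbs entropy for a die subject to a mean constraint, yielding the Gibbs family.

    ```python
    # Maximum Boltzmann-Gibbs entropy with a mean constraint gives the
    # exponential (Gibbs) family; solve for the Lagrange multiplier.
    import numpy as np
    from scipy.optimize import brentq

    states = np.arange(1, 7)        # a die with faces 1..6
    target_mean = 4.5               # the only datum

    def mean_given_lam(lam):
        w = np.exp(-lam * states)
        return (states * w).sum() / w.sum() - target_mean

    lam = brentq(mean_given_lam, -5, 5)          # Lagrange multiplier
    p = np.exp(-lam * states); p /= p.sum()
    print("maxent pmf:", p.round(4), "mean:", (states * p).sum().round(3))

    # For two independent dice constrained separately, the Boltzmann-Gibbs
    # solution factorizes as p_i * p_j, the multiplication rule that
    # nonadditive entropies generally fail to preserve.
    ```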

  13. Autumn larval fish assemblages in the northwest African Atlantic coastal zone

    NASA Astrophysics Data System (ADS)

    Abdelouahab, Hinde; Berraho, Amina; Baibai, Tarik; Agouzouk, Aziz; Makaoui, Ahmed; Errhif, Ahmed

    2017-05-01

    A study on the assemblage composition and vertical distribution of larval fish was conducted in the southern area of the Moroccan Atlantic coast in Autumn 2011. A total of 1,680 fish larvae taxa from 21 families were identified. The majority of the larvae were present in the upper layers. Clupeids were the most abundant larval taxa, followed by Myctophidae, Gadidae and Sparidae; hence the larval fish assemblages (LFA) were variable in different depth layers. Total fish larvae showed a preference for surface layers, and were mainly found above 75 m depth, with some exceptions. The maximum concentration of fish larvae was found up to 25 m depth, essentially above the thermocline, where chlorophyll a and mesozooplankton were abundant. Spatially, neritic families were located near the coast and at some offshore stations, especially in the northern part, while oceanic families were distributed more towards offshore along the study area. Cluster analysis showed a segregation of two groups of larvae. However, a clear separation between neritic families and oceanic families was not found. Multivariate analysis highlighted the relationship between the distribution of larvae of different families and environmental parameters. Temperature and salinity seem to have been the factors that acted on associations of fish larvae. Day/night vertical distributions suggest there was not a very significant vertical migration, probably due to adequate light levels for feeding.

  14. PRODIGEN: visualizing the probability landscape of stochastic gene regulatory networks in state and time space.

    PubMed

    Ma, Chihua; Luciani, Timothy; Terebus, Anna; Liang, Jie; Marai, G Elisabeta

    2017-02-15

    Visualizing the complex probability landscape of stochastic gene regulatory networks can further biologists' understanding of phenotypic behavior associated with specific genes. We present PRODIGEN (PRObability DIstribution of GEne Networks), a web-based visual analysis tool for the systematic exploration of probability distributions over simulation time and state space in such networks. PRODIGEN was designed in collaboration with bioinformaticians who research stochastic gene networks. The analysis tool combines in a novel way existing, expanded, and new visual encodings to capture the time-varying characteristics of probability distributions: spaghetti plots over one dimensional projection, heatmaps of distributions over 2D projections, enhanced with overlaid time curves to display temporal changes, and novel individual glyphs of state information corresponding to particular peaks. We demonstrate the effectiveness of the tool through two case studies on the computed probabilistic landscape of a gene regulatory network and of a toggle-switch network. Domain expert feedback indicates that our visual approach can help biologists: 1) visualize probabilities of stable states, 2) explore the temporal probability distributions, and 3) discover small peaks in the probability landscape that have potential relation to specific diseases.

  15. Comparison of the different probability distributions for earthquake hazard assessment in the North Anatolian Fault Zone

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yilmaz, Şeyda, E-mail: seydayilmaz@ktu.edu.tr; Bayrak, Erdem, E-mail: erdmbyrk@gmail.com; Bayrak, Yusuf, E-mail: bayrak@ktu.edu.tr

    In this study we examined and compared three different probability distribution methods to determine the most suitable model for the probabilistic assessment of earthquake hazards. We analyzed a reliable homogeneous earthquake catalogue for the period 1900-2015 and magnitude M ≥ 6.0, and estimated the probabilistic seismic hazard in the North Anatolian Fault zone (39°-41° N, 30°-40° E) using three distributions, namely the Weibull distribution, the Frechet distribution and the three-parameter Weibull distribution. The suitability of the distribution parameters was evaluated with the Kolmogorov-Smirnov (K-S) goodness-of-fit test. We also compared the estimated cumulative probability and the conditional probabilities of occurrence of earthquakes for different elapsed times using these three distribution methods. We used Easyfit and Matlab software to calculate the distribution parameters and plotted the conditional probability curves. We concluded that the Weibull distribution method was more suitable than the other distribution methods in this region.
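
    The model-comparison step can be sketched with scipy's weibull_min and invweibull (Frechet) distributions and the K-S test; the magnitude sample below is synthetic, not the catalogue.

    ```python
    # Fit candidate distributions to a magnitude sample and screen the fits
    # with the Kolmogorov-Smirnov goodness-of-fit statistic.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)
    mags = 6.0 + rng.weibull(1.4, size=120) * 0.8     # synthetic M >= 6.0 sample

    for name, dist in [("Weibull (2-par)", stats.weibull_min),
                       ("Weibull (3-par)", stats.weibull_min),
                       ("Frechet", stats.invweibull)]:
        if name == "Weibull (2-par)":
            params = dist.fit(mags, floc=6.0)          # location fixed
        else:
            params = dist.fit(mags)                    # location also estimated
        ks = stats.kstest(mags, dist.cdf, args=params)
        print(f"{name:16s} K-S D = {ks.statistic:.3f}, p = {ks.pvalue:.3f}")
    ```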

  16. Incorporating Skew into RMS Surface Roughness Probability Distribution

    NASA Technical Reports Server (NTRS)

    Stahl, Mark T.; Stahl, H. Philip.

    2013-01-01

    The standard treatment of RMS surface roughness data is the application of a Gaussian probability distribution. This handling of surface roughness ignores the skew present in the surface and overestimates the most probable RMS of the surface, the mode. Using experimental data we confirm the Gaussian distribution overestimates the mode and application of an asymmetric distribution provides a better fit. Implementing the proposed asymmetric distribution into the optical manufacturing process would reduce the polishing time required to meet surface roughness specifications.
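
    The sketch below illustrates the point on synthetic right-skewed roughness data, assuming a lognormal stand-in for the asymmetric distribution (the abstract does not commit to a specific family).

    ```python
    # With right-skewed roughness data, a Gaussian fit overestimates the mode
    # relative to an asymmetric (here lognormal) fit. Data are synthetic.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    rms = rng.lognormal(mean=np.log(2.0), sigma=0.4, size=500)  # nm RMS, skewed

    mu_g, sd_g = stats.norm.fit(rms)
    shape, loc, scale = stats.lognorm.fit(rms, floc=0)

    mode_gauss = mu_g                                  # Gaussian mode = mean
    mode_logn = loc + scale * np.exp(-shape**2)        # lognormal mode
    print(f"Gaussian mode estimate : {mode_gauss:.2f} nm")
    print(f"Lognormal mode estimate: {mode_logn:.2f} nm (lower, matching skew)")
    ```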

  17. Modeling of turbulent supersonic H2-air combustion with a multivariate beta PDF

    NASA Technical Reports Server (NTRS)

    Baurle, R. A.; Hassan, H. A.

    1993-01-01

    Recent calculations of turbulent supersonic reacting shear flows using an assumed multivariate beta PDF (probability density function) resulted in reduced production rates and a delay in the onset of combustion. This result is not consistent with available measurements. The present research explores two possible reasons for this behavior: use of PDF's that do not yield Favre averaged quantities, and the gradient diffusion assumption. A new multivariate beta PDF involving species densities is introduced which makes it possible to compute Favre averaged mass fractions. However, using this PDF did not improve comparisons with experiment. A countergradient diffusion model is then introduced. Preliminary calculations suggest this to be the cause of the discrepancy.

  18. Robustness of Multiple Objective Decision Analysis Preference Functions

    DTIC Science & Technology

    2002-06-01

    p, p′: the probability of some event. p_i, q_i: the probability of event i. Π: an aggregation of proportional data used in calculating a test ... statistical tests of the significance of the term, and is conducted in a multivariate framework rather than the ROSA univariate approach. ... The residual error is e = y − ŷ (45). The coefficient provides a ready indicator of the contribution for the associated variable, and statistical tests ...

  19. The Estimation of Tree Posterior Probabilities Using Conditional Clade Probability Distributions

    PubMed Central

    Larget, Bret

    2013-01-01

    In this article I introduce the idea of conditional independence of separated subtrees as a principle by which to estimate the posterior probability of trees using conditional clade probability distributions rather than simple sample relative frequencies. I describe an algorithm for these calculations and software which implements these ideas. I show that these alternative calculations are very similar to simple sample relative frequencies for high probability trees but are substantially more accurate for relatively low probability trees. The method allows the posterior probability of unsampled trees to be calculated when these trees contain only clades that are in other sampled trees. Furthermore, the method can be used to estimate the total probability of the set of sampled trees which provides a measure of the thoroughness of a posterior sample. [Bayesian phylogenetics; conditional clade distributions; improved accuracy; posterior probabilities of trees.] PMID:23479066

  20. Vertical changes in the probability distribution of downward irradiance within the near-surface ocean under sunny conditions

    NASA Astrophysics Data System (ADS)

    Gernez, Pierre; Stramski, Dariusz; Darecki, Miroslaw

    2011-07-01

Time series measurements of fluctuations in underwater downward irradiance, Ed, within the green spectral band (532 nm) show that the probability distribution of instantaneous irradiance varies greatly as a function of depth within the near-surface ocean under sunny conditions. Because of intense light flashes caused by surface wave focusing, the near-surface probability distributions are highly skewed to the right and are heavy tailed. The coefficients of skewness and excess kurtosis at depths smaller than 1 m can exceed 3 and 20, respectively. We tested several probability models, such as lognormal, Gumbel, Fréchet, log-logistic, and Pareto, which are potentially suited to describe the highly skewed heavy-tailed distributions. We found that the models cannot approximate with consistently good accuracy the high irradiance values within the right tail of the experimental distribution where the probability of these values is less than 10%. This portion of the distribution corresponds approximately to light flashes with Ed > 1.5Ēd, where Ēd is the time-averaged downward irradiance. However, the remaining part of the probability distribution covering all irradiance values smaller than the 90th percentile can be described with a reasonable accuracy (i.e., within 20%) with a lognormal model for all 86 measurements from the top 10 m of the ocean included in this analysis. As the intensity of irradiance fluctuations decreases with depth, the probability distribution tends toward a function symmetrical around the mean like the normal distribution. For the examined data set, the skewness and excess kurtosis assumed values very close to zero at a depth of about 10 m.
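
    A small illustration of the kind of fit reported above, using synthetic lognormal draws in place of the authors' irradiance records; only the bulk-fit check below the 90th percentile is sketched, and the depth dependence is not modelled.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)
      e_d = rng.lognormal(mean=0.0, sigma=0.8, size=5000)  # stand-in for Ed normalised by its mean

      print("skewness:", stats.skew(e_d))
      print("excess kurtosis:", stats.kurtosis(e_d))       # Fisher convention: 0 for a normal

      # Fit a lognormal and compare quantiles below the 90th percentile, the
      # region where the paper found the model accurate to within about 20%
      shape, loc, scale = stats.lognorm.fit(e_d, floc=0.0)
      q = np.linspace(0.05, 0.90, 18)
      emp = np.quantile(e_d, q)
      mod = stats.lognorm.ppf(q, shape, loc=loc, scale=scale)
      print("max relative error below 90th percentile:", np.max(np.abs(mod - emp) / emp))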

  1. Bayesian modeling and inference for diagnostic accuracy and probability of disease based on multiple diagnostic biomarkers with and without a perfect reference standard.

    PubMed

    Jafarzadeh, S Reza; Johnson, Wesley O; Gardner, Ian A

    2016-03-15

    The area under the receiver operating characteristic (ROC) curve (AUC) is used as a performance metric for quantitative tests. Although multiple biomarkers may be available for diagnostic or screening purposes, diagnostic accuracy is often assessed individually rather than in combination. In this paper, we consider the interesting problem of combining multiple biomarkers for use in a single diagnostic criterion with the goal of improving the diagnostic accuracy above that of an individual biomarker. The diagnostic criterion created from multiple biomarkers is based on the predictive probability of disease, conditional on given multiple biomarker outcomes. If the computed predictive probability exceeds a specified cutoff, the corresponding subject is allocated as 'diseased'. This defines a standard diagnostic criterion that has its own ROC curve, namely, the combined ROC (cROC). The AUC metric for cROC, namely, the combined AUC (cAUC), is used to compare the predictive criterion based on multiple biomarkers to one based on fewer biomarkers. A multivariate random-effects model is proposed for modeling multiple normally distributed dependent scores. Bayesian methods for estimating ROC curves and corresponding (marginal) AUCs are developed when a perfect reference standard is not available. In addition, cAUCs are computed to compare the accuracy of different combinations of biomarkers for diagnosis. The methods are evaluated using simulations and are applied to data for Johne's disease (paratuberculosis) in cattle. Copyright © 2015 John Wiley & Sons, Ltd.

  2. Extreme climatic events change the dynamics and invasibility of semi-arid annual plant communities.

    PubMed

    Jiménez, Milagros A; Jaksic, Fabian M; Armesto, Juan J; Gaxiola, Aurora; Meserve, Peter L; Kelt, Douglas A; Gutiérrez, Julio R

    2011-12-01

    Extreme climatic events represent disturbances that change the availability of resources. We studied their effects on annual plant assemblages in a semi-arid ecosystem in north-central Chile. We analysed 130 years of precipitation data using generalised extreme-value distribution to determine extreme events, and multivariate techniques to analyse 20 years of plant cover data of 34 native and 11 exotic species. Extreme drought resets the dynamics of the system and renders it susceptible to invasion. On the other hand, by favouring native annuals, moderately wet events change species composition and allow the community to be resilient to extreme drought. The probability of extreme drought has doubled over the last 50 years. Therefore, investigations on the interaction of climate change and biological invasions are relevant to determine the potential for future effects on the dynamics of semi-arid annual plant communities. 2011 Blackwell Publishing Ltd/CNRS.

  3. Predicting the probability of slip in gait: methodology and distribution study.

    PubMed

    Gragg, Jared; Yang, James

    2016-01-01

    The likelihood of a slip is related to the available and required friction for a certain activity, here gait. Classical slip and fall analysis presumed that a walking surface was safe if the difference between the mean available and required friction coefficients exceeded a certain threshold. Previous research was dedicated to reformulating the classical slip and fall theory to include the stochastic variation of the available and required friction when predicting the probability of slip in gait. However, when predicting the probability of a slip, previous researchers have either ignored the variation in the required friction or assumed the available and required friction to be normally distributed. Also, there are no published results that actually give the probability of slip for various combinations of required and available frictions. This study proposes a modification to the equation for predicting the probability of slip, reducing the previous equation from a double-integral to a more convenient single-integral form. Also, a simple numerical integration technique is provided to predict the probability of slip in gait: the trapezoidal method. The effect of the random variable distributions on the probability of slip is also studied. It is shown that both the required and available friction distributions cannot automatically be assumed as being normally distributed. The proposed methods allow for any combination of distributions for the available and required friction, and numerical results are compared to analytical solutions for an error analysis. The trapezoidal method is shown to be highly accurate and efficient. The probability of slip is also shown to be sensitive to the input distributions of the required and available friction. Lastly, a critical value for the probability of slip is proposed based on the number of steps taken by an average person in a single day.
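
    A sketch of the proposed single-integral form, P(slip) = ∫ f_req(u) F_avail(u) du, evaluated with the trapezoidal method as in the paper; the two friction distributions below are illustrative assumptions, not the paper's fitted ones, chosen deliberately non-normal for the required friction.

      import numpy as np
      from scipy import stats
      from scipy.integrate import trapezoid

      f_req = stats.lognorm(s=0.25, scale=0.20)   # required friction (need not be normal)
      f_avail = stats.norm(loc=0.45, scale=0.10)  # available friction

      # P(slip) = P(available < required) = integral of f_req(u) * F_avail(u) du
      u = np.linspace(0.0, 1.2, 2001)
      p_slip = trapezoid(f_req.pdf(u) * f_avail.cdf(u), u)
      print("P(slip) by trapezoidal rule:", p_slip)

      # Monte Carlo cross-check of the same probability
      rng = np.random.default_rng(1)
      req = f_req.rvs(size=200_000, random_state=rng)
      avail = f_avail.rvs(size=200_000, random_state=rng)
      print("P(slip) by simulation:", np.mean(avail < req))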

  4. Multivariate methods to visualise colour-space and colour discrimination data.

    PubMed

    Hastings, Gareth D; Rubin, Alan

    2015-01-01

    Despite most modern colour spaces treating colour as three-dimensional (3-D), colour data is usually not visualised in 3-D (and two-dimensional (2-D) projection-plane segments and multiple 2-D perspective views are used instead). The objectives of this article are firstly, to introduce a truly 3-D percept of colour space using stereo-pairs, secondly to view colour discrimination data using that platform, and thirdly to apply formal statistics and multivariate methods to analyse the data in 3-D. This is the first demonstration of the software that generated stereo-pairs of RGB colour space, as well as of a new computerised procedure that investigated colour discrimination by measuring colour just noticeable differences (JND). An initial pilot study and thorough investigation of instrument repeatability were performed. Thereafter, to demonstrate the capabilities of the software, five colour-normal and one colour-deficient subject were examined using the JND procedure and multivariate methods of data analysis. Scatter plots of responses were meaningfully examined in 3-D and were useful in evaluating multivariate normality as well as identifying outliers. The extent and direction of the difference between each JND response and the stimulus colour point was calculated and appreciated in 3-D. Ellipsoidal surfaces of constant probability density (distribution ellipsoids) were fitted to response data; the volumes of these ellipsoids appeared useful in differentiating the colour-deficient subject from the colour-normals. Hypothesis tests of variances and covariances showed many statistically significant differences between the results of the colour-deficient subject and those of the colour-normals, while far fewer differences were found when comparing within colour-normals. The 3-D visualisation of colour data using stereo-pairs, as well as the statistics and multivariate methods of analysis employed, were found to be unique and useful tools in the representation and study of colour. Many additional studies using these methods along with the JND and other procedures have been identified and will be reported in future publications. © 2014 The Authors Ophthalmic & Physiological Optics © 2014 The College of Optometrists.

  5. Integrated-Circuit Pseudorandom-Number Generator

    NASA Technical Reports Server (NTRS)

    Steelman, James E.; Beasley, Jeff; Aragon, Michael; Ramirez, Francisco; Summers, Kenneth L.; Knoebel, Arthur

    1992-01-01

    Integrated circuit produces 8-bit pseudorandom numbers from specified probability distribution, at rate of 10 MHz. Use of Boolean logic, circuit implements pseudorandom-number-generating algorithm. Circuit includes eight 12-bit pseudorandom-number generators, outputs are uniformly distributed. 8-bit pseudorandom numbers satisfying specified nonuniform probability distribution are generated by processing uniformly distributed outputs of eight 12-bit pseudorandom-number generators through "pipeline" of D flip-flops, comparators, and memories implementing conditional probabilities on zeros and ones.
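
    A software sketch of one plausible reading of the pipeline, offered as an interpretation rather than the circuit's actual design: each of the 8 output bits is drawn with its conditional probability given the bits already fixed, so uniform random bits are shaped into samples from an arbitrary target distribution.

      import numpy as np

      rng = np.random.default_rng(42)
      target = rng.random(256)
      target /= target.sum()            # any specified pmf on the 8-bit values 0..255

      def draw_8bit(pmf, rng):
          lo, hi, value = 0, 256, 0
          for _ in range(8):            # one stage per bit, most significant first
              mid = (lo + hi) // 2
              p_one = pmf[mid:hi].sum() / pmf[lo:hi].sum()  # P(bit = 1 | bits so far)
              bit = int(rng.random() < p_one)
              value = (value << 1) | bit
              lo, hi = (mid, hi) if bit else (lo, mid)
          return value

      samples = np.array([draw_8bit(target, rng) for _ in range(50_000)])
      freq = np.bincount(samples, minlength=256) / samples.size
      print("max |frequency - target|:", np.abs(freq - target).max())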

  6. DTI segmentation by statistical surface evolution.

    PubMed

    Lenglet, Christophe; Rousson, Mikaël; Deriche, Rachid

    2006-06-01

    We address the problem of the segmentation of cerebral white matter structures from diffusion tensor images (DTI). A DTI produces, from a set of diffusion-weighted MR images, tensor-valued images where each voxel is assigned with a 3 x 3 symmetric, positive-definite matrix. This second order tensor is simply the covariance matrix of a local Gaussian process, with zero-mean, modeling the average motion of water molecules. As we will show in this paper, the definition of a dissimilarity measure and statistics between such quantities is a nontrivial task which must be tackled carefully. We claim and demonstrate that, by using the theoretically well-founded differential geometrical properties of the manifold of multivariate normal distributions, it is possible to improve the quality of the segmentation results obtained with other dissimilarity measures such as the Euclidean distance or the Kullback-Leibler divergence. The main goal of this paper is to prove that the choice of the probability metric, i.e., the dissimilarity measure, has a deep impact on the tensor statistics and, hence, on the achieved results. We introduce a variational formulation, in the level-set framework, to estimate the optimal segmentation of a DTI according to the following hypothesis: Diffusion tensors exhibit a Gaussian distribution in the different partitions. We must also respect the geometric constraints imposed by the interfaces existing among the cerebral structures and detected by the gradient of the DTI. We show how to express all the statistical quantities for the different probability metrics. We validate and compare the results obtained on various synthetic data-sets, a biological rat spinal cord phantom and human brain DTIs.

  7. Probabilities of Dilating Vesicoureteral Reflux in Children with First Time Simple Febrile Urinary Tract Infection, and Normal Renal and Bladder Ultrasound.

    PubMed

    Rianthavorn, Pornpimol; Tangngamsakul, Onjira

    2016-11-01

We evaluated risk factors and assessed predicted probabilities for grade III or higher vesicoureteral reflux (dilating reflux) in children with a first simple febrile urinary tract infection and normal renal and bladder ultrasound. Data for 167 children 2 to 72 months old with a first febrile urinary tract infection and normal ultrasound were compared between those who had dilating vesicoureteral reflux (12 patients, 7.2%) and those who did not. Exclusion criteria consisted of history of prenatal hydronephrosis or familial reflux and complicated urinary tract infection. The logistic regression model was used to identify independent variables associated with dilating reflux. Predicted probabilities for dilating reflux were assessed. Patient age and prevalence of non-Escherichia coli bacteria were greater in children who had dilating reflux compared to those who did not (p = 0.02 and p = 0.004, respectively). Gender distribution was similar between the 2 groups (p = 0.08). In multivariate analysis older age and non-E. coli bacteria independently predicted dilating reflux, with odds ratios of 1.04 (95% CI 1.01-1.07, p = 0.02) and 3.76 (95% CI 1.05-13.39, p = 0.04), respectively. The impact of non-E. coli bacteria on predicted probabilities of dilating reflux increased with patient age. We support the concept of selective voiding cystourethrogram in children with a first simple febrile urinary tract infection and normal ultrasound. Voiding cystourethrogram should be considered in children with late onset urinary tract infection due to non-E. coli bacteria since they are at risk for dilating reflux even if the ultrasound is normal. Copyright © 2016 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.

  8. Accounting for Selection Bias in Studies of Acute Cardiac Events.

    PubMed

    Banack, Hailey R; Harper, Sam; Kaufman, Jay S

    2018-06-01

    In cardiovascular research, pre-hospital mortality represents an important potential source of selection bias. Inverse probability of censoring weights are a method to account for this source of bias. The objective of this article is to examine and correct for the influence of selection bias due to pre-hospital mortality on the relationship between cardiovascular risk factors and all-cause mortality after an acute cardiac event. The relationship between the number of cardiovascular disease (CVD) risk factors (0-5; smoking status, diabetes, hypertension, dyslipidemia, and obesity) and all-cause mortality was examined using data from the Atherosclerosis Risk in Communities (ARIC) study. To illustrate the magnitude of selection bias, estimates from an unweighted generalized linear model with a log link and binomial distribution were compared with estimates from an inverse probability of censoring weighted model. In unweighted multivariable analyses the estimated risk ratio for mortality ranged from 1.09 (95% confidence interval [CI], 0.98-1.21) for 1 CVD risk factor to 1.95 (95% CI, 1.41-2.68) for 5 CVD risk factors. In the inverse probability of censoring weights weighted analyses, the risk ratios ranged from 1.14 (95% CI, 0.94-1.39) to 4.23 (95% CI, 2.69-6.66). Estimates from the inverse probability of censoring weighted model were substantially greater than unweighted, adjusted estimates across all risk factor categories. This shows the magnitude of selection bias due to pre-hospital mortality and effect on estimates of the effect of CVD risk factors on mortality. Moreover, the results highlight the utility of using this method to address a common form of bias in cardiovascular research. Copyright © 2018 Canadian Cardiovascular Society. Published by Elsevier Inc. All rights reserved.

  9. Probabilistic, meso-scale flood loss modelling

    NASA Astrophysics Data System (ADS)

    Kreibich, Heidi; Botto, Anna; Schröter, Kai; Merz, Bruno

    2016-04-01

    Flood risk analyses are an important basis for decisions on flood risk management and adaptation. However, such analyses are associated with significant uncertainty, even more if changes in risk due to global change are expected. Although uncertainty analysis and probabilistic approaches have received increased attention during the last years, they are still not standard practice for flood risk assessments and even more for flood loss modelling. State of the art in flood loss modelling is still the use of simple, deterministic approaches like stage-damage functions. Novel probabilistic, multi-variate flood loss models have been developed and validated on the micro-scale using a data-mining approach, namely bagging decision trees (Merz et al. 2013). In this presentation we demonstrate and evaluate the upscaling of the approach to the meso-scale, namely on the basis of land-use units. The model is applied in 19 municipalities which were affected during the 2002 flood by the River Mulde in Saxony, Germany (Botto et al. submitted). The application of bagging decision tree based loss models provide a probability distribution of estimated loss per municipality. Validation is undertaken on the one hand via a comparison with eight deterministic loss models including stage-damage functions as well as multi-variate models. On the other hand the results are compared with official loss data provided by the Saxon Relief Bank (SAB). The results show, that uncertainties of loss estimation remain high. Thus, the significant advantage of this probabilistic flood loss estimation approach is that it inherently provides quantitative information about the uncertainty of the prediction. References: Merz, B.; Kreibich, H.; Lall, U. (2013): Multi-variate flood damage assessment: a tree-based data-mining approach. NHESS, 13(1), 53-64. Botto A, Kreibich H, Merz B, Schröter K (submitted) Probabilistic, multi-variable flood loss modelling on the meso-scale with BT-FLEMO. Risk Analysis.

  10. Bivariate normal, conditional and rectangular probabilities: A computer program with applications

    NASA Technical Reports Server (NTRS)

    Swaroop, R.; Brownlow, J. D.; Ashwworth, G. R.; Winter, W. R.

    1980-01-01

    Some results for the bivariate normal distribution analysis are presented. Computer programs for conditional normal probabilities, marginal probabilities, as well as joint probabilities for rectangular regions are given: routines for computing fractile points and distribution functions are also presented. Some examples from a closed circuit television experiment are included.
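
    A minimal sketch of the three kinds of quantities such a program computes (joint rectangular, marginal, and conditional normal probabilities), here via SciPy rather than the original routines; all parameter values are illustrative.

      import numpy as np
      from scipy import stats

      mu = np.array([0.0, 1.0]); sigma = np.array([1.0, 2.0]); rho = 0.6
      cov = np.array([[sigma[0]**2, rho*sigma[0]*sigma[1]],
                      [rho*sigma[0]*sigma[1], sigma[1]**2]])
      bvn = stats.multivariate_normal(mean=mu, cov=cov)

      # Rectangle probability P(a1 < X < b1, a2 < Y < b2) by inclusion-exclusion on the CDF
      a, b = np.array([-1.0, -1.0]), np.array([1.0, 2.0])
      p_rect = (bvn.cdf([b[0], b[1]]) - bvn.cdf([a[0], b[1]])
                - bvn.cdf([b[0], a[1]]) + bvn.cdf([a[0], a[1]]))

      # Marginal: X ~ N(mu1, sigma1^2); conditional: Y | X = x is normal with
      # mean mu2 + rho*(sigma2/sigma1)*(x - mu1) and variance sigma2^2*(1 - rho^2)
      p_marg = stats.norm(mu[0], sigma[0]).cdf(b[0]) - stats.norm(mu[0], sigma[0]).cdf(a[0])
      x = 0.5
      cond = stats.norm(mu[1] + rho*sigma[1]/sigma[0]*(x - mu[0]), sigma[1]*np.sqrt(1 - rho**2))
      print(p_rect, p_marg, cond.cdf(2.0) - cond.cdf(-1.0))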

  11. Assessment of source probabilities for potential tsunamis affecting the U.S. Atlantic coast

    USGS Publications Warehouse

    Geist, E.L.; Parsons, T.

    2009-01-01

    Estimating the likelihood of tsunamis occurring along the U.S. Atlantic coast critically depends on knowledge of tsunami source probability. We review available information on both earthquake and landslide probabilities from potential sources that could generate local and transoceanic tsunamis. Estimating source probability includes defining both size and recurrence distributions for earthquakes and landslides. For the former distribution, source sizes are often distributed according to a truncated or tapered power-law relationship. For the latter distribution, sources are often assumed to occur in time according to a Poisson process, simplifying the way tsunami probabilities from individual sources can be aggregated. For the U.S. Atlantic coast, earthquake tsunami sources primarily occur at transoceanic distances along plate boundary faults. Probabilities for these sources are constrained from previous statistical studies of global seismicity for similar plate boundary types. In contrast, there is presently little information constraining landslide probabilities that may generate local tsunamis. Though there is significant uncertainty in tsunami source probabilities for the Atlantic, results from this study yield a comparative analysis of tsunami source recurrence rates that can form the basis for future probabilistic analyses.

  12. Multinomial mixture model with heterogeneous classification probabilities

    USGS Publications Warehouse

    Holland, M.D.; Gray, B.R.

    2011-01-01

Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct classification probability estimates when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. © 2010 Springer Science+Business Media, LLC.

  13. Ubiquity of Benford's law and emergence of the reciprocal distribution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Friar, James Lewis; Goldman, Terrance J.; Pérez-Mercader, J.

In this paper, we apply the Law of Total Probability to the construction of scale-invariant probability distribution functions (pdf's), and require that probability measures be dimensionless and unitless under a continuous change of scales. If the scale-change distribution function is scale invariant then the constructed distribution will also be scale invariant. Repeated application of this construction on an arbitrary set of (normalizable) pdf's results again in scale-invariant distributions. The invariant function of this procedure is given uniquely by the reciprocal distribution, suggesting a kind of universality. Finally, we separately demonstrate that the reciprocal distribution results uniquely from requiring maximum entropy for size-class distributions with uniform bin sizes.

  14. Effect of Climate Change on Mediterranean Winter Ranges of Two Migratory Passerines.

    PubMed

    Tellería, José L; Fernández-López, Javier; Fandos, Guillermo

    2016-01-01

    We studied the effect of climate change on the distribution of two insectivorous passerines (the meadow pipit Anthus pratensis and the chiffchaff Phylloscopus collybita) in wintering grounds of the Western Mediterranean basin. In this region, precipitation and temperature can affect the distribution of these birds through direct (thermoregulation costs) or indirect effects (primary productivity). Thus, it can be postulated that projected climate changes in the region will affect the extent and suitability of their wintering grounds. We studied pipit and chiffchaff abundance in several hundred localities along a belt crossing Spain and Morocco and assessed the effects of climate and other geographical and habitat predictors on bird distribution. Multivariate analyses reported a positive effect of temperature on the present distribution of the two species, with an additional effect of precipitation on the meadow pipit. These climate variables were used with Maxent to model the occurrence probabilities of species using ring recoveries as presence data. Abundance and occupancy of the two species in the study localities adjusted to the distribution models, with more birds in sectors of high climate suitability. After validation, these models were used to forecast the distribution of climate suitability according to climate projections for 2050-2070 (temperature increase and precipitation reduction). Results show an expansion of climatically suitable sectors into the highlands by the effect of warming on the two species, and a retreat of the meadow pipit from southern sectors related to rain reduction. The predicted patterns show a mean increase in climate suitability for the two species due to the warming of the large highland expanses typical of the western Mediterranean.

  15. Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

    2003-01-01

    Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.
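
    A hedged sketch of the modelling workflow on synthetic stand-in data; the predictor names and coefficients below are illustrative assumptions, not fields or estimates from the USGS dataset.

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(3)
      n = 399                        # same basin count as the study, data synthetic
      X = np.column_stack([
          rng.uniform(0, 100, n),    # % of basin with slope > 30%
          rng.uniform(0, 100, n),    # % burned at medium/high severity
          rng.uniform(1, 40, n),     # average storm intensity (mm/h)
      ])
      logit = -6 + 0.03*X[:, 0] + 0.04*X[:, 1] + 0.08*X[:, 2]
      y = rng.random(n) < 1 / (1 + np.exp(-logit))   # debris flow did / did not occur

      model = LogisticRegression().fit(X, y)
      # Probability-map step: evaluate the fitted model on new basin attributes
      print(model.predict_proba([[60.0, 70.0, 25.0]])[0, 1])  # P(debris flow) for one basin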

  16. Using Logistic Regression to Predict the Probability of Debris Flows in Areas Burned by Wildfires, Southern California, 2003-2006

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.

    2008-01-01

    Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of southern California. This study demonstrates that logistic regression is a valuable tool for developing models that predict the probability of debris flows occurring in recently burned landscapes.

  17. Multivariate Bayesian analysis of Gaussian, right censored Gaussian, ordered categorical and binary traits using Gibbs sampling

    PubMed Central

    Korsgaard, Inge Riis; Lund, Mogens Sandø; Sorensen, Daniel; Gianola, Daniel; Madsen, Per; Jensen, Just

    2003-01-01

    A fully Bayesian analysis using Gibbs sampling and data augmentation in a multivariate model of Gaussian, right censored, and grouped Gaussian traits is described. The grouped Gaussian traits are either ordered categorical traits (with more than two categories) or binary traits, where the grouping is determined via thresholds on the underlying Gaussian scale, the liability scale. Allowances are made for unequal models, unknown covariance matrices and missing data. Having outlined the theory, strategies for implementation are reviewed. These include joint sampling of location parameters; efficient sampling from the fully conditional posterior distribution of augmented data, a multivariate truncated normal distribution; and sampling from the conditional inverse Wishart distribution, the fully conditional posterior distribution of the residual covariance matrix. Finally, a simulated dataset was analysed to illustrate the methodology. This paper concentrates on a model where residuals associated with liabilities of the binary traits are assumed to be independent. A Bayesian analysis using Gibbs sampling is outlined for the model where this assumption is relaxed. PMID:12633531
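
    A sketch of the data-augmentation step such samplers rely on: the latent liability of a binary trait is drawn from its fully conditional distribution, a normal truncated at the threshold (here 0) according to the observed category. Parameter values are illustrative.

      import numpy as np
      from scipy.stats import truncnorm

      def sample_liability(mean, sd, observed_one, rng):
          # Liability > 0 if the binary trait is 1, liability <= 0 otherwise
          a, b = ((0 - mean) / sd, np.inf) if observed_one else (-np.inf, (0 - mean) / sd)
          return truncnorm.rvs(a, b, loc=mean, scale=sd, random_state=rng)

      rng = np.random.default_rng(7)
      print(sample_liability(mean=-0.3, sd=1.0, observed_one=True, rng=rng))  # always > 0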

  18. Ubiquity of Benford's law and emergence of the reciprocal distribution

    DOE PAGES

    Friar, James Lewis; Goldman, Terrance J.; Pérez-Mercader, J.

    2016-04-07

In this paper, we apply the Law of Total Probability to the construction of scale-invariant probability distribution functions (pdf's), and require that probability measures be dimensionless and unitless under a continuous change of scales. If the scale-change distribution function is scale invariant then the constructed distribution will also be scale invariant. Repeated application of this construction on an arbitrary set of (normalizable) pdf's results again in scale-invariant distributions. The invariant function of this procedure is given uniquely by the reciprocal distribution, suggesting a kind of universality. Finally, we separately demonstrate that the reciprocal distribution results uniquely from requiring maximum entropy for size-class distributions with uniform bin sizes.

  19. Hospital of diagnosis and probability of having surgical treatment for resectable gastric cancer.

    PubMed

    van Putten, M; Verhoeven, R H A; van Sandick, J W; Plukker, J T M; Lemmens, V E P P; Wijnhoven, B P L; Nieuwenhuijzen, G A P

    2016-02-01

    Gastric cancer surgery is increasingly being centralized in the Netherlands, whereas the diagnosis is often made in hospitals where gastric cancer surgery is not performed. The aim of this study was to assess whether hospital of diagnosis affects the probability of undergoing surgery and its impact on overall survival. All patients with potentially curable gastric cancer according to stage (cT1/1b-4a, cN0-2, cM0) diagnosed between 2005 and 2013 were selected from The Netherlands Cancer Registry. Multilevel logistic regression was used to examine the probability of undergoing surgery according to hospital of diagnosis. The effect of variation in probability of undergoing surgery among hospitals of diagnosis on overall survival during the intervals 2005-2009 and 2010-2013 was examined by using Cox regression analysis. A total of 5620 patients with potentially curable gastric cancer, diagnosed in 91 hospitals, were included. The proportion of patients who underwent surgery ranged from 53.1 to 83.9 per cent according to hospital of diagnosis (P < 0.001); after multivariable adjustment for patient and tumour characteristics it ranged from 57.0 to 78.2 per cent (P < 0.001). Multivariable Cox regression showed that patients diagnosed between 2010 and 2013 in hospitals with a low probability of patients undergoing curative treatment had worse overall survival (hazard ratio 1.21; P < 0.001). The large variation in probability of receiving surgery for gastric cancer between hospitals of diagnosis and its impact on overall survival indicates that gastric cancer decision-making is suboptimal. © 2015 BJS Society Ltd Published by John Wiley & Sons Ltd.

  20. Media exposure related to the 2008 Sichuan Earthquake predicted probable PTSD among Chinese adolescents in Kunming, China: A longitudinal study.

    PubMed

    Yeung, Nelson C Y; Lau, Joseph T F; Yu, Nancy Xiaonan; Zhang, Jianping; Xu, Zhening; Choi, Kai Chow; Zhang, Qi; Mak, Winnie W S; Lui, Wacy W S

    2018-03-01

    This study examined the prevalence and the psychosocial predictors of probable PTSD among Chinese adolescents in Kunming (approximately 444 miles from the epicenter), China, who were indirectly exposed to the Sichuan Earthquake in 2008. Using a longitudinal study design, primary and secondary school students (N = 3577) in Kunming completed questionnaires at baseline (June 2008) and 6 months afterward (December 2008) in classroom settings. Participants' exposure to earthquake-related imagery and content, perceptions and emotional reactions related to the earthquake, and posttraumatic stress symptoms were measured. Univariate and forward stepwise multivariable logistic regression models were fit to identify significant predictors of probable PTSD at the 6-month follow-up. Prevalences of probable PTSD (with a Children's Revised Impact of Event Scale score ≥30) among the participants at baseline and 6-month follow-up were 16.9% and 11.1% respectively. In the multivariable analysis, those who were frequently exposed to distressful imagery had experienced at least two types of negative life events, perceived that teachers were distressed due to the earthquake, believed that the earthquake resulted from damages to the ecosystem, and felt apprehensive and emotionally disturbed due to the earthquake reported a higher risk of probable PTSD at 6-month follow-up (all ps < .05). Exposure to distressful media images, emotional responses, and disaster-related perceptions at baseline were found to be predictive of probable PTSD several months after indirect exposure to the event. Parents, teachers, and the mass media should be aware of the negative impacts of disaster-related media exposure on adolescents' psychological health. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  1. Spatial Probability Distribution of Strata's Lithofacies and its Impacts on Land Subsidence in Huairou Emergency Water Resources Region of Beijing

    NASA Astrophysics Data System (ADS)

    Li, Y.; Gong, H.; Zhu, L.; Guo, L.; Gao, M.; Zhou, C.

    2016-12-01

Continuous over-exploitation of groundwater causes dramatic drawdown, and leads to regional land subsidence in the Huairou Emergency Water Resources region, which is located in the up-middle part of the Chaobai river basin of Beijing. Owing to the spatial heterogeneity of strata's lithofacies of the alluvial fan, ground deformation has no significant positive correlation with groundwater drawdown, and one of the challenges ahead is to quantify the spatial distribution of strata's lithofacies. The transition probability geostatistics approach provides potential for characterizing the distribution of heterogeneous lithofacies in the subsurface. Combining the thickness of the clay layer extracted from the simulation with the deformation field acquired from PS-InSAR technology, the influence of strata's lithofacies on land subsidence can be analyzed quantitatively. The strata's lithofacies derived from borehole data were generalized into four categories and their probability distribution in the observation space was mined by using the transition probability geostatistics, of which clay was the predominant compressible material. Geologically plausible realizations of lithofacies distribution were produced, accounting for the complex heterogeneity of the alluvial plain. At a particular probability level of more than 40 percent, the volume of clay defined was 55 percent of the total volume of strata's lithofacies. This level, equaling nearly the volume of compressible clay derived from the geostatistics, was thus chosen to represent the boundary between compressible and uncompressible material. The method incorporates statistical geological information, such as distribution proportions, average lengths and juxtaposition tendencies of geological types, mainly derived from borehole data and expert knowledge, into the Markov chain model of transition probability. Some similarities were observed between the spatial patterns of the deformation field and the clay layer. In areas with roughly similar water table decline, subsidence is greater at locations where the subsurface has a higher probability of containing compressible material than at those where that probability is lower. Such an estimate of the spatial probability distribution is useful for analyzing the uncertainty of land subsidence.
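
    An illustrative one-dimensional sketch of the Markov-chain view underlying transition probability geostatistics: a vertical facies sequence is simulated from an assumed transition matrix, whose entries encode proportions, mean lengths and juxtaposition tendencies. The matrix below is invented for illustration, not derived from the Beijing boreholes.

      import numpy as np

      facies = ["clay", "silt", "sand", "gravel"]
      T = np.array([[0.90, 0.05, 0.04, 0.01],   # per-step (per-interval) transition
                    [0.06, 0.88, 0.05, 0.01],   # probabilities; rows sum to 1
                    [0.04, 0.05, 0.88, 0.03],
                    [0.02, 0.02, 0.06, 0.90]])

      rng = np.random.default_rng(0)
      state, seq = 0, []
      for _ in range(10_000):                   # one synthetic vertical profile
          seq.append(state)
          state = rng.choice(4, p=T[state])

      seq = np.array(seq)
      print("simulated proportions:", np.bincount(seq, minlength=4) / seq.size)
      print("mean clay run length:", 1 / (1 - T[0, 0]))  # implied by the diagonal entry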

  2. The exact probability distribution of the rank product statistics for replicated experiments.

    PubMed

    Eisinga, Rob; Breitling, Rainer; Heskes, Tom

    2013-03-18

    The rank product method is a widely accepted technique for detecting differentially regulated genes in replicated microarray experiments. To approximate the sampling distribution of the rank product statistic, the original publication proposed a permutation approach, whereas recently an alternative approximation based on the continuous gamma distribution was suggested. However, both approximations are imperfect for estimating small tail probabilities. In this paper we relate the rank product statistic to number theory and provide a derivation of its exact probability distribution and the true tail probabilities. Copyright © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
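
    A sketch of the rank product statistic for a single gene, together with the naive Monte Carlo tail estimate whose breakdown at small p-values (it returns 0 here) is what motivates the exact distribution derived in the paper. The rank values are made up for illustration.

      import numpy as np

      rng = np.random.default_rng(5)
      n, k = 1000, 4                       # genes per replicate, number of replicates
      obs_ranks = np.array([3, 5, 2, 8])   # one gene's rank in each of the k lists
      rho = obs_ranks.prod()               # rank product statistic (product form)

      # Under the null, the gene's rank is uniform on 1..n in each replicate
      sim = np.prod(rng.integers(1, n + 1, size=(1_000_000, k)), axis=1)
      print("Monte Carlo P(RP <= rho):", np.mean(sim <= rho))  # ~1e-9 true tail: estimate is 0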

  3. Modeling the probability distribution of peak discharge for infiltrating hillslopes

    NASA Astrophysics Data System (ADS)

    Baiamonte, Giorgio; Singh, Vijay P.

    2017-07-01

Hillslope response plays a fundamental role in the prediction of peak discharge at the basin outlet. The peak discharge for the critical duration of rainfall and its probability distribution are needed for designing urban infrastructure facilities. This study derives the probability distribution, denoted as the GABS model, by coupling three models: (1) the Green-Ampt model for computing infiltration, (2) the kinematic wave model for computing the discharge hydrograph from the hillslope, and (3) the intensity-duration-frequency (IDF) model for computing design rainfall intensity. The Hortonian mechanism for runoff generation is employed for computing the surface runoff hydrograph. Since the antecedent soil moisture condition (ASMC) significantly affects the rate of infiltration, its effect on the probability distribution of peak discharge is investigated. Application to a watershed in Sicily, Italy, shows that, as the probability increases, the expected effect of ASMC in increasing the maximum discharge diminishes. Only for low probabilities is the critical duration of rainfall influenced by ASMC, whereas its effect on the peak discharge seems to be small for any probability. For a set of parameters, the derived probability distribution of peak discharge seems to be well fitted by the gamma distribution. Finally, an application to a small watershed, aimed at testing whether runoff coefficient tables for the rational method can be prepared in advance, and a comparison of peak discharges obtained by the GABS model with those measured in an experimental flume for a loamy-sand soil were carried out.

  4. Measuring firm size distribution with semi-nonparametric densities

    NASA Astrophysics Data System (ADS)

    Cortés, Lina M.; Mora-Valencia, Andrés; Perote, Javier

    2017-11-01

    In this article, we propose a new methodology based on a (log) semi-nonparametric (log-SNP) distribution that nests the lognormal and enables better fits in the upper tail of the distribution through the introduction of new parameters. We test the performance of the lognormal and log-SNP distributions capturing firm size, measured through a sample of US firms in 2004-2015. Taking different levels of aggregation by type of economic activity, our study shows that the log-SNP provides a better fit of the firm size distribution. We also formally introduce the multivariate log-SNP distribution, which encompasses the multivariate lognormal, to analyze the estimation of the joint distribution of the value of the firm's assets and sales. The results suggest that sales are a better firm size measure, as indicated by other studies in the literature.

  5. Constructing inverse probability weights for continuous exposures: a comparison of methods.

    PubMed

    Naimi, Ashley I; Moodie, Erica E M; Auger, Nathalie; Kaufman, Jay S

    2014-03-01

    Inverse probability-weighted marginal structural models with binary exposures are common in epidemiology. Constructing inverse probability weights for a continuous exposure can be complicated by the presence of outliers, and the need to identify a parametric form for the exposure and account for nonconstant exposure variance. We explored the performance of various methods to construct inverse probability weights for continuous exposures using Monte Carlo simulation. We generated two continuous exposures and binary outcomes using data sampled from a large empirical cohort. The first exposure followed a normal distribution with homoscedastic variance. The second exposure followed a contaminated Poisson distribution, with heteroscedastic variance equal to the conditional mean. We assessed six methods to construct inverse probability weights using: a normal distribution, a normal distribution with heteroscedastic variance, a truncated normal distribution with heteroscedastic variance, a gamma distribution, a t distribution (1, 3, and 5 degrees of freedom), and a quantile binning approach (based on 10, 15, and 20 exposure categories). We estimated the marginal odds ratio for a single-unit increase in each simulated exposure in a regression model weighted by the inverse probability weights constructed using each approach, and then computed the bias and mean squared error for each method. For the homoscedastic exposure, the standard normal, gamma, and quantile binning approaches performed best. For the heteroscedastic exposure, the quantile binning, gamma, and heteroscedastic normal approaches performed best. Our results suggest that the quantile binning approach is a simple and versatile way to construct inverse probability weights for continuous exposures.
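
    A sketch of one of the compared strategies, stabilized weights built from normal densities, run on synthetic data; the data-generating model and the statsmodels-based implementation are assumptions for illustration, not the authors' simulation code.

      import numpy as np
      import statsmodels.api as sm
      from scipy import stats

      rng = np.random.default_rng(11)
      n = 5000
      z = rng.normal(size=n)                                      # confounder
      x = 0.5 * z + rng.normal(size=n)                            # continuous exposure
      y = (rng.random(n) < 1 / (1 + np.exp(-(0.3*x + 0.4*z - 1)))).astype(float)

      num_model = sm.OLS(x, np.ones(n)).fit()                     # marginal (numerator) model
      den_model = sm.OLS(x, sm.add_constant(z)).fit()             # conditional (denominator) model

      num = stats.norm.pdf(x, num_model.fittedvalues, np.sqrt(num_model.scale))
      den = stats.norm.pdf(x, den_model.fittedvalues, np.sqrt(den_model.scale))
      w = num / den                                               # stabilized IP weights

      # Weighted marginal structural model for the binary outcome
      msm = sm.GLM(y, sm.add_constant(x), family=sm.families.Binomial(), freq_weights=w).fit()
      print("marginal OR per unit exposure:", np.exp(msm.params[1]))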

  6. Use of the negative binomial-truncated Poisson distribution in thunderstorm prediction

    NASA Technical Reports Server (NTRS)

    Cohen, A. C.

    1971-01-01

    A probability model is presented for the distribution of thunderstorms over a small area given that thunderstorm events (1 or more thunderstorms) are occurring over a larger area. The model incorporates the negative binomial and truncated Poisson distributions. Probability tables for Cape Kennedy for spring, summer, and fall months and seasons are presented. The computer program used to compute these probabilities is appended.
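
    One plausible reading of the compound model, simulated rather than tabulated: event counts follow a negative binomial, storms per event over the small area follow a Poisson truncated to at least one, and the probability table is built from the resulting totals. Parameters are illustrative, not the Cape Kennedy fits.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(4)

      def truncated_poisson(lam, size, rng):
          # Poisson conditioned on >= 1, sampled by inverse CDF above P(X = 0)
          p0 = stats.poisson.cdf(0, lam)
          return stats.poisson.ppf(p0 + rng.random(size) * (1 - p0), lam).astype(int)

      r, p_nb, lam = 2.0, 0.5, 1.3
      events = stats.nbinom.rvs(r, p_nb, size=20_000, random_state=rng)
      storms = np.array([truncated_poisson(lam, m, rng).sum() for m in events])

      # Probability table analogous to the report's: P(total storms = n)
      for n_storms in range(5):
          print(n_storms, np.mean(storms == n_storms))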

  7. Flexible mixture modeling via the multivariate t distribution with the Box-Cox transformation: an alternative to the skew-t distribution

    PubMed Central

    Lo, Kenneth

    2011-01-01

    Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components. PMID:22125375

  8. Flexible mixture modeling via the multivariate t distribution with the Box-Cox transformation: an alternative to the skew-t distribution.

    PubMed

    Lo, Kenneth; Gottardo, Raphael

    2012-01-01

    Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.

  9. Spatial distribution of environmental risk associated to a uranium abandoned mine (Central Portugal)

    NASA Astrophysics Data System (ADS)

    Antunes, I. M.; Ribeiro, A. F.

    2012-04-01

The abandoned uranium mine of Canto do Lagar is located at Arcozelo da Serra, central Portugal. The mine was exploited in an open pit and produced about 12,430 kg of uranium oxide (U3O8) between 1987 and 1988. The dominant geological unit is the porphyritic coarse-grained two-mica granite, with biotite>muscovite. The uranium deposit consists of two crushing gaps, parallel to the coarse-grained porphyritic granite, with average direction N30°E, silicified, sericitized and reddish jasperized, with a width of approximately 10 meters. These gaps are accompanied by two thin veins of white quartz, 70°-80° WNW, ferruginous and jasperized with chalcedony, red jasper and opal. These veins are about 6 meters away from each other. They contain secondary U-phosphate phases such as autunite and torbernite. Rejected materials (about 1,000,000 t) were deposited on two dumps and a lake was formed in the open pit. To assess the environmental risk of the abandoned uranium mine of Canto do Lagar, 70 samples of stream sediments, soils and mine tailings materials were collected and analysed. The relations between sample compositions were tested using Principal Components Analysis (PCA) (multivariate analysis), and their spatial distribution was modelled using indicator kriging. The spatial distribution of stream sediments shows that the probability of expression for principal component 1 (explaining Y, Zr, Nb, La, Ce, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Hf, Th and U contents) decreases along the SE-NW direction. This component is explained by the samples located inside the mine influence. The probability of expression for principal component 2 (explaining Be, Na, Al, Si, P, K, Ca, Ti, Mn, Fe, Co, Ni, Cu, As, Rb, Sr, Mo, Cs, Ba, Tl and Bi contents) increases towards the middle stream line. This component is explained by the samples located outside the mine influence. The spatial distribution of soils shows that the probability of expression for principal component 1 (explaining Mg, P, Ca, Ge, Sr, Y, Zr, La, Ce, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Lu, Hf, W, Th and U contents) decreases along the SE direction and increases along the NE and SW directions. The probability of expression for principal component 2 (explaining pH, K, Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, Sr and Pb contents) decreases from central points (inside the mine influence) to peripheral points (outside the mine influence) and gradually increases along the N and SW directions. The tailings materials did not yield a consistent spatial distribution. In general, the stream sediments are classified as unpolluted, or as not polluted to moderately polluted, according to the geoaccumulation Müller index, with the exception of local samples located inside the mine influence. The soils cannot be used for public, private or residential uses according to the Canadian soil legislation. However, almost all samples can be used for commercial or industrial occupation. According to the target and intervention values for soil remediation, these soils need intervention. Tailings samples are heavily polluted with thorium (Th) and uranium (U) and cannot be used for public, private or residential uses.

  10. Evaluation of probabilistic forecasts with the scoringRules package

    NASA Astrophysics Data System (ADS)

    Jordan, Alexander; Krüger, Fabian; Lerch, Sebastian

    2017-04-01

    Over the last decades probabilistic forecasts in the form of predictive distributions have become popular in many scientific disciplines. With the proliferation of probabilistic models arises the need for decision-theoretically principled tools to evaluate the appropriateness of models and forecasts in a generalized way in order to better understand sources of prediction errors and to improve the models. Proper scoring rules are functions S(F,y) which evaluate the accuracy of a forecast distribution F , given that an outcome y was observed. In coherence with decision-theoretical principles they allow to compare alternative models, a crucial ability given the variety of theories, data sources and statistical specifications that is available in many situations. This contribution presents the software package scoringRules for the statistical programming language R, which provides functions to compute popular scoring rules such as the continuous ranked probability score for a variety of distributions F that come up in applied work. For univariate variables, two main classes are parametric distributions like normal, t, or gamma distributions, and distributions that are not known analytically, but are indirectly described through a sample of simulation draws. For example, ensemble weather forecasts take this form. The scoringRules package aims to be a convenient dictionary-like reference for computing scoring rules. We offer state of the art implementations of several known (but not routinely applied) formulas, and implement closed-form expressions that were previously unavailable. Whenever more than one implementation variant exists, we offer statistically principled default choices. Recent developments include the addition of scoring rules to evaluate multivariate forecast distributions. The use of the scoringRules package is illustrated in an example on post-processing ensemble forecasts of temperature.
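
    As an example of the closed-form expressions such a package implements, the CRPS of a normal predictive distribution N(mu, sigma^2) at observation y is sigma * (z(2Φ(z) − 1) + 2φ(z) − 1/√π) with z = (y − mu)/sigma. The sketch below (in Python, whereas the package itself is R) checks this against the sample-based identity CRPS(F, y) = E|X − y| − 0.5 E|X − X′|.

      import numpy as np
      from scipy.stats import norm

      def crps_normal(y, mu, sigma):
          z = (y - mu) / sigma
          return sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi))

      rng = np.random.default_rng(2)
      x, x2 = rng.normal(1.0, 2.0, (2, 200_000))   # two independent samples from F
      print(crps_normal(0.3, 1.0, 2.0),
            np.mean(np.abs(x - 0.3)) - 0.5 * np.mean(np.abs(x - x2)))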

  11. Combining p-values in replicated single-case experiments with multivariate outcome.

    PubMed

    Solmi, Francesca; Onghena, Patrick

    2014-01-01

    Interest in combining probabilities has a long history in the global statistical community. The first steps in this direction were taken by Ronald Fisher, who introduced the idea of combining p-values of independent tests to provide a global decision rule when multiple aspects of a given problem were of interest. An interesting approach to this idea of combining p-values is the one based on permutation theory. The methods belonging to this particular approach exploit the permutation distributions of the tests to be combined, and use a simple function to combine probabilities. Combining p-values finds a very interesting application in the analysis of replicated single-case experiments. In this field the focus, while comparing different treatments effects, is more articulated than when just looking at the means of the different populations. Moreover, it is often of interest to combine the results obtained on the single patients in order to get more global information about the phenomenon under study. This paper gives an overview of how the concept of combining p-values was conceived, and how it can be easily handled via permutation techniques. Finally, the method of combining p-values is applied to a simulated replicated single-case experiment, and a numerical illustration is presented.
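
    The classical building block referred to above is Fisher's rule: under the global null, −2 Σ log p_i follows a chi-squared distribution with 2k degrees of freedom. A minimal sketch with made-up p-values:

      import numpy as np
      from scipy import stats

      p_values = np.array([0.041, 0.092, 0.180, 0.008])   # one p-value per single case
      fisher_stat = -2 * np.sum(np.log(p_values))
      p_combined = stats.chi2.sf(fisher_stat, df=2 * len(p_values))
      print(p_combined)

      # SciPy provides the same rule (and alternatives) directly
      print(stats.combine_pvalues(p_values, method="fisher"))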

  12. Stochastic and Statistical Analysis of Utility Revenues and Weather Data Analysis for Consumer Demand Estimation in Smart Grids

    PubMed Central

    Ali, S. M.; Mehmood, C. A; Khan, B.; Jawad, M.; Farid, U; Jadoon, J. K.; Ali, M.; Tareen, N. K.; Usman, S.; Majid, M.; Anwar, S. M.

    2016-01-01

In the smart grid paradigm, consumer demands are random and time-dependent, tending towards stochastic probabilities. The stochastically varying consumer demands have put the policy makers and supplying agencies in a demanding position for optimal generation management. The utility revenue functions are highly dependent on the consumer demand models, whether deterministic or stochastic. Sudden drifts in weather parameters affect the living standards of the consumers, which in turn influence the power demands. Considering the above, we analyzed, stochastically and statistically, the effect of random consumer demands on the fixed and variable revenues of the electrical utilities. Our work presented the Multi-Variate Gaussian Distribution Function (MVGDF) probabilistic model of the utility revenues with time-dependent consumer random demands. Moreover, the Gaussian probability outcomes of the utility revenues are based on the varying consumer demand data patterns. Furthermore, Standard Monte Carlo (SMC) simulations were performed to validate the accuracy of the aforesaid probabilistic demand-revenue model. We critically analyzed the effect of weather data parameters on consumer demands using correlation and multi-linear regression schemes. The statistical analysis of consumer demands provided a relationship between dependent (demand) and independent variables (weather data) for utility load management, generation control, and network expansion. PMID:27314229
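
    A minimal sketch of the general recipe as we read it: correlated, time-dependent demands are drawn from a multivariate Gaussian and propagated through a simple revenue function by standard Monte Carlo. All parameters below are invented for illustration.

      import numpy as np

      rng = np.random.default_rng(8)
      mean_demand = np.array([120.0, 150.0, 180.0])    # MW at three times of day
      cov = np.array([[100.0,  60.0,  30.0],           # demand covariance (illustrative)
                      [ 60.0, 144.0,  80.0],
                      [ 30.0,  80.0, 196.0]])

      demand = rng.multivariate_normal(mean_demand, cov, size=100_000)
      fixed, tariff = 5_000.0, 42.0                    # fixed charge, price per MWh
      revenue = fixed + tariff * demand.sum(axis=1)    # simple revenue function

      print("mean revenue:", revenue.mean())
      print("5%-95% band:", np.quantile(revenue, [0.05, 0.95]))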

  13. From forager tracks to prey distributions: an application to tuna vessel monitoring systems (VMS).

    PubMed

    Walker, Emily; Rivoirard, Jacques; Gaspar, Philippe; Bez, Nicolas

    2015-04-01

    In the open ocean, movements of migratory fish populations are typically surveyed using tagging methods, which suffer from low sample sizes for archive tags, except for a few notable examples, and from poor temporal resolution for conventional tags. Alternatively, one can infer patterns of movement of migratory fish by tracking the movements of their predators, i.e., fishing vessels, whose navigational systems (e.g., GPS) provide accurate and frequent VMS (vessel monitoring system) records of movement in pursuit of prey. In this paper, we first develop a state-space model that infers the foraging activities of fishing vessels from their tracks. Second, we link foraging activities to probabilities of tuna presence. Finally, using multivariate geostatistical interpolation (cokriging), we map the probability of tuna presence together with its estimation variance and produce a time series of indices of abundance. The segmentation of the trajectories is validated against observers' data, and the resulting VMS index is compared to catch rates and proves useful from a management perspective. The approach reported in this manuscript extends beyond the case study considered: it can be applied to any forager that engages in an attempt to capture when it sees prey, provided this attempt is linked to a detectable change in behavior.

  14. Stochastic and Statistical Analysis of Utility Revenues and Weather Data Analysis for Consumer Demand Estimation in Smart Grids.

    PubMed

    Ali, S M; Mehmood, C A; Khan, B; Jawad, M; Farid, U; Jadoon, J K; Ali, M; Tareen, N K; Usman, S; Majid, M; Anwar, S M

    2016-01-01

    In the smart grid paradigm, consumer demands are random and time-dependent, and are therefore best described by stochastic probability models. The stochastically varying consumer demands have put policy makers and supplying agencies in a demanding position with respect to optimal generation management. The utility revenue functions are highly dependent on the underlying consumer demand models. Sudden drifts in weather parameters affect the living standards of the consumers, which in turn influence the power demands. Considering the above, we analyzed, stochastically and statistically, the effect of random consumer demands on the fixed and variable revenues of electrical utilities. Our work presents a Multi-Variate Gaussian Distribution Function (MVGDF) probabilistic model of the utility revenues with time-dependent random consumer demands. The Gaussian probabilities of the utility revenues are derived from the varying consumer demand data patterns. Furthermore, Standard Monte Carlo (SMC) simulations are performed that validate the accuracy of the probabilistic demand-revenue model. We critically analyzed the effect of weather parameters on consumer demands using correlation and multiple linear regression schemes. The statistical analysis of consumer demands provides a relationship between the dependent variable (demand) and the independent variables (weather data) for utility load management, generation control, and network expansion.

  15. Delirium superimposed on dementia: defining disease states and course from longitudinal measurements of a multivariate index using latent class analysis and hidden Markov chains.

    PubMed

    Ciampi, Antonio; Dyachenko, Alina; Cole, Martin; McCusker, Jane

    2011-12-01

    The study of mental disorders in the elderly presents substantial challenges due to population heterogeneity, the coexistence of different mental disorders, and diagnostic uncertainty. While reliable tools have been developed to collect relevant data, new approaches to study design and analysis are needed. We focus on a new analytic approach. Our framework is based on latent class analysis and hidden Markov chains. From repeated measurements of a multivariate disease index, we extract the notion of the underlying state of a patient at a time point. The course of the disorder is then a sequence of transitions among states. States and transitions are not observable; however, the probability of being in a state at a time point, and the transition probabilities from one state to another over time, can be estimated. Data from 444 patients with and without diagnoses of delirium and dementia were available from a previous study. The Delirium Index was measured at diagnosis, and at 2 and 6 months from diagnosis. Four latent classes were identified: fairly healthy, moderately ill, clearly sick, and very sick. Dementia and delirium could not be separated on the basis of these data alone; indeed, as the probability of delirium increased, so did the probability of decline of mental functions. The eight most probable courses were identified, including good and poor stable courses, and courses exhibiting various patterns of improvement. Latent class analysis and hidden Markov chains offer a promising tool for studying mental disorders in the elderly; the approach may show its full potential as new data become available.
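
    A minimal sketch of the hidden-Markov idea in the entry: four latent severity states with an assumed transition matrix, with the state distribution propagated across follow-ups (the uneven 2- and 6-month spacing is ignored here for simplicity; all numbers are illustrative, not estimates from the study).

    ```python
    import numpy as np

    # Rows: transition probabilities out of each latent class (illustrative values).
    P = np.array([[0.80, 0.15, 0.04, 0.01],   # from "fairly healthy"
                  [0.20, 0.60, 0.15, 0.05],   # from "moderately ill"
                  [0.05, 0.20, 0.55, 0.20],   # from "clearly sick"
                  [0.01, 0.09, 0.30, 0.60]])  # from "very sick"
    pi0 = np.array([0.25, 0.35, 0.25, 0.15])  # assumed state distribution at diagnosis

    pi1 = pi0 @ P       # state distribution at the first follow-up
    pi2 = pi1 @ P       # and at the second follow-up
    print(pi1, pi2)
    ```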

  16. Probabilistic flood damage modelling at the meso-scale

    NASA Astrophysics Data System (ADS)

    Kreibich, Heidi; Botto, Anna; Schröter, Kai; Merz, Bruno

    2014-05-01

    Decisions on flood risk management and adaptation are usually based on risk analyses. Such analyses are associated with significant uncertainty, all the more so if changes in risk due to global change are expected. Although uncertainty analysis and probabilistic approaches have received increased attention in recent years, they are still not standard practice for flood risk assessments. Most damage models have in common that complex damaging processes are described by simple, deterministic approaches such as stage-damage functions. Novel probabilistic, multi-variate flood damage models have been developed and validated on the micro-scale using a data-mining approach, namely bagging decision trees (Merz et al. 2013). In this presentation we show how the model BT-FLEMO (Bagging decision Tree based Flood Loss Estimation MOdel) can be applied on the meso-scale, namely on the basis of ATKIS land-use units. The model is applied in 19 municipalities which were affected during the 2002 flood by the River Mulde in Saxony, Germany. The application of BT-FLEMO provides a probability distribution of estimated damage to residential buildings per municipality. Validation is undertaken, on the one hand, via a comparison with eight other damage models, including stage-damage functions as well as multi-variate models; on the other hand, the results are compared with official damage data provided by the Saxon Relief Bank (SAB). The results show that uncertainties of damage estimation remain high. The significant advantage of the probabilistic flood loss estimation model BT-FLEMO is thus that it inherently provides quantitative information about the uncertainty of the prediction. Reference: Merz, B.; Kreibich, H.; Lall, U. (2013): Multi-variate flood damage assessment: a tree-based data-mining approach. NHESS, 13(1), 53-64.
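
    A sketch of the bagging-decision-tree approach that underlies such a model, using scikit-learn on synthetic data (the features and loss function are placeholders, not BT-FLEMO itself). The spread of per-tree predictions is what yields a damage distribution rather than a point estimate.

    ```python
    import numpy as np
    from sklearn.ensemble import BaggingRegressor
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(1)
    X = rng.random((200, 3))    # e.g. water depth, flow velocity, building value (assumed)
    y = 0.5 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(0, 0.05, 200)   # synthetic loss ratio

    model = BaggingRegressor(DecisionTreeRegressor(), n_estimators=100,
                             random_state=0).fit(X, y)
    per_tree = np.array([t.predict(X[:1])[0] for t in model.estimators_])
    # Report the prediction as a distribution, not a single value:
    print(per_tree.mean(), np.percentile(per_tree, [5, 95]))
    ```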

  17. An Analytic Solution to the Computation of Power and Sample Size for Genetic Association Studies under a Pleiotropic Mode of Inheritance.

    PubMed

    Gordon, Derek; Londono, Douglas; Patel, Payal; Kim, Wonkuk; Finch, Stephen J; Heiman, Gary A

    2016-01-01

    Our motivation here is to calculate the power of 3 statistical tests used when there are genetic traits that operate under a pleiotropic mode of inheritance and when qualitative phenotypes are defined by use of thresholds for the multiple quantitative phenotypes. Specifically, we formulate a multivariate function that provides the probability that an individual has a vector of specific quantitative trait values conditional on having a risk locus genotype, and we apply thresholds to define qualitative phenotypes (affected, unaffected) and compute penetrances and conditional genotype frequencies based on the multivariate function. We extend the analytic power and minimum-sample-size-necessary (MSSN) formulas for 2 categorical data-based tests (genotype, linear trend test [LTT]) of genetic association to the pleiotropic model. We further compare the MSSN of the genotype test and the LTT with that of a multivariate ANOVA (Pillai). We approximate the MSSN for statistics by linear models using a factorial design and ANOVA. With ANOVA decomposition, we determine which factors most significantly change the power/MSSN for all statistics. Finally, we determine which test statistics have the smallest MSSN. In this work, MSSN calculations are for 2 traits (bivariate distributions) only (for illustrative purposes). We note that the calculations may be extended to address any number of traits. Our key findings are that the genotype test usually has lower MSSN requirements than the LTT. More inclusive thresholds (top/bottom 25% vs. top/bottom 10%) have higher sample size requirements. The Pillai test has a much larger MSSN than both the genotype test and the LTT, as a result of sample selection. With these formulas, researchers can specify how many subjects they must collect to localize genes for pleiotropic phenotypes. © 2017 S. Karger AG, Basel.
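
    A small sketch of the thresholding step described above: with a bivariate normal phenotype distribution conditional on genotype, the penetrance of an "affected" definition that requires both traits to exceed their thresholds follows from the joint CDF by inclusion-exclusion. The means, correlation and top-10% cut-offs are illustrative assumptions.

    ```python
    import numpy as np
    from scipy.stats import multivariate_normal, norm

    mu = np.array([0.3, 0.3])            # genotype-conditional trait means (assumed)
    rho = 0.4                            # trait correlation (assumed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    t = norm.ppf(0.9)                    # top-10% threshold on each standardized trait

    mvn = multivariate_normal(mean=mu, cov=cov)
    # P(X1 > t, X2 > t) via inclusion-exclusion on the joint CDF:
    p_affected = 1 - norm.cdf(t, mu[0]) - norm.cdf(t, mu[1]) + mvn.cdf([t, t])
    print(p_affected)
    ```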

  18. Comparison of Deterministic and Probabilistic Radial Distribution Systems Load Flow

    NASA Astrophysics Data System (ADS)

    Gupta, Atma Ram; Kumar, Ashwani

    2017-12-01

    Distribution system networks today face the challenge of meeting increased load demands from the industrial, commercial and residential sectors. The pattern of load is highly dependent on consumer behavior and on temporal factors such as the season of the year, the day of the week, or the time of day. For deterministic radial distribution load flow studies, load is taken as constant; in reality, load varies continually and with a high degree of uncertainty, so there is a need to model probable realistic load. Monte Carlo simulation is used to model probable realistic load by generating random values of active and reactive power load from the mean and standard deviation of the load, and by solving a deterministic radial load flow with these values. The probabilistic solution is reconstructed from the deterministic data obtained for each simulation. The main contributions of the work are: finding the impact of probable realistic ZIP load modeling on balanced radial distribution load flow; finding the impact of probable realistic ZIP load modeling on unbalanced radial distribution load flow; and comparing the voltage profile and losses under probable realistic ZIP load modeling for balanced and unbalanced radial distribution load flow.

  19. Probability Distribution of Turbulent Kinetic Energy Dissipation Rate in Ocean: Observations and Approximations

    NASA Astrophysics Data System (ADS)

    Lozovatsky, I.; Fernando, H. J. S.; Planella-Morato, J.; Liu, Zhiyu; Lee, J.-H.; Jinadasa, S. U. P.

    2017-10-01

    The probability distribution of the turbulent kinetic energy dissipation rate in the stratified ocean usually deviates from the classic lognormal distribution that has been formulated for, and often observed in, unstratified homogeneous layers of atmospheric and oceanic turbulence. Our measurements of vertical profiles of micro-scale shear, collected in the East China Sea, the northern Bay of Bengal, to the south and east of Sri Lanka, and in the Gulf Stream region, show that the probability distributions of the dissipation rate ε̃_r in the pycnoclines (where r ≈ 1.4 m is the averaging scale) can be successfully modeled by the Burr (type XII) probability distribution. In weakly stratified boundary layers, a lognormal distribution of ε̃_r is preferable, although the Burr is an acceptable alternative. The skewness Sk_ε and the kurtosis K_ε of the dissipation rate appear to be well correlated over a wide range of Sk_ε and K_ε variability.
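
    A sketch of fitting a Burr type XII distribution to positive, heavy-tailed dissipation-rate-like data with SciPy; the data here are synthetic lognormal draws, used only to exercise the fit.

    ```python
    import numpy as np
    from scipy.stats import burr12

    rng = np.random.default_rng(2)
    eps = rng.lognormal(mean=-9.0, sigma=1.5, size=2000)   # synthetic dissipation rates

    c, d, loc, scale = burr12.fit(eps, floc=0)   # fix location at zero for positive data
    print(c, d, scale)
    print(burr12.sf(eps.max(), c, d, loc, scale))   # fitted tail probability at the largest datum
    ```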

  20. Extensions to Multivariate Space Time Mixture Modeling of Small Area Cancer Data.

    PubMed

    Carroll, Rachel; Lawson, Andrew B; Faes, Christel; Kirby, Russell S; Aregay, Mehreteab; Watjou, Kevin

    2017-05-09

    Oral cavity and pharynx cancer, even when considered together, is a fairly rare disease. Implementation of multivariate modeling with lung and bronchus cancer, as well as melanoma cancer of the skin, could lead to better inference for oral cavity and pharynx cancer. The multivariate structure of these models is accomplished via the use of shared random effects, as well as other multivariate prior distributions. The results in this paper indicate that care should be taken when executing these types of models, and that multivariate mixture models may not always be the ideal option, depending on the data of interest.

  1. EXTENDING MULTIVARIATE DISTANCE MATRIX REGRESSION WITH AN EFFECT SIZE MEASURE AND THE ASYMPTOTIC NULL DISTRIBUTION OF THE TEST STATISTIC

    PubMed Central

    McArtor, Daniel B.; Lubke, Gitta H.; Bergeman, C. S.

    2017-01-01

    Person-centered methods are useful for studying individual differences in terms of (dis)similarities between response profiles on multivariate outcomes. Multivariate distance matrix regression (MDMR) tests the significance of associations of response profile (dis)similarities and a set of predictors using permutation tests. This paper extends MDMR by deriving and empirically validating the asymptotic null distribution of its test statistic, and by proposing an effect size for individual outcome variables, which is shown to recover true associations. These extensions alleviate the computational burden of permutation tests currently used in MDMR and render more informative results, thus making MDMR accessible to new research domains. PMID:27738957

  2. Extending multivariate distance matrix regression with an effect size measure and the asymptotic null distribution of the test statistic.

    PubMed

    McArtor, Daniel B; Lubke, Gitta H; Bergeman, C S

    2017-12-01

    Person-centered methods are useful for studying individual differences in terms of (dis)similarities between response profiles on multivariate outcomes. Multivariate distance matrix regression (MDMR) tests the significance of associations of response profile (dis)similarities and a set of predictors using permutation tests. This paper extends MDMR by deriving and empirically validating the asymptotic null distribution of its test statistic, and by proposing an effect size for individual outcome variables, which is shown to recover true associations. These extensions alleviate the computational burden of permutation tests currently used in MDMR and render more informative results, thus making MDMR accessible to new research domains.

  3. Heterogeneity Coefficients for Mahalanobis' D as a Multivariate Effect Size.

    PubMed

    Del Giudice, Marco

    2017-01-01

    The Mahalanobis distance D is the multivariate generalization of Cohen's d and can be used as a standardized effect size for multivariate differences between groups. An important issue in the interpretation of D is heterogeneity, that is, the extent to which contributions to the overall effect size are concentrated in a small subset of variables rather than evenly distributed across the whole set. Here I present two heterogeneity coefficients for D based on the Gini coefficient, a well-known index of inequality among values of a distribution. I discuss the properties and limitations of the two coefficients and illustrate their use by reanalyzing some published findings from studies of gender differences.
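
    A compact sketch of the two quantities involved: Mahalanobis' D between two groups, and a Gini-style inequality summary of the per-variable standardized differences (this illustrates the idea behind the paper's heterogeneity coefficients, not their exact definitions; the data are synthetic).

    ```python
    import numpy as np

    rng = np.random.default_rng(5)
    a = rng.normal(0.0, 1.0, (300, 4))
    b = rng.normal(0.4, 1.0, (300, 4))

    diff = a.mean(axis=0) - b.mean(axis=0)
    S = (np.cov(a.T) + np.cov(b.T)) / 2                # pooled covariance
    D = np.sqrt(diff @ np.linalg.solve(S, diff))       # multivariate analogue of Cohen's d

    d_uni = np.abs(diff) / np.sqrt(np.diag(S))         # univariate standardized differences
    x = np.sort(d_uni)
    n = len(x)
    gini = (2 * np.arange(1, n + 1) - n - 1) @ x / (n * x.sum())   # 0 = even, 1 = concentrated
    print(D, gini)
    ```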

  4. Evaluation of the Three Parameter Weibull Distribution Function for Predicting Fracture Probability in Composite Materials

    DTIC Science & Technology

    1978-03-01

    [OCR fragment from an AFIT/GAE thesis, "Evaluation of the Three Parameter Weibull Distribution Function for Predicting Fracture Probability in Composite Materials". The recoverable text derives an equation for the risk of rupture of a unidirectionally laminated composite subjected to pure bending and notes that the equation can be simplified further.]
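
    A minimal sketch of the three-parameter Weibull fracture-probability model the thesis evaluates: P(failure at stress ≤ s) = 1 − exp(−((s − s₀)/b)^m) for s ≥ s₀, with threshold stress s₀ (location), scale b and shape m. The parameter values below are illustrative.

    ```python
    from scipy.stats import weibull_min

    m, s0, b = 8.0, 200.0, 350.0     # shape, threshold stress (MPa), scale (MPa) -- assumed
    stress = 450.0
    # Fracture probability at the given stress level:
    print(weibull_min.cdf(stress, m, loc=s0, scale=b))
    ```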

  5. Prediction of road accidents: A Bayesian hierarchical approach.

    PubMed

    Deublein, Markus; Schubert, Matthias; Adey, Bryan T; Köhler, Jochen; Faber, Michael H

    2013-03-01

    In this paper, a novel methodology for predicting the occurrence of road accidents is presented. The methodology utilizes a combination of three statistical methods: (1) gamma-updating of the occurrence rates of injury accidents and injured road users, (2) hierarchical multivariate Poisson-lognormal regression analysis, taking into account correlations amongst multiple dependent model response variables and features of discrete accident count data such as over-dispersion, and (3) Bayesian inference algorithms, applied by means of data mining techniques supported by Bayesian Probabilistic Networks in order to represent non-linearity between risk-indicating and model response variables, as well as the different types of uncertainty that might be present in the development of the specific models. Prior Bayesian Probabilistic Networks are first established by means of multivariate regression analysis of the observed frequencies of the model response variables, e.g. the occurrence of an accident, and the observed values of the risk-indicating variables, e.g. the degree of road curvature. Subsequently, parameter learning is carried out using updating algorithms to determine the posterior predictive probability distributions of the model response variables, conditional on the values of the risk-indicating variables. The methodology is illustrated through a case study using data from the Austrian rural motorway network. In the case study, the methodology is applied to randomly selected road segments to produce a model predicting the expected number of accidents in which an injury has occurred and the expected numbers of lightly, severely and fatally injured road users. Additionally, the methodology is used for geo-referenced identification of road sections with increased occurrence probabilities of injury accident events on a road link between two Austrian cities. It is shown that the proposed methodology can be used to develop models to estimate the occurrence of road accidents for any road network, provided that the required data are available. Copyright © 2012 Elsevier Ltd. All rights reserved.
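
    A minimal sketch of step (1), the gamma-updating of occurrence rates: with a Gamma(a, b) prior on a Poisson accident rate, observing k accidents over exposure t gives the conjugate posterior Gamma(a + k, b + t). The numbers are illustrative.

    ```python
    a, b = 2.0, 1.0     # prior shape and rate (assumed)
    k, t = 7, 3.0       # observed accidents and exposure, e.g. segment-years (assumed)

    a_post, b_post = a + k, b + t    # conjugate gamma-Poisson update
    print("posterior mean rate:", a_post / b_post)
    ```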

  6. An assessment on the use of bivariate, multivariate and soft computing techniques for collapse susceptibility in GIS environ

    NASA Astrophysics Data System (ADS)

    Yilmaz, Işik; Marschalko, Marian; Bednarik, Martin

    2013-04-01

    The paper presented herein compares and discusses the use of bivariate, multivariate and soft computing techniques for collapse susceptibility modelling. Conditional probability (CP), logistic regression (LR) and artificial neural network (ANN) models, representing the bivariate, multivariate and soft computing techniques respectively, were used in GIS-based collapse susceptibility mapping in an area of the Sivas basin (Turkey). Collapse-related factors, directly or indirectly related to the causes of collapse occurrence, such as distance from faults, slope angle and aspect, topographical elevation, distance from drainage, topographic wetness index (TWI), stream power index (SPI), Normalized Difference Vegetation Index (NDVI) as a measure of vegetation cover, and distance from roads and settlements, were used in the collapse susceptibility analyses. In the last stage of the analyses, collapse susceptibility maps were produced from the models and then compared by means of their validations. Area Under Curve (AUC) values showed that the map obtained from the soft computing (ANN) model appears more accurate than the other models, although the accuracies of all three models can be considered relatively similar. The results also showed that conditional probability is an essential method for the preparation of collapse susceptibility maps and is highly compatible with GIS operating features.

  7. Biomechanical, psychosocial and individual risk factors predicting low back functional impairment among furniture distribution employees

    PubMed Central

    Ferguson, Sue A.; Allread, W. Gary; Burr, Deborah L.; Heaney, Catherine; Marras, William S.

    2013-01-01

    Background: Biomechanical, psychosocial and individual risk factors for low back disorder have been studied extensively; however, few researchers have examined all three types of risk factor together. The objective of this study was to develop a low back disorder risk model for furniture distribution workers using biomechanical, psychosocial and individual risk factors. Methods: This was a prospective study with a six-month follow-up time. There were 454 subjects at 9 furniture distribution facilities enrolled in the study. Biomechanical exposure was evaluated using the American Conference of Governmental Industrial Hygienists (2001) lifting threshold limit values for low back injury risk. Psychosocial and individual risk factors were evaluated via questionnaires. Low back functional status was measured using the lumbar motion monitor. Low back disorder cases were defined as a loss of low back functional performance of −0.14 or more. Findings: There were 92 cases of meaningful loss in low back functional performance and 185 non-cases. A multivariate logistic regression model that included baseline functional performance probability, facility, perceived workload, intermediate reach distance, number of exertions above threshold limit values, job tenure, manual material handling, and age provided a model sensitivity of 68.5% and specificity of 71.9%. Interpretation: The results of this study indicate which biomechanical, individual and psychosocial risk factors are important, as well as how much of each risk factor is too much, resulting in increased risk of low back disorder among furniture distribution workers. PMID:21955915

  8. Concentration distribution and assessment of several heavy metals in sediments of west-four Pearl River Estuary

    NASA Astrophysics Data System (ADS)

    Wang, Shanshan; Cao, Zhimin; Lan, Dongzhao; Zheng, Zhichang; Li, Guihai

    2008-09-01

    Grain size parameters, trace metals (Co, Cu, Ni, Pb, Cr, Zn, Ba, Zr and Sr) and total organic matter (TOM) of 38 surficial sediments and a sediment core from the west-four Pearl River Estuary region were analyzed, with a focus on the spatial distribution and transport processes of the chemical elements in the surficial sediments. Multivariate statistics were used to analyse the interrelationships between the metal elements, TOM and the grain size parameters. The results demonstrate that terrigenous sediments carried by the rivers are the main sources of the trace metal elements and TOM, and that the lithology of the parent material is a dominating factor controlling the trace metal composition of the surficial sediment. In addition, the hydrodynamic conditions and landform are the dominating factors controlling the large-scale distribution, while anthropogenic input in the coastal area alters the regional distribution of the heavy metal elements Co, Cu, Ni, Pb, Cr and Zn. Enrichment factor (EF) analysis was used to differentiate between anthropogenic and naturally occurring metal sources and to assess the anthropogenic influence; the heavy metal contents of the deeper core layers were taken as background values, and Zr was chosen as the reference element for Co, Cu, Ni, Pb, Cr and Zn. The results indicate prevalent enrichment of Co, Cu, Ni, Pb and Cr, with the contamination by Pb being the most obvious; furthermore, the peculiarly high EF values of Zn and Pb at some sites probably suggest point-source inputs.
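
    A one-function sketch of the enrichment-factor calculation used here, with Zr as the conservative reference element; the concentrations are illustrative, not values from the study.

    ```python
    def enrichment_factor(metal, zr, metal_bg, zr_bg):
        """EF = (metal/Zr)_sample / (metal/Zr)_background; EF >> 1 suggests anthropogenic input."""
        return (metal / zr) / (metal_bg / zr_bg)

    # Surface Pb versus deep-core background (mg/kg), assumed for illustration:
    print(enrichment_factor(metal=65.0, zr=180.0, metal_bg=25.0, zr_bg=190.0))
    ```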

  9. Bivariate extreme value distributions

    NASA Technical Reports Server (NTRS)

    Elshamy, M.

    1992-01-01

    In certain engineering applications, such as those occurring in the analyses of ascent structural loads for the Space Transportation System (STS), some of the load variables have a lower bound of zero. Thus, the need for practical models of bivariate extreme value probability distribution functions with lower limits was identified. We discuss the Gumbel models and present practical forms of bivariate extreme probability distributions of the Weibull and Frechet types with two parameters. Bivariate extreme value probability distribution functions can be expressed in terms of the marginal extremal distributions and a 'dependence' function subject to certain analytical conditions. Properties of such bivariate extreme distributions, sums and differences of paired extremals, and the corresponding forms of conditional distributions are discussed. Practical estimation techniques are also given.

  10. A parametric multivariate drought index and its application in the attribution and projection of flash drought change in China

    NASA Astrophysics Data System (ADS)

    Yuan, X.; Wang, L.; Zhang, M.

    2017-12-01

    Rainfall deficit in the crop growing seasons is usually accompanied by heat waves. Abnormally high temperature increases evapotranspiration and decreases soil moisture rapidly, and ultimately results in a type of drought with a rapid onset, short duration but devastating impact, which is called "Flash drought". With the increase in global temperature, flash drought is expected to occur more frequently. However, there is no consensus on the definition of flash drought so far. Moreover, large uncertainty exists in the estimation of the flash drought and its trend, and the underlying mechanism for its long-term change is not clear. In this presentation, a parametric multivariate drought index that characterizes the joint probability distribution of key variables of flash drought will be developed, and the historical changes in flash drought over China will be analyzed. In addition, a set of land surface model simulations driven by IPCC CMIP5 models with different forcings and future scenarios, will be used for the detection and attribution of flash drought change. This study is targeted at quantifying the influences of natural and anthropogenic climate change on the flash drought change, projecting its future change as well as the corresponding uncertainty, and improving our understanding of the variation of flash drought and its underlying mechanism in a changing climate.

  11. Gaussianization for fast and accurate inference from cosmological data

    NASA Astrophysics Data System (ADS)

    Schuhmann, Robert L.; Joachimi, Benjamin; Peiris, Hiranya V.

    2016-06-01

    We present a method to transform multivariate unimodal non-Gaussian posterior probability densities into approximately Gaussian ones via non-linear mappings, such as Box-Cox transformations and generalizations thereof. This permits an analytical reconstruction of the posterior from a point sample, like a Markov chain, and simplifies the subsequent joint analysis with other experiments. This way, a multivariate posterior density can be reported efficiently, by compressing the information contained in Markov Chain Monte Carlo samples. Further, the model evidence integral (i.e., the marginal likelihood) can be computed analytically. This method is analogous to the search for normal parameters in the cosmic microwave background, but is more general. The search for the optimally Gaussianizing transformation is performed computationally through a maximum-likelihood formalism; its quality can be judged by how well the credible regions of the posterior are reproduced. We demonstrate that our method outperforms kernel density estimates in this objective. Further, we select marginal posterior samples from Planck data with several distinct strongly non-Gaussian features, and verify the reproduction of the marginal contours. To demonstrate evidence computation, we Gaussianize the joint distribution of data from weak lensing and baryon acoustic oscillations, for different cosmological models, and find a preference for flat Λ cold dark matter (ΛCDM). Comparing to values computed with the Savage-Dickey density ratio and Population Monte Carlo, we find good agreement of our method within the spread of the other two.
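
    A small sketch of the Box-Cox Gaussianization step mentioned above: SciPy selects the transformation parameter λ by maximum likelihood, the same formalism the entry describes, here in the one-dimensional case on synthetic skewed data.

    ```python
    import numpy as np
    from scipy.stats import boxcox, shapiro

    rng = np.random.default_rng(4)
    x = rng.gamma(shape=2.0, scale=1.0, size=1000)   # skewed, positive "posterior sample"

    y, lam = boxcox(x)     # transformed sample and maximum-likelihood lambda
    # Normality improves markedly after the transform:
    print(lam, shapiro(x).pvalue, shapiro(y).pvalue)
    ```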

  12. Organic markers as source discriminants and sediment transport indicators in south San Francisco Bay, California

    USGS Publications Warehouse

    Hostettler, F.D.; Rapp, J.B.; Kvenvolden, K.A.; Samuel, N L.

    1989-01-01

    Sediment samples from nearshore sites in south San Francisco Bay and from streams flowing into that section of the Bay have been characterized in terms of their content of biogenic and anthropogenic molecular marker compounds. The distributions, input sources, and applicability of these compounds in determining sediment movement are discussed. By means of inspection and multivariate analysis, the compounds were grouped according to probable input sources and the sampling stations according to the relative importance of source contributions. A suite of polycyclic aromatic hydrocarbons (PAHs) dominated by pyrene, fluoranthene and phenanthrene, typical of estuarine environments worldwide, and suites of mature sterane and hopane biomarkers were found to be most suitable as background markers for the Bay. A homologous series of long-chain n-aldehydes (C12-C32) with a strong even-over-odd carbon number dominance in the higher molecular weight range and the ubiquitous n-alkanes (n-C24-C34) with a strong odd-over-even carbon number dominance were utilized as terrigenous markers. Several ratios of these terrigenous and Bay markers were calculated for each station. These ratios and the statistical indicators from the multivariate analysis point toward a strong terrigenous signal in the terminus of South Bay and indicate net directional movement of recently introduced sediment where nontidal currents had been considered to be minimal or nonexistent and tidal currents had been assumed to be dominant. © 1989.

  13. Eruption probabilities for the Lassen Volcanic Center and regional volcanism, northern California, and probabilities for large explosive eruptions in the Cascade Range

    USGS Publications Warehouse

    Nathenson, Manuel; Clynne, Michael A.; Muffler, L.J. Patrick

    2012-01-01

    Chronologies for eruptive activity of the Lassen Volcanic Center and for eruptions from the regional mafic vents in the surrounding area of the Lassen segment of the Cascade Range are here used to estimate probabilities of future eruptions. For the regional mafic volcanism, the ages of many vents are known only within broad ranges, and two models are developed that should bracket the actual eruptive ages. These chronologies are used with exponential, Weibull, and mixed-exponential probability distributions to match the data for time intervals between eruptions. For the Lassen Volcanic Center, the probability of an eruption in the next year is 1.4×10⁻⁴ for the exponential distribution and 2.3×10⁻⁴ for the mixed-exponential distribution. For the regional mafic vents, the exponential distribution gives a probability of an eruption in the next year of 6.5×10⁻⁴, but the mixed-exponential distribution indicates that the current probability, 12,000 years after the last event, could be significantly lower. For the exponential distribution, the highest probability is for an eruption from a regional mafic vent. Data on areas and volumes of lava flows and domes of the Lassen Volcanic Center and of eruptions from the regional mafic vents provide constraints on the probable sizes of future eruptions. Probabilities of lava-flow coverage are similar for the Lassen Volcanic Center and for regional mafic vents, whereas the probable eruptive volumes for the mafic vents are generally smaller. Data have been compiled for large explosive eruptions (deposit volumes greater than about 5 km³) in the Cascade Range during the past 1.2 m.y. in order to estimate probabilities of eruption. For erupted volumes greater than about 5 km³, the rate of occurrence since 13.6 ka is much higher than for the entire period, and we use these data to calculate an annual probability of a large eruption of 4.6×10⁻⁴. For erupted volumes ≥10 km³, the rate of occurrence has been reasonably constant from 630 ka to the present, giving more confidence in the estimate, and we use those data to calculate an annual probability of a large eruption of 1.4×10⁻⁵.

  14. Burst wait time simulation of CALIBAN reactor at delayed super-critical state

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Humbert, P.; Authier, N.; Richard, B.

    2012-07-01

    In the past, the super-prompt-critical wait time probability distribution was measured on the CALIBAN fast burst reactor [4]. Afterwards, these experiments were simulated with very good agreement by solving the non-extinction probability equation [5]. Recently, the burst wait time probability distribution has been measured at CEA-Valduc on CALIBAN at different delayed super-critical states [6]. In the delayed super-critical case, however, the non-extinction probability does not give access to the wait time distribution; it is then necessary to compute the time-dependent evolution of the full neutron count number probability distribution. In this paper we present the point-model deterministic method used to calculate the probability distribution of the wait time before a prescribed count level, taking into account prompt neutrons and delayed neutron precursors. This method is based on the solution of the time-dependent adjoint Kolmogorov master equations for the number of detections, using the generating function methodology [8,9,10] and inverse discrete Fourier transforms. The obtained results are then compared to the measurements and to Monte-Carlo calculations based on the algorithm presented in [7]. (authors)

  15. Steady state, relaxation and first-passage properties of a run-and-tumble particle in one-dimension

    NASA Astrophysics Data System (ADS)

    Malakar, Kanaya; Jemseena, V.; Kundu, Anupam; Vijay Kumar, K.; Sabhapandit, Sanjib; Majumdar, Satya N.; Redner, S.; Dhar, Abhishek

    2018-04-01

    We investigate the motion of a run-and-tumble particle (RTP) in one dimension. We find the exact probability distribution of the particle with and without diffusion on the infinite line, as well as in a finite interval. In the infinite domain, this probability distribution approaches a Gaussian form in the long-time limit, as in the case of a regular Brownian particle. At intermediate times, this distribution exhibits unexpected multi-modal forms. In a finite domain, the probability distribution reaches a steady-state form with peaks at the boundaries, in contrast to a Brownian particle. We also study the relaxation to the steady-state analytically. Finally we compute the survival probability of the RTP in a semi-infinite domain with an absorbing boundary condition at the origin. In the finite interval, we compute the exit probability and the associated exit times. We provide numerical verification of our analytical results.

  16. Evaluation of the microscopic distribution of florfenicol in feed pellets for salmon by Fourier Transform infrared imaging and multivariate analysis.

    PubMed

    Bastidas, Camila Y; von Plessing, Carlos; Troncoso, José; Del P Castillo, Rosario

    2018-04-15

    Fourier Transform infrared imaging and multivariate analysis were used to identify, at the microscopic level, the presence of florfenicol (FF), a heavily used antibiotic in the salmon industry, supplied to fish in feed pellets for the treatment of salmonid rickettsial septicemia (SRS). The FF distribution was evaluated using Principal Component Analysis (PCA) and Augmented Multivariate Curve Resolution with Alternating Least Squares (augmented MCR-ALS) on the spectra obtained from images with pixel sizes of 6.25 μm × 6.25 μm and 1.56 μm × 1.56 μm, in different zones of the feed pellets. Given that the concentration of the drug was only 3.44 mg FF/g pellet, this is the first report showing the power of spectroscopic techniques combined with multivariate analysis, especially augmented MCR-ALS, to describe the FF distribution in both the surface and inner parts of feed pellets at low concentration, in a complex matrix, and at the microscopic level. The results allow the incorporation of the drug into the feed pellets to be monitored. Copyright © 2018 Elsevier B.V. All rights reserved.

  17. Fitness Probability Distribution of Bit-Flip Mutation.

    PubMed

    Chicano, Francisco; Sutton, Andrew M; Whitley, L Darrell; Alba, Enrique

    2015-01-01

    Bit-flip mutation is a common mutation operator for evolutionary algorithms applied to optimize functions over binary strings. In this paper, we develop results from the theory of landscapes and Krawtchouk polynomials to exactly compute the probability distribution of fitness values of a binary string undergoing uniform bit-flip mutation. We prove that this probability distribution can be expressed as a polynomial in p, the probability of flipping each bit. We analyze these polynomials and provide closed-form expressions for an easy linear problem (Onemax), and an NP-hard problem, MAX-SAT. We also discuss a connection of the results with runtime analysis.
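
    For the Onemax case, the exact distribution is easy to spell out: a parent with k ones out of n bits loses X ~ Bin(k, p) ones and gains Y ~ Bin(n − k, p), so the mutated fitness is k − X + Y and its probability mass function is a convolution whose coefficients are polynomials in p, as the paper proves.

    ```python
    import numpy as np
    from scipy.stats import binom

    n, k, p = 20, 12, 0.05        # string length, parent fitness, flip probability
    pmf = np.zeros(n + 1)
    for x in range(k + 1):                 # ones flipped to zero
        for y in range(n - k + 1):         # zeros flipped to one
            pmf[k - x + y] += binom.pmf(x, k, p) * binom.pmf(y, n - k, p)

    print(pmf.sum(), pmf.argmax())   # sums to 1; mode of the mutated fitness
    ```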

  18. The Impact of an Instructional Intervention Designed to Support Development of Stochastic Understanding of Probability Distribution

    ERIC Educational Resources Information Center

    Conant, Darcy Lynn

    2013-01-01

    Stochastic understanding of probability distribution undergirds development of conceptual connections between probability and statistics and supports development of a principled understanding of statistical inference. This study investigated the impact of an instructional course intervention designed to support development of stochastic…

  19. Use of uninformative priors to initialize state estimation for dynamical systems

    NASA Astrophysics Data System (ADS)

    Worthy, Johnny L.; Holzinger, Marcus J.

    2017-10-01

    The admissible region must be expressed probabilistically in order to be used in Bayesian estimation schemes. When treated as a probability density function (PDF), a uniform admissible region can be shown to have non-uniform probability density after a transformation. An alternative approach can be used to express the admissible region probabilistically according to the Principle of Transformation Groups. This paper uses a fundamental multivariate probability transformation theorem to show that regardless of which state space an admissible region is expressed in, the probability density must remain the same under the Principle of Transformation Groups. The admissible region can be shown to be analogous to an uninformative prior with a probability density that remains constant under reparameterization. This paper introduces requirements on how these uninformative priors may be transformed and used for state estimation and the difference in results when initializing an estimation scheme via a traditional transformation versus the alternative approach.
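
    The transformation theorem invoked here is the standard change-of-variables rule for probability densities: for a differentiable bijection y = g(x), the density transforms with the Jacobian determinant of the inverse map, as rendered below.

    ```latex
    \[
      p_{\mathbf{Y}}(\mathbf{y})
      = p_{\mathbf{X}}\!\left(g^{-1}(\mathbf{y})\right)
        \left|\det \frac{\partial g^{-1}(\mathbf{y})}{\partial \mathbf{y}}\right|
    \]
    ```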

  20. Probability distributions of the electroencephalogram envelope of preterm infants.

    PubMed

    Saji, Ryoya; Hirasawa, Kyoko; Ito, Masako; Kusuda, Satoshi; Konishi, Yukuo; Taga, Gentaro

    2015-06-01

    To determine the stationary characteristics of electroencephalogram (EEG) envelopes for prematurely born (preterm) infants and investigate the intrinsic characteristics of early brain development in preterm infants. Twenty neurologically normal sets of EEGs recorded in infants with a post-conceptional age (PCA) range of 26-44 weeks (mean 37.5 ± 5.0 weeks) were analyzed. Hilbert transform was applied to extract the envelope. We determined the suitable probability distribution of the envelope and performed a statistical analysis. It was found that (i) the probability distributions for preterm EEG envelopes were best fitted by lognormal distributions at 38 weeks PCA or less, and by gamma distributions at 44 weeks PCA; (ii) the scale parameter of the lognormal distribution had positive correlations with PCA as well as a strong negative correlation with the percentage of low-voltage activity; (iii) the shape parameter of the lognormal distribution had significant positive correlations with PCA; (iv) the statistics of mode showed significant linear relationships with PCA, and, therefore, it was considered a useful index in PCA prediction. These statistics, including the scale parameter of the lognormal distribution and the skewness and mode derived from a suitable probability distribution, may be good indexes for estimating stationary nature in developing brain activity in preterm infants. The stationary characteristics, such as discontinuity, asymmetry, and unimodality, of preterm EEGs are well indicated by the statistics estimated from the probability distribution of the preterm EEG envelopes. Copyright © 2014 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

  1. Nomogram for prediction of level 2 axillary lymph node metastasis in proven level 1 node-positive breast cancer patients.

    PubMed

    Jiang, Yanlin; Xu, Hong; Zhang, Hao; Ou, Xunyan; Xu, Zhen; Ai, Liping; Sun, Lisha; Liu, Caigang

    2017-09-22

    The current management of the axilla in level 1 node-positive breast cancer patients is axillary lymph node dissection regardless of the status of the level 2 axillary lymph nodes. The goal of this study was to develop a nomogram predicting the probability of level 2 axillary lymph node metastasis (L-2-ALNM) in patients with level 1 axillary node-positive breast cancer. We reviewed the records of 974 patients with pathology-confirmed level 1 node-positive breast cancer between 2010 and 2014 at the Liaoning Cancer Hospital and Institute. The patients were randomized 1:1 and divided into a modeling group and a validation group. Clinical and pathological features of the patients were assessed with uni- and multivariate logistic regression. A nomogram based on independent predictors for the L-2-ALNM identified by multivariate logistic regression was constructed. Independent predictors of L-2-ALNM by the multivariate logistic regression analysis included tumor size, Ki-67 status, histological grade, and number of positive level 1 axillary lymph nodes. The areas under the receiver operating characteristic curve of the modeling set and the validation set were 0.828 and 0.816, respectively. The false-negative rates of the L-2-ALNM nomogram were 1.82% and 7.41% for the predicted probability cut-off points of < 6% and < 10%, respectively, when applied to the validation group. Our nomogram could help predict L-2-ALNM in patients with level 1 axillary lymph node metastasis. Patients with a low probability of L-2-ALNM could be spared level 2 axillary lymph node dissection, thereby reducing postoperative morbidity.

  2. Flood Frequency Curves - Use of information on the likelihood of extreme floods

    NASA Astrophysics Data System (ADS)

    Faber, B.

    2011-12-01

    Investment in the infrastructure that reduces flood risk for flood-prone communities must incorporate information on the magnitude and frequency of flooding in the area. Traditionally, that information has been a probability distribution of annual maximum streamflows developed from the historical gaged record at a stream site. Practice in the United States fits a log-Pearson Type III distribution to the annual maximum flows of an unimpaired streamflow record, using the method of moments to estimate the distribution parameters. The procedure assumes that annual peak streamflow events are (1) independent, (2) identically distributed, and (3) form a representative sample of the overall probability distribution. Each of these assumptions can be challenged. We rarely have enough data to form a representative sample, and therefore must compute and display the uncertainty in the estimated flood distribution. But is there a wet/dry cycle that makes precipitation less than independent between successive years? Are the peak flows caused by different types of events from different statistical populations? How does a watershed or climate that changes over time (non-stationarity) affect the probability distribution of floods? Potential approaches to avoiding these assumptions range from estimating trend and shift and removing them from early data (thereby forming a homogeneous data set) to methods that estimate statistical parameters that vary with time. A further issue in estimating a probability distribution of flood magnitude (the flood frequency curve) is whether a purely statistical approach can accurately capture the range and frequency of floods that are of interest. A meteorologically based analysis produces a "probable maximum precipitation" (PMP) and subsequently a "probable maximum flood" (PMF) that attempt to describe an upper bound on flood magnitude in a particular watershed. This analysis can help constrain the upper tail of the probability distribution, well beyond the range of gaged data or even historical or paleo-flood data, which can be very important in risk analyses performed for flood risk management and dam and levee safety studies.
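
    A sketch of the fitting practice described above: method-of-moments log-Pearson Type III on the base-10 logarithms of annual peaks, read off at the 1% annual exceedance probability (the "100-year" flood). The peak data are synthetic.

    ```python
    import numpy as np
    from scipy.stats import pearson3, skew

    rng = np.random.default_rng(3)
    peaks = rng.lognormal(mean=6.0, sigma=0.5, size=60)   # synthetic annual maxima

    logq = np.log10(peaks)
    g = skew(logq, bias=False)                            # station skew of the logs
    q100 = 10 ** pearson3.ppf(0.99, g, loc=logq.mean(), scale=logq.std(ddof=1))
    print(q100)    # flow with a 1% chance of being exceeded in any year
    ```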

  3. A Hybrid Semi-supervised Classification Scheme for Mining Multisource Geospatial Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vatsavai, Raju; Bhaduri, Budhendra L

    2011-01-01

    Supervised learning methods such as Maximum Likelihood (ML) are often used in land cover (thematic) classification of remote sensing imagery. The ML classifier relies exclusively on the spectral characteristics of thematic classes, whose statistical distributions (class-conditional probability densities) are often overlapping. The spectral response distributions of thematic classes depend on many factors, including elevation, soil types, and ecological zones. A second problem with statistical classifiers is the requirement of a large number of accurate training samples (10 to 30 times the number of dimensions), which are often costly and time consuming to acquire over large geographic regions. With the increasing availability of geospatial databases, it is possible to exploit the knowledge derived from these ancillary datasets to improve classification accuracies even when the class distributions are highly overlapping. Likewise, newer semi-supervised techniques can be adopted to improve the parameter estimates of the statistical model by utilizing a large number of easily available unlabeled training samples. Unfortunately, there is no convenient multivariate statistical model that can be employed for multisource geospatial databases. In this paper we present a hybrid semi-supervised learning algorithm that effectively exploits freely available unlabeled training samples from multispectral remote sensing images and also incorporates ancillary geospatial databases. We have conducted several experiments on real datasets, and our new hybrid approach shows a 25 to 35% improvement in overall classification accuracy over conventional classification schemes.

  4. DEFINITION OF MULTIVARIATE GEOCHEMICAL ASSOCIATIONS WITH POLYMETALLIC MINERAL OCCURRENCES USING A SPATIALLY DEPENDENT CLUSTERING TECHNIQUE AND RASTERIZED STREAM SEDIMENT DATA - AN ALASKAN EXAMPLE.

    USGS Publications Warehouse

    Jenson, Susan K.; Trautwein, C.M.

    1984-01-01

    The application of an unsupervised, spatially dependent clustering technique (AMOEBA) to interpolated raster arrays of stream sediment data has been found to provide useful multivariate geochemical associations for modeling regional polymetallic resource potential. The technique is based on three assumptions regarding the compositional and spatial relationships of stream sediment data and their regional significance. These assumptions are: (1) compositionally separable classes exist and can be statistically distinguished; (2) the classification of multivariate data should minimize the pair probability of misclustering to establish useful compositional associations; and (3) a compositionally defined class represented by three or more contiguous cells within an array is a more important descriptor of a terrane than a class represented by spatial outliers.

  5. Effect of Climate Change on Mediterranean Winter Ranges of Two Migratory Passerines

    PubMed Central

    Tellería, José L.; Fernández-López, Javier; Fandos, Guillermo

    2016-01-01

    We studied the effect of climate change on the distribution of two insectivorous passerines (the meadow pipit Anthus pratensis and the chiffchaff Phylloscopus collybita) in wintering grounds of the Western Mediterranean basin. In this region, precipitation and temperature can affect the distribution of these birds through direct (thermoregulation costs) or indirect effects (primary productivity). Thus, it can be postulated that projected climate changes in the region will affect the extent and suitability of their wintering grounds. We studied pipit and chiffchaff abundance in several hundred localities along a belt crossing Spain and Morocco and assessed the effects of climate and other geographical and habitat predictors on bird distribution. Multivariate analyses reported a positive effect of temperature on the present distribution of the two species, with an additional effect of precipitation on the meadow pipit. These climate variables were used with Maxent to model the occurrence probabilities of species using ring recoveries as presence data. Abundance and occupancy of the two species in the study localities adjusted to the distribution models, with more birds in sectors of high climate suitability. After validation, these models were used to forecast the distribution of climate suitability according to climate projections for 2050–2070 (temperature increase and precipitation reduction). Results show an expansion of climatically suitable sectors into the highlands by the effect of warming on the two species, and a retreat of the meadow pipit from southern sectors related to rain reduction. The predicted patterns show a mean increase in climate suitability for the two species due to the warming of the large highland expanses typical of the western Mediterranean. PMID:26761791

  6. Probability of cesarean delivery after successful external cephalic version.

    PubMed

    Burgos, Jorge; Iglesias, María; Pijoan, José I; Rodriguez, Leire; Fernández-Llebrez, Luis; Martínez-Astorquiza, Txantón

    2015-11-01

    To identify factors associated with cesarean delivery following successful external cephalic version (ECV). In a prospective study, data were obtained for ECV procedures performed at Cruces University Hospital, Spain, between March 2002 and June 2012. Women with a singleton pregnancy who had a successful, uncomplicated ECV and whose delivery was assisted at the study hospital, with the fetus in cephalic presentation, were included. A multivariate model of risk factors of cesarean delivery was developed. Among 627 women included, 92 (14.7%) delivered by cesarean. A cesarean was performed among 33 (8.5%) of 387 women with spontaneous labor versus 59 (24.6%) of 240 who were induced (P < 0.001). Multivariate analysis showed that higher BMI (P = 0.006), labor induction (P = 0.001), and prior cesarean (P < 0.001) were associated with cesarean. Time between ECV and delivery was inversely associated with probability of cesarean during the first 2 weeks. Thus, the probabilities of cesarean delivery on the first day were 0.53 (95% CI 0.35-0.71) and 0.34 (95% CI 0.18-0.51) following induced and spontaneous labor, respectively. On the seventh day, the probabilities were 0.23 (95% CI 0.15-0.32) and 0.12 (95% CI 0.07-0.18), respectively. Following ECV, induction of labor, an interval of less than 2 weeks to delivery, BMI, and previous cesarean were associated with an increased risk of cesarean. Copyright © 2015 International Federation of Gynecology and Obstetrics. Published by Elsevier Ireland Ltd. All rights reserved.

  7. Stylized facts in internal rates of return on stock index and its derivative transactions

    NASA Astrophysics Data System (ADS)

    Pichl, Lukáš; Kaizoji, Taisei; Yamano, Takuya

    2007-08-01

    Universal features in stock markets and their derivative markets are studied by means of probability distributions of internal rates of return on buy-and-sell transaction pairs. Unlike the stylized facts in normalized log returns, the probability distributions for such single-asset encounters incorporate the time factor by means of the internal rate of return, defined as the continuous compound interest. The resulting stylized facts are shown in the probability distributions derived from the daily series of TOPIX, S&P 500 and FTSE 100 index close values. The application of the above analysis to minute-tick data of the NIKKEI 225 and its futures market reveals an interesting difference in the behavior of the two probability distributions when a threshold on the minimal duration of the long position is imposed. It is therefore suggested that the probability distributions of the internal rates of return could be used for causality mining between the underlying and derivative stock markets. The highly specific discrete spectrum, which results from noise trader strategies, as opposed to the smooth distributions observed for fundamentalist strategies in single-encounter transactions, may be useful in deducing the type of investment strategy from the trading revenues of small portfolio investors.

  8. Probabilistic Reasoning for Robustness in Automated Planning

    NASA Technical Reports Server (NTRS)

    Schaffer, Steven; Clement, Bradley; Chien, Steve

    2007-01-01

    A general-purpose computer program for planning the actions of a spacecraft or other complex system has been augmented by incorporating a subprogram that reasons about uncertainties in such continuous variables as the times taken to perform tasks and the amounts of resources to be consumed. This subprogram computes parametric probability distributions for time and resource variables on the basis of user-supplied models of actions and the resources that they consume. The current system accepts bounded Gaussian distributions over action duration and resource use. The distributions are combined during planning to determine the net probability distribution of each resource at any time point. In addition to a full combinatoric approach, several approximations for arriving at these combined distributions are available, including maximum-likelihood and pessimistic algorithms. Each such probability distribution can then be integrated to obtain the probability that execution of the plan under consideration would violate any constraints on the resource. The key idea is to use these probabilities of conflict to score potential plans and drive a search toward planning low-risk actions. An output plan provides a balance between the user's specified risk aversion and other measures of optimality.

  9. Mathematical Model to estimate the wind power using four-parameter Burr distribution

    NASA Astrophysics Data System (ADS)

    Liu, Sanming; Wang, Zhijie; Pan, Zhaoxu

    2018-03-01

    When the true probability distribution of wind speed at a given site needs to be described, the four-parameter Burr distribution is more suitable than other distributions. This paper introduces its important properties and characteristics, discusses the application of the four-parameter Burr distribution to wind speed prediction, and derives an expression for the probability distribution of the output power of a wind turbine.

  10. Entropy Methods For Univariate Distributions in Decision Analysis

    NASA Astrophysics Data System (ADS)

    Abbas, Ali E.

    2003-03-01

    One of the most important steps in decision analysis practice is the elicitation of the decision-maker's belief about an uncertainty of interest in the form of a representative probability distribution. However, the probability elicitation process is a task that involves many cognitive and motivational biases. Alternatively, the decision-maker may provide other information about the distribution of interest, such as its moments, and the maximum entropy method can be used to obtain a full distribution subject to the given moment constraints. In practice, however, decision makers cannot readily provide moments for the distribution, and are much more comfortable providing information about the fractiles of the distribution of interest or bounds on its cumulative probabilities. In this paper we present a graphical method to determine the maximum entropy distribution between upper and lower probability bounds and provide an interpretation for the shape of the maximum entropy distribution subject to fractile constraints (FMED). We also discuss the limitations of the FMED, namely that it is discontinuous and flat over each fractile interval. We present a heuristic approximation to a distribution when, in addition to its fractiles, we also know it is continuous, and work through full examples to illustrate the approach.

  11. Modelling Spatial Dependence Structures Between Climate Variables by Combining Mixture Models with Copula Models

    NASA Astrophysics Data System (ADS)

    Khan, F.; Pilz, J.; Spöck, G.

    2017-12-01

    Spatio-temporal dependence structures play a pivotal role in understanding the meteorological characteristics of a basin or sub-basin. They further affect the hydrological conditions, and analyses that do not account for them properly will provide misleading results. In this study we modeled the spatial dependence structure between climate variables, including maximum temperature, minimum temperature and precipitation, in the monsoon-dominated region of Pakistan. Six meteorological stations were considered for temperature and four for precipitation. For modelling the dependence structure between temperature and precipitation at multiple sites, we utilized C-Vine, D-Vine and Student t-copula models. Under the copula models, multivariate mixture normal distributions were used as marginals for temperature and gamma distributions for precipitation. The C-Vine, D-Vine and Student t-copula models were compared on observed versus simulated spatial dependence structures to choose an appropriate model for the climate data. The results show that all copula models performed well, with subtle differences in performance. The copula models captured the patterns of spatial dependence between climate variables at multiple meteorological sites; however, the t-copula performed poorly in reproducing the magnitude of the dependence structure. Important statistics of the observed data were closely approximated, except for the maximum values of temperature and the minimum values of minimum temperature. Probability density functions of the simulated data closely follow those of the observed data for all variables. C- and D-Vines are the better tools for modelling the dependence between variables, although the Student t-copula competes closely for precipitation. Keywords: Copula model, C-Vine, D-Vine, Spatial dependence structure, Monsoon dominated region of Pakistan, Mixture models, EM algorithm.
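
    A hedged sketch of the Student t-copula step with gamma marginals, using only SciPy (the vine constructions need more machinery than a few lines); the correlation, degrees of freedom and marginal parameters are invented for illustration:

    ```python
    # Simulate spatially dependent precipitation at two sites: a bivariate
    # Student t-copula coupled with gamma marginals. Parameters are invented.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    nu, rho, n = 5.0, 0.7, 10_000               # copula dof, correlation, draws
    corr = np.array([[1.0, rho], [rho, 1.0]])

    # Draw from a bivariate t, then map each margin to U(0,1) via the t CDF.
    z = rng.multivariate_normal([0.0, 0.0], corr, size=n)
    w = nu / rng.chisquare(nu, size=(n, 1))
    t_samples = z * np.sqrt(w)
    u = stats.t.cdf(t_samples, df=nu)

    # Apply inverse gamma-marginal CDFs to obtain correlated precipitation.
    precip_site1 = stats.gamma.ppf(u[:, 0], a=2.0, scale=10.0)
    precip_site2 = stats.gamma.ppf(u[:, 1], a=1.5, scale=15.0)
    print(np.corrcoef(precip_site1, precip_site2)[0, 1])
    ```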

  12. Work probability distribution for a ferromagnet with long-ranged and short-ranged correlations

    NASA Astrophysics Data System (ADS)

    Bhattacharjee, J. K.; Kirkpatrick, T. R.; Sengers, J. V.

    2018-04-01

    Work fluctuations and work probability distributions are fundamentally different in systems with short-ranged versus long-ranged correlations. Specifically, in systems with long-ranged correlations the work distribution is extraordinarily broad compared to systems with short-ranged correlations. This difference profoundly affects the possible applicability of fluctuation theorems like the Jarzynski fluctuation theorem. The Heisenberg ferromagnet, well below its Curie temperature, is a system with long-ranged correlations in very low magnetic fields due to the presence of Goldstone modes. As the magnetic field is increased the correlations gradually become short ranged. Hence, such a ferromagnet is an ideal system for elucidating the changes of the work probability distribution as one goes from a domain with long-ranged correlations to a domain with short-ranged correlations by tuning the magnetic field. A quantitative analysis of this crossover behavior of the work probability distribution and the associated fluctuations is presented.

  13. Modelling lifetime data with multivariate Tweedie distribution

    NASA Astrophysics Data System (ADS)

    Nor, Siti Rohani Mohd; Yusof, Fadhilah; Bahar, Arifah

    2017-05-01

    This study aims to measure the dependence between individual lifetimes by applying the multivariate Tweedie distribution to lifetime data. Incorporating dependence between lifetimes in the mortality model is a new idea that has a significant impact on the risk of an annuity portfolio, in contrast to standard actuarial methods, which assume independence between lifetimes. Hence, this paper applies the Tweedie family of distributions to a portfolio of lifetimes to induce dependence between lives. The Tweedie distribution is chosen since it encompasses symmetric and non-symmetric, as well as light-tailed and heavy-tailed, distributions. Parameter estimation is modified in order to fit the Tweedie distribution to the data; the procedure is developed using the method of moments. In addition, a comparison is made to check the adequacy of fit between observed and expected mortality. Finally, the importance of including systematic mortality risk in the model is justified by Pearson's chi-squared test.

  14. Methods to elicit probability distributions from experts: a systematic review of reported practice in health technology assessment.

    PubMed

    Grigore, Bogdan; Peters, Jaime; Hyde, Christopher; Stein, Ken

    2013-11-01

    Elicitation is a technique that can be used to obtain probability distributions from experts about unknown quantities. We conducted a methodology review of reports where probability distributions had been elicited from experts to be used in model-based health technology assessments. Databases including MEDLINE, EMBASE and the CRD database were searched from inception to April 2013. Reference lists were checked and citation mapping was also used. Studies describing their approach to the elicitation of probability distributions were included. Data were abstracted on pre-defined aspects of the elicitation technique. Reports were critically appraised on their consideration of the validity, reliability and feasibility of the elicitation exercise. Fourteen articles were included. Across these studies, the most marked features were heterogeneity in elicitation approach and failure to report key aspects of the elicitation method. The most frequently used approaches to elicitation were the histogram technique and the bisection method. Only three papers explicitly considered the validity, reliability and feasibility of the elicitation exercises. Judged by the studies identified in the review, reports of expert elicitation are insufficiently detailed, and this affects the perceived usability of expert-elicited probability distributions. In this context, the wider credibility of elicitation will only be improved by better reporting and greater standardisation of approach. Until then, the advantage of eliciting probability distributions from experts may be lost.

  15. Atmospheric conditions, lunar phases, and childbirth: a multivariate analysis

    NASA Astrophysics Data System (ADS)

    Ochiai, Angela Megumi; Gonçalves, Fabio Luiz Teixeira; Ambrizzi, Tercio; Florentino, Lucia Cristina; Wei, Chang Yi; Soares, Alda Valeria Neves; De Araujo, Natalucia Matos; Gualda, Dulce Maria Rosa

    2012-07-01

    Our objective was to assess extrinsic influences upon childbirth. In a cohort of 1,826 days comprising 17,417 childbirths, among them 13,252 spontaneous labor admissions, we used logistic regression to study the influence of the environment on days with a high incidence of labor (defined as the 75th percentile or higher). The predictors of high labor admission included increases in outdoor temperature (odds ratio: 1.742, P = 0.045, 95% CI: 1.011 to 3.001) and decreases in atmospheric pressure (odds ratio: 1.269, P = 0.029, 95% CI: 1.055 to 1.483). In contrast, increases in tidal range were associated with a lower probability of high admission (odds ratio: 0.762, P = 0.030, 95% CI: 0.515 to 0.999). Lunar phase was not a predictor of high labor admission (P = 0.339). Using multivariate analysis, increases in temperature and decreases in atmospheric pressure predicted high labor admission, and increases in tidal range, as a measurement of the lunar gravitational force, predicted a lower probability of high admission.

  16. Multivariate statistical model for 3D image segmentation with application to medical images.

    PubMed

    John, Nigel M; Kabuka, Mansur R; Ibrahim, Mohamed O

    2003-12-01

    In this article we describe a statistical model that was developed to segment brain magnetic resonance images. The statistical segmentation algorithm was applied after a pre-processing stage involving the use of a 3D anisotropic filter along with histogram equalization techniques. The segmentation algorithm makes use of prior knowledge and a probability-based multivariate model designed to semi-automate the process of segmentation. The algorithm was applied to images obtained from the Center for Morphometric Analysis at Massachusetts General Hospital as part of the Internet Brain Segmentation Repository (IBSR). The developed algorithm showed improved accuracy over the k-means, adaptive maximum a priori probability (MAP), biased MAP, and other algorithms. Experimental results showing the segmentation and the results of comparisons with other algorithms are provided. Results are based on an overlap criterion against expertly segmented images from the IBSR. The algorithm produced average results of approximately 80% overlap with the expertly segmented images (compared with 85% for manual segmentation and 55% for other algorithms).
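
    The overlap criterion is not fully specified in the abstract; the following small sketch assumes a Jaccard-type overlap between binary label masks:

    ```python
    # Jaccard overlap between a segmentation and an expert labeling, as one
    # assumed reading of the abstract's "overlap criterion".
    import numpy as np

    def jaccard_overlap(seg, truth):
        """seg, truth: boolean arrays marking a tissue class in two labelings."""
        seg, truth = np.asarray(seg, bool), np.asarray(truth, bool)
        inter = np.logical_and(seg, truth).sum()
        union = np.logical_or(seg, truth).sum()
        return inter / union if union else 1.0

    a = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
    b = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)
    print(jaccard_overlap(a, b))  # 2 / 4 = 0.5
    ```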

  17. Effect of abdominopelvic abscess drain size on drainage time and probability of occlusion

    PubMed Central

    Rotman, Jessica A.; Getrajdman, George I.; Maybody, Majid; Erinjeri, Joseph P.; Yarmohammadi, Hooman; Sofocleous, Constantinos T.; Solomon, Stephen B.; Boas, F. Edward

    2016-01-01

    Background: The purpose of this study is to determine whether larger abdominopelvic abscess drains reduce the time required for abscess resolution, or the probability of tube occlusion. Methods: 144 consecutive patients who underwent abscess drainage at a single institution were reviewed retrospectively. Results: Larger initial drain size did not reduce drainage time, drain occlusion, or drain exchanges (p>0.05). Subgroup analysis did not find any type of collection that benefitted from larger drains. A multivariate model predicting drainage time showed that large collections (>200 ml) required 16 days longer drainage time than small collections (<50 ml). Collections with a fistula to bowel required 17 days longer drainage time than collections without a fistula. Initial drain size and the viscosity of the fluid in the collection had no significant effect on drainage time in the multivariate model. Conclusions: 8 F drains are adequate for initial drainage of most serous and serosanguineous collections. 10 F drains are adequate for initial drainage of most purulent or bloody collections. PMID:27634422

  18. Impact of statistical learning methods on the predictive power of multivariate normal tissue complication probability models.

    PubMed

    Xu, Cheng-Jian; van der Schaaf, Arjen; Schilstra, Cornelis; Langendijk, Johannes A; van't Veld, Aart A

    2012-03-15

    To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended. Copyright © 2012 Elsevier Inc. All rights reserved.
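
    A minimal sketch of the LASSO step on synthetic stand-in data: L1-penalised logistic regression scored by repeated cross-validation, as one plausible reading of the study's setup (the predictors, penalty strength and fold counts are illustrative):

    ```python
    # LASSO-style variable selection for an NTCP-type model: L1-penalised
    # logistic regression with repeated cross-validation on synthetic data.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score, RepeatedStratifiedKFold

    rng = np.random.default_rng(2)
    X = rng.normal(size=(200, 20))                  # candidate predictors
    logit = X[:, 0] * 1.5 - X[:, 1] + rng.normal(size=200)
    y = (logit > 0).astype(int)                     # toxicity indicator

    model = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
    auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    print("cross-validated AUC: %.3f +/- %.3f" % (auc.mean(), auc.std()))

    model.fit(X, y)
    print("selected predictors:", np.flatnonzero(model.coef_))
    ```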

  19. Modeling and Simulation of Upset-Inducing Disturbances for Digital Systems in an Electromagnetic Reverberation Chamber

    NASA Technical Reports Server (NTRS)

    Torres-Pomales, Wilfredo

    2014-01-01

    This report describes a modeling and simulation approach for disturbance patterns representative of the environment experienced by a digital system in an electromagnetic reverberation chamber. The disturbance is modeled by a multivariate statistical distribution based on empirical observations. Extended versions of the Rejection Sampling and Inverse Transform Sampling techniques are developed to generate multivariate random samples of the disturbance. The results show that Inverse Transform Sampling returns samples with higher fidelity relative to the empirical distribution. This work is part of an ongoing effort to develop a resilience assessment methodology for complex safety-critical distributed systems.
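
    The report's extended multivariate samplers are not reproduced here, but the univariate building block is simple; a sketch of Inverse Transform Sampling from an empirical distribution, with placeholder data:

    ```python
    # Univariate Inverse Transform Sampling from an empirical distribution,
    # the building block the report extends to the multivariate case.
    import numpy as np

    def inverse_transform_sample(observations, n, rng):
        """Draw n samples whose distribution matches the empirical CDF."""
        u = rng.uniform(0.0, 1.0, size=n)
        # Map each uniform draw through the empirical quantile function.
        return np.quantile(np.sort(observations), u)

    rng = np.random.default_rng(3)
    observed = rng.gamma(shape=2.0, scale=1.0, size=500)  # stand-in data
    samples = inverse_transform_sample(observed, 10_000, rng)
    print(observed.mean(), samples.mean())
    ```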

  20. Computer simulation of random variables and vectors with arbitrary probability distribution laws

    NASA Technical Reports Server (NTRS)

    Bogdan, V. M.

    1981-01-01

    Assume that there is given an arbitrary n-dimensional probability distribution F. A recursive construction is found for a sequence of functions $x_1 = f_1(U_1, \ldots, U_n), \ldots, x_n = f_n(U_1, \ldots, U_n)$ such that if $U_1, \ldots, U_n$ are independent random variables having a uniform distribution over the open interval (0,1), then the joint distribution of the variables $x_1, \ldots, x_n$ coincides with the distribution F. Since uniform independent random variables can be well simulated by means of a computer, this result allows one to simulate arbitrary n random variables if their joint probability distribution is known.
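
    For n = 2 and a standard bivariate normal target, the recursive construction can be written explicitly: $x_1$ is the marginal quantile of $U_1$ and $x_2$ is the conditional quantile of $U_2$ given $x_1$. A sketch with an arbitrarily chosen correlation:

    ```python
    # Recursive (conditional-quantile) construction for a bivariate normal:
    # x1 = F1^{-1}(U1), then x2 = F_{2|1}^{-1}(U2; x1). rho is illustrative.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(4)
    rho, n = 0.8, 100_000
    u1, u2 = rng.uniform(size=n), rng.uniform(size=n)

    x1 = norm.ppf(u1)                                           # marginal quantile
    x2 = norm.ppf(u2, loc=rho * x1, scale=np.sqrt(1 - rho**2))  # conditional quantile

    print(np.corrcoef(x1, x2)[0, 1])  # close to rho
    ```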

  1. Nuclear risk analysis of the Ulysses mission

    NASA Astrophysics Data System (ADS)

    Bartram, Bart W.; Vaughan, Frank R.; Englehart, Richard W., Dr.

    1991-01-01

    The use of a radioisotope thermoelectric generator fueled with plutonium-238 dioxide on the Space Shuttle-launched Ulysses mission implies some level of risk due to potential accidents. This paper describes the method used to quantify risks in the Ulysses mission Final Safety Analysis Report prepared for the U.S. Department of Energy. The starting point for the analysis described herein is the input of source term probability distributions provided by the General Electric Company. A Monte Carlo technique is used to develop probability distributions of radiological consequences for a range of accident scenarios throughout the mission. Factors affecting radiological consequences are identified, the probability distribution of the effect of each factor determined, and the functional relationship among all the factors established. The probability distributions of all the factor effects are then combined using a Monte Carlo technique. The results of the analysis are presented in terms of complementary cumulative distribution functions (CCDF) by mission sub-phase, phase, and the overall mission. The CCDFs show the total probability that consequences (calculated health effects) would be equal to or greater than a given value.
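
    A small sketch of the final presentation step, estimating a CCDF from Monte Carlo consequence samples; the lognormal stand-in data are illustrative, not mission results:

    ```python
    # Turn Monte Carlo consequence samples into a complementary cumulative
    # distribution function (CCDF): P(consequence >= x).
    import numpy as np

    def ccdf(samples):
        x = np.sort(samples)
        p = 1.0 - np.arange(len(x)) / len(x)  # exceedance probability at each x
        return x, p

    rng = np.random.default_rng(5)
    consequences = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)  # stand-ins
    x, p = ccdf(consequences)
    idx = min(np.searchsorted(x, 5.0), len(x) - 1)
    print("P(consequence >= 5):", p[idx])
    ```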

  2. Force Density Function Relationships in 2-D Granular Media

    NASA Technical Reports Server (NTRS)

    Youngquist, Robert C.; Metzger, Philip T.; Kilts, Kelly N.

    2004-01-01

    An integral transform relationship is developed to convert between two important probability density functions (distributions) used in the study of contact forces in granular physics. Developing this transform has now made it possible to compare and relate various theoretical approaches with one another and with the experimental data despite the fact that one may predict the Cartesian probability density and another the force magnitude probability density. Also, the transforms identify which functional forms are relevant to describe the probability density observed in nature, and so the modified Bessel function of the second kind has been identified as the relevant form for the Cartesian probability density corresponding to exponential forms in the force magnitude distribution. Furthermore, it is shown that this transform pair supplies a sufficient mathematical framework to describe the evolution of the force magnitude distribution under shearing. Apart from the choice of several coefficients, whose evolution of values must be explained in the physics, this framework successfully reproduces the features of the distribution that are taken to be an indicator of jamming and unjamming in a granular packing. Keywords: Granular physics, Probability density functions, Fourier transforms.

  3. An evaluation of procedures to estimate monthly precipitation probabilities

    NASA Astrophysics Data System (ADS)

    Legates, David R.

    1991-01-01

    Many frequency distributions have been used to evaluate monthly precipitation probabilities. Eight of these distributions (including Pearson type III, extreme value, and transform-normal probability density functions) are comparatively examined to determine their ability to represent accurately variations in monthly precipitation totals for global hydroclimatological analyses. Results indicate that a modified version of the Box-Cox transform-normal distribution more adequately describes the 'true' precipitation distribution than does any of the other methods. This assessment was made using a cross-validation procedure for a global network of 253 stations for which at least 100 years of monthly precipitation totals were available.
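
    A hedged sketch of a Box-Cox transform-normal fit with SciPy on synthetic monthly totals (the modified version used in the study is not reproduced here):

    ```python
    # Box-Cox transform-normal fit: estimate the transform parameter by MLE,
    # model the transformed totals as Gaussian, then evaluate a tail
    # probability. Data are synthetic placeholders.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    precip = rng.gamma(shape=3.0, scale=25.0, size=100)  # monthly totals, mm

    transformed, lam = stats.boxcox(precip)              # MLE of lambda
    mu, sigma = transformed.mean(), transformed.std(ddof=1)

    # Probability that a month exceeds 120 mm under the fitted model.
    threshold_t = stats.boxcox(np.array([120.0]), lmbda=lam)[0]
    print("lambda=%.2f  P(precip > 120 mm)=%.3f"
          % (lam, stats.norm.sf(threshold_t, mu, sigma)))
    ```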

  4. q-Gaussian distributions of leverage returns, first stopping times, and default risk valuations

    NASA Astrophysics Data System (ADS)

    Katz, Yuri A.; Tian, Li

    2013-10-01

    We study the probability distributions of daily leverage returns of 520 North American industrial companies that survive de-listing during the financial crisis, 2006-2012. We provide evidence that distributions of unbiased leverage returns of all individual firms belong to the class of q-Gaussian distributions with the Tsallis entropic parameter within the interval 1

  5. Probability weighted moments: Definition and relation to parameters of several distributions expressable in inverse form

    USGS Publications Warehouse

    Greenwood, J. Arthur; Landwehr, J. Maciunas; Matalas, N.C.; Wallis, J.R.

    1979-01-01

    Distributions whose inverse forms are explicitly defined, such as Tukey's lambda, may present problems in deriving their parameters by more conventional means. Probability weighted moments are introduced and shown to be potentially useful in expressing the parameters of these distributions.
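
    A minimal sketch of the sample probability weighted moments $b_r = E[X F(X)^r]$, checked against the known Gumbel relation scale $= (2 b_1 - b_0) / \ln 2$; the data are synthetic:

    ```python
    # Unbiased sample probability weighted moments from the ordered sample;
    # these moments underlie parameter estimates for inverse-form distributions.
    import numpy as np
    from math import comb

    def sample_pwm(data, r):
        x = np.sort(data)
        n = len(x)
        weights = np.array([comb(i, r) / comb(n - 1, r) for i in range(n)])
        return np.mean(weights * x)

    rng = np.random.default_rng(7)
    data = rng.gumbel(loc=10.0, scale=5.0, size=1000)
    b0, b1 = sample_pwm(data, 0), sample_pwm(data, 1)
    # For the Gumbel distribution, scale = (2*b1 - b0)/ln 2 -- a quick check.
    print("scale estimate:", (2 * b1 - b0) / np.log(2))
    ```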

  6. Univariate Probability Distributions

    ERIC Educational Resources Information Center

    Leemis, Lawrence M.; Luckett, Daniel J.; Powell, Austin G.; Vermeer, Peter E.

    2012-01-01

    We describe a web-based interactive graphic that can be used as a resource in introductory classes in mathematical statistics. This interactive graphic presents 76 common univariate distributions and gives details on (a) various features of the distribution such as the functional form of the probability density function and cumulative distribution…

  7. A probability space for quantum models

    NASA Astrophysics Data System (ADS)

    Lemmens, L. F.

    2017-06-01

    A probability space contains a set of outcomes, a collection of events formed by subsets of the set of outcomes, and probabilities defined for all events. A reformulation in terms of propositions allows one to use the maximum entropy method to assign the probabilities, taking some constraints into account. The construction of a probability space for quantum models is determined by the choice of propositions, the choice of constraints, and the probability assignment by the maximum entropy method. This approach shows how typical quantum distributions such as Maxwell-Boltzmann, Fermi-Dirac and Bose-Einstein are partly related to well-known classical distributions. The relation between the conditional probability density, given some averages as constraints, and the appropriate ensemble is elucidated.

  8. Multivariate Analysis, Retrieval, and Storage System (MARS). Volume 1: MARS System and Analysis Techniques

    NASA Technical Reports Server (NTRS)

    Hague, D. S.; Vanderberg, J. D.; Woodbury, N. W.

    1974-01-01

    A method for rapidly examining the probable applicability of weight estimating formulae to a specific aerospace vehicle design is presented. The Multivariate Analysis Retrieval and Storage System (MARS) comprises three computer programs which sequentially operate on the weight and geometry characteristics of past aerospace vehicle designs. Weight and geometric characteristics are stored in a set of data bases which are fully computerized. Additional data bases are readily added to the MARS system, and/or the existing data bases may be easily expanded to include additional vehicles or vehicle characteristics.

  9. Regional probability distribution of the annual reference evapotranspiration and its effective parameters in Iran

    NASA Astrophysics Data System (ADS)

    Khanmohammadi, Neda; Rezaie, Hossein; Montaseri, Majid; Behmanesh, Javad

    2017-10-01

    The reference evapotranspiration (ET0) plays an important role in water management plans in arid or semi-arid countries such as Iran, which makes the regional analysis of this parameter important. The ET0 process is affected by several meteorological parameters, such as wind speed, solar radiation, temperature and relative humidity; therefore, the effect of the distribution type of the effective meteorological variables on the ET0 distribution was analyzed. For this purpose, the regional probability distributions of the annual ET0 and its effective parameters were examined, using data recorded at 30 synoptic stations in Iran during 1960-2014. Using the probability plot correlation coefficient (PPCC) test and the L-moment method, five common distributions were compared and the best distribution was selected. The results of the PPCC test and the L-moment diagram indicated that the Pearson type III distribution was the best probability distribution for fitting annual ET0 and its four effective parameters. The RMSE results showed that the PPCC test and the L-moment method performed similarly for regional analysis of reference evapotranspiration and its effective parameters. The results also showed that the distribution type of the parameters which affect ET0 can affect the distribution of reference evapotranspiration.

  10. Probabilistic reasoning in data analysis.

    PubMed

    Sirovich, Lawrence

    2011-09-20

    This Teaching Resource provides lecture notes, slides, and a student assignment for a lecture on probabilistic reasoning in the analysis of biological data. General probabilistic frameworks are introduced, and a number of standard probability distributions are described using simple intuitive ideas. Particular attention is focused on random arrivals that are independent of prior history (Markovian events), with an emphasis on waiting times, Poisson processes, and Poisson probability distributions. The use of these various probability distributions is applied to biomedical problems, including several classic experimental studies.

  11. Multiobjective fuzzy stochastic linear programming problems with inexact probability distribution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hamadameen, Abdulqader Othman; Zainuddin, Zaitul Marlizawati

    This study deals with multiobjective fuzzy stochastic linear programming problems with an uncertain probability distribution, defined through fuzzy assertions by ambiguous experts. The problem formulation is presented, along with two solution strategies: the fuzzy transformation via a ranking function, and the stochastic transformation in which the α-cut technique and linguistic hedges are applied to the uncertain probability distribution. A development of Sen's method is employed to find a compromise solution, supported by an illustrative numerical example.

  12. Work probability distribution and tossing a biased coin

    NASA Astrophysics Data System (ADS)

    Saha, Arnab; Bhattacharjee, Jayanta K.; Chakraborty, Sagar

    2011-01-01

    We show that the rare events present in dissipated work that enter the Jarzynski equality, when mapped appropriately to the phenomenon of large deviations found in a biased coin toss, are enough to yield a quantitative work probability distribution for the Jarzynski equality. This allows us to propose a recipe for constructing the work probability distribution independent of the details of any relevant system. The underlying framework, developed herein, is expected to be of use in modeling other physical phenomena where rare events play an important role.

  13. Hybrid computer technique yields random signal probability distributions

    NASA Technical Reports Server (NTRS)

    Cameron, W. D.

    1965-01-01

    Hybrid computer determines the probability distributions of instantaneous and peak amplitudes of random signals. This combined digital and analog computer system reduces the errors and delays of manual data analysis.

  14. Fast Reliability Assessing Method for Distribution Network with Distributed Renewable Energy Generation

    NASA Astrophysics Data System (ADS)

    Chen, Fan; Huang, Shaoxiong; Ding, Jinjin; Ding, Jinjin; Gao, Bo; Xie, Yuguang; Wang, Xiaoming

    2018-01-01

    This paper proposes a fast reliability assessing method for a distribution grid with distributed renewable energy generation. First, the Weibull distribution and the Beta distribution are used to describe the probability distribution characteristics of wind speed and solar irradiance, respectively, and models of the wind farm, solar park and local load are built for reliability assessment. Then, based on power-system production cost simulation, probability discretization and linearized power flow, an optimal power flow problem with the objective of minimizing the cost of conventional power generation is solved. A reliability assessment for the distribution grid can thus be implemented quickly and accurately. The Loss Of Load Probability (LOLP) and Expected Energy Not Supplied (EENS) are selected as the reliability indices; a MATLAB simulation of the IEEE RBTS BUS6 system indicates that the method calculates the reliability indices much faster than the Monte Carlo method while preserving accuracy.
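
    A hedged Monte Carlo sketch of the two indices named above, with a Weibull wind model, an invented piecewise turbine curve and a constant load; all figures are illustrative:

    ```python
    # Monte Carlo estimates of LOLP and EENS: Weibull wind speeds drive a
    # simple turbine power curve, checked against a fixed load.
    import numpy as np

    rng = np.random.default_rng(8)
    n = 100_000
    wind = rng.weibull(2.0, size=n) * 9.0          # shape k=2, scale c=9 m/s

    def turbine_power(v, cut_in=3.0, rated_v=12.0, cut_out=25.0, rated_p=2.0):
        """Piecewise turbine curve; rated power in MW."""
        p = np.where((v >= cut_in) & (v < rated_v),
                     rated_p * (v - cut_in) / (rated_v - cut_in), 0.0)
        return np.where((v >= rated_v) & (v <= cut_out), rated_p, p)

    supply = turbine_power(wind) + 1.0             # plus a 1 MW conventional unit
    load = 2.0                                     # MW, constant for simplicity
    shortfall = np.maximum(load - supply, 0.0)

    print("LOLP:", np.mean(shortfall > 0))
    print("EENS:", shortfall.mean() * 8760, "MWh/year")
    ```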

  15. 40 CFR Appendix C to Part 191 - Guidance for Implementation of Subpart B

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... that the remaining probability distribution of cumulative releases would not be significantly changed... with § 191.13 into a “complementary cumulative distribution function” that indicates the probability of... distribution function for each disposal system considered. The Agency assumes that a disposal system can be...

  16. 40 CFR Appendix C to Part 191 - Guidance for Implementation of Subpart B

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... that the remaining probability distribution of cumulative releases would not be significantly changed... with § 191.13 into a “complementary cumulative distribution function” that indicates the probability of... distribution function for each disposal system considered. The Agency assumes that a disposal system can be...

  17. Optimal methods for fitting probability distributions to propagule retention time in studies of zoochorous dispersal.

    PubMed

    Viana, Duarte S; Santamaría, Luis; Figuerola, Jordi

    2016-02-01

    Propagule retention time is a key factor in determining propagule dispersal distance and the shape of "seed shadows". Propagules dispersed by animal vectors are either ingested and retained in the gut until defecation or attached externally to the body until detachment. Retention time is a continuous variable, but it is commonly measured at discrete time points, according to pre-established sampling time-intervals. Although parametric continuous distributions have been widely fitted to these interval-censored data, the performance of different fitting methods has not been evaluated. To investigate the performance of five different fitting methods, we fitted parametric probability distributions to typical discretized retention-time data with known distribution using as data-points either the lower, mid or upper bounds of sampling intervals, as well as the cumulative distribution of observed values (using either maximum likelihood or non-linear least squares for parameter estimation); then compared the estimated and original distributions to assess the accuracy of each method. We also assessed the robustness of these methods to variations in the sampling procedure (sample size and length of sampling time-intervals). Fittings to the cumulative distribution performed better for all types of parametric distributions (lognormal, gamma and Weibull distributions) and were more robust to variations in sample size and sampling time-intervals. These estimated distributions had negligible deviations of up to 0.045 in cumulative probability of retention times (according to the Kolmogorov-Smirnov statistic) in relation to original distributions from which propagule retention time was simulated, supporting the overall accuracy of this fitting method. In contrast, fitting the sampling-interval bounds resulted in greater deviations that ranged from 0.058 to 0.273 in cumulative probability of retention times, which may introduce considerable biases in parameter estimates. We recommend the use of cumulative probability to fit parametric probability distributions to propagule retention time, specifically using maximum likelihood for parameter estimation. Furthermore, the experimental design for an optimal characterization of unimodal propagule retention time should contemplate at least 500 recovered propagules and sampling time-intervals not larger than the time peak of propagule retrieval, except in the tail of the distribution where broader sampling time-intervals may also produce accurate fits.
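
    A minimal sketch of the recommended cumulative-distribution fit, here via maximum likelihood over interval-censored counts under a lognormal model; the interval bounds and counts are invented, and all propagules are assumed recovered by the final sampling time:

    ```python
    # Fit a lognormal to interval-censored retention times by maximising the
    # multinomial likelihood of the per-interval counts.
    import numpy as np
    from scipy import stats, optimize

    bounds = np.array([0.0, 2.0, 4.0, 8.0, 12.0, 24.0])  # hours
    counts = np.array([5, 30, 40, 18, 7])                # propagules per interval

    def neg_log_lik(params):
        s, scale = params
        if s <= 0 or scale <= 0:
            return np.inf
        cdf = stats.lognorm.cdf(bounds, s, scale=scale)
        probs = np.clip(np.diff(cdf), 1e-12, None)       # interval probabilities
        return -np.sum(counts * np.log(probs))

    res = optimize.minimize(neg_log_lik, x0=[0.5, 5.0], method="Nelder-Mead")
    print("fitted sigma=%.3f, median=%.2f h" % (res.x[0], res.x[1]))
    ```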

  18. How Can Histograms Be Useful for Introducing Continuous Probability Distributions?

    ERIC Educational Resources Information Center

    Derouet, Charlotte; Parzysz, Bernard

    2016-01-01

    The teaching of probability has changed a great deal since the end of the last century. The development of technologies is indeed part of this evolution. In France, continuous probability distributions began to be studied in 2002 by scientific 12th graders, but this subject was marginal and appeared only as an application of integral calculus.…

  19. Pseudo Bayes Estimates for Test Score Distributions and Chained Equipercentile Equating. Research Report. ETS RR-09-47

    ERIC Educational Resources Information Center

    Moses, Tim; Oh, Hyeonjoo J.

    2009-01-01

    Pseudo Bayes probability estimates are weighted averages of raw and modeled probabilities; these estimates have been studied primarily in nonpsychometric contexts. The purpose of this study was to evaluate pseudo Bayes probability estimates as applied to the estimation of psychometric test score distributions and chained equipercentile equating…

  20. Failure probability under parameter uncertainty.

    PubMed

    Gerrard, R; Tsanakas, A

    2011-05-01

    In many problems of risk analysis, failure is equivalent to the event of a random risk factor exceeding a given threshold. Failure probabilities can be controlled if a decisionmaker is able to set the threshold at an appropriate level. This abstract situation applies, for example, to environmental risks with infrastructure controls; to supply chain risks with inventory controls; and to insurance solvency risks with capital controls. However, uncertainty around the distribution of the risk factor implies that parameter error will be present and the measures taken to control failure probabilities may not be effective. We show that parameter uncertainty increases the probability (understood as expected frequency) of failures. For a large class of loss distributions, arising from increasing transformations of location-scale families (including the log-normal, Weibull, and Pareto distributions), the article shows that failure probabilities can be exactly calculated, as they are independent of the true (but unknown) parameters. Hence it is possible to obtain an explicit measure of the effect of parameter uncertainty on failure probability. Failure probability can be controlled in two different ways: (1) by reducing the nominal required failure probability, depending on the size of the available data set, and (2) by modifying the distribution itself that is used to calculate the risk control. Approach (1) corresponds to a frequentist/regulatory view of probability, while approach (2) is consistent with a Bayesian/personalistic view. We furthermore show that the two approaches are consistent in achieving the required failure probability. Finally, we briefly discuss the effects of data pooling and its systemic risk implications. © 2010 Society for Risk Analysis.

  1. Comparison of three-parameter probability distributions for representing annual extreme and partial duration precipitation series

    NASA Astrophysics Data System (ADS)

    Wilks, Daniel S.

    1993-10-01

    Performance of 8 three-parameter probability distributions for representing annual extreme and partial duration precipitation data at stations in the northeastern and southeastern United States is investigated. Particular attention is paid to fidelity on the right tail, through use of a bootstrap procedure simulating extrapolation on the right tail beyond the data. It is found that the beta-κ distribution best describes the extreme right tail of annual extreme series, and the beta-P distribution is best for the partial duration data. The conventionally employed two-parameter Gumbel distribution is found to substantially underestimate probabilities associated with the larger precipitation amounts for both annual extreme and partial duration data. Fitting the distributions using left-censored data did not result in improved fits to the right tail.
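
    The reported Gumbel tail bias is easy to reproduce in miniature: fit a two-parameter Gumbel to draws from a heavier-tailed GEV and compare upper-tail probabilities (all parameters invented for the demo):

    ```python
    # Gumbel tail bias demo: fit gumbel_r to samples from a heavy-tailed GEV
    # and compare the estimated and true exceedance probabilities.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(9)
    true_shape = -0.2                 # SciPy convention: c < 0 is heavy-tailed
    annual_max = stats.genextreme.rvs(true_shape, loc=50, scale=15,
                                      size=100, random_state=rng)

    loc, scale = stats.gumbel_r.fit(annual_max)
    x = 150.0                         # a large precipitation amount
    p_true = stats.genextreme.sf(x, true_shape, loc=50, scale=15)
    p_gumbel = stats.gumbel_r.sf(x, loc, scale)
    print("true P(X>%.0f)=%.4f  Gumbel estimate=%.4f" % (x, p_true, p_gumbel))
    ```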

  2. Gaussian Mixture Models of Between-Source Variation for Likelihood Ratio Computation from Multivariate Data

    PubMed Central

    Franco-Pedroso, Javier; Ramos, Daniel; Gonzalez-Rodriguez, Joaquin

    2016-01-01

    In forensic science, trace evidence found at a crime scene and on a suspect has to be evaluated from the measurements performed on it, usually in the form of multivariate data (for example, several chemical compounds or physical characteristics). In order to assess the strength of that evidence, the likelihood ratio framework is being increasingly adopted. Several methods have been derived to obtain likelihood ratios directly from univariate or multivariate data by modelling both the variation appearing between observations (or features) coming from the same source (within-source variation) and that appearing between observations coming from different sources (between-source variation). In the widely used multivariate kernel likelihood ratio, the within-source distribution is assumed to be normally distributed and constant among different sources, and the between-source variation is modelled through a kernel density function (KDF). In order to better fit the observed distribution of the between-source variation, this paper presents a different approach in which a Gaussian mixture model (GMM) is used instead of a KDF. As will be shown, this approach provides better-calibrated likelihood ratios, as measured by the log-likelihood-ratio cost (Cllr), in experiments performed on freely available forensic datasets involving different types of trace evidence: inks, glass fragments and car paints. PMID:26901680
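
    A hedged sketch of the modelling contrast using scikit-learn: a Gaussian mixture versus a kernel density estimate of between-source variation, compared on the log-density of a new feature vector (stand-in data, not the forensic datasets of the paper):

    ```python
    # Model between-source variation with a Gaussian mixture model (GMM)
    # versus a kernel density function (KDF), then compare log-densities.
    import numpy as np
    from sklearn.mixture import GaussianMixture
    from sklearn.neighbors import KernelDensity

    rng = np.random.default_rng(10)
    # Stand-in 2-D features (e.g., chemical measurements) from many sources.
    sources = np.vstack([rng.normal(0, 1, (150, 2)), rng.normal(4, 1, (150, 2))])

    gmm = GaussianMixture(n_components=2, random_state=0).fit(sources)
    kde = KernelDensity(bandwidth=0.5).fit(sources)

    x = np.array([[3.5, 4.2]])
    print("GMM log-density:", gmm.score_samples(x)[0])
    print("KDF log-density:", kde.score_samples(x)[0])
    ```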

  3. Concurrent generation of multivariate mixed data with variables of dissimilar types.

    PubMed

    Amatya, Anup; Demirtas, Hakan

    2016-01-01

    Data sets originating from a wide range of research studies are composed of multiple variables that are correlated and of dissimilar types, primarily count, binary/ordinal and continuous attributes. The present paper builds on previous work on multivariate data generation and develops a framework for generating multivariate mixed data with a pre-specified correlation matrix. The generated data consist of components that are marginally count, binary, ordinal and continuous, where the count and continuous variables follow the generalized Poisson and normal distributions, respectively. The use of the generalized Poisson distribution provides a flexible mechanism which allows the under- and over-dispersed count variables generally encountered in practice. A step-by-step algorithm is provided and its performance is evaluated using simulated and real-data scenarios.
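
    The framework's key step resembles a NORTA-style construction; the sketch below uses an ordinary Poisson in place of the paper's generalized Poisson margin. Matching a pre-specified target correlation exactly requires adjusting the normal-scale correlations, which the paper's algorithm handles and this sketch does not:

    ```python
    # NORTA-style sketch: draw correlated normals, then push each margin
    # through a target inverse CDF to obtain mixed-type variables.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)
    corr = np.array([[1.0, 0.5, 0.3],
                     [0.5, 1.0, 0.4],
                     [0.3, 0.4, 1.0]])
    z = rng.multivariate_normal(np.zeros(3), corr, size=5000)
    u = stats.norm.cdf(z)

    count  = stats.poisson.ppf(u[:, 0], mu=3.0)             # count margin
    binary = (u[:, 1] > 0.6).astype(int)                    # binary margin, p=0.4
    contin = stats.norm.ppf(u[:, 2], loc=10.0, scale=2.0)   # continuous margin

    # Achieved correlations differ somewhat from the normal-scale matrix.
    print(np.corrcoef(np.column_stack([count, binary, contin]), rowvar=False))
    ```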

  4. Applying Multivariate Discrete Distributions to Genetically Informative Count Data.

    PubMed

    Kirkpatrick, Robert M; Neale, Michael C

    2016-03-01

    We present a novel method of conducting biometric analysis of twin data when the phenotypes are integer-valued counts, which often show an L-shaped distribution. Monte Carlo simulation is used to compare five likelihood-based approaches to modeling: our multivariate discrete method, when its distributional assumptions are correct, when they are incorrect, and three other methods in common use. With data simulated from a skewed discrete distribution, recovery of twin correlations and proportions of additive genetic and common environment variance was generally poor for the Normal, Lognormal and Ordinal models, but good for the two discrete models. Sex-separate applications to substance-use data from twins in the Minnesota Twin Family Study showed superior performance of two discrete models. The new methods are implemented using R and OpenMx and are freely available.

  5. The estimation of tree posterior probabilities using conditional clade probability distributions.

    PubMed

    Larget, Bret

    2013-07-01

    In this article I introduce the idea of conditional independence of separated subtrees as a principle by which to estimate the posterior probability of trees using conditional clade probability distributions rather than simple sample relative frequencies. I describe an algorithm for these calculations and software which implements these ideas. I show that these alternative calculations are very similar to simple sample relative frequencies for high probability trees but are substantially more accurate for relatively low probability trees. The method allows the posterior probability of unsampled trees to be calculated when these trees contain only clades that are in other sampled trees. Furthermore, the method can be used to estimate the total probability of the set of sampled trees which provides a measure of the thoroughness of a posterior sample.

  6. Probability Analysis of the Wave-Slamming Pressure Values of the Horizontal Deck with Elastic Support

    NASA Astrophysics Data System (ADS)

    Zuo, Weiguang; Liu, Ming; Fan, Tianhui; Wang, Pengtao

    2018-06-01

    This paper presents the probability distribution of the slamming pressure from an experimental study of regular wave slamming on an elastically supported horizontal deck. The time series of the slamming pressure during the wave impact were first obtained through statistical analyses on experimental data. The exceeding probability distribution of the maximum slamming pressure peak and distribution parameters were analyzed, and the results show that the exceeding probability distribution of the maximum slamming pressure peak accords with the three-parameter Weibull distribution. Furthermore, the range and relationships of the distribution parameters were studied. The sum of the location parameter D and the scale parameter L was approximately equal to 1.0, and the exceeding probability was more than 36.79% when the random peak was equal to the sample average during the wave impact. The variation of the distribution parameters and slamming pressure under different model conditions were comprehensively presented, and the parameter values of the Weibull distribution of wave-slamming pressure peaks were different due to different test models. The parameter values were found to decrease due to the increased stiffness of the elastic support. The damage criterion of the structure model caused by the wave impact was initially discussed, and the structure model was destroyed when the average slamming time was greater than a certain value during the duration of the wave impact. The conclusions of the experimental study were then described.

  7. On the inequivalence of the CH and CHSH inequalities due to finite statistics

    NASA Astrophysics Data System (ADS)

    Renou, M. O.; Rosset, D.; Martin, A.; Gisin, N.

    2017-06-01

    Different variants of a Bell inequality, such as CHSH and CH, are known to be equivalent when evaluated on nonsignaling outcome probability distributions. However, in experimental setups, the outcome probability distributions are estimated using a finite number of samples. Therefore the nonsignaling conditions are only approximately satisfied and the robustness of the violation depends on the chosen inequality variant. We explain that phenomenon using the decomposition of the space of outcome probability distributions under the action of the symmetry group of the scenario, and propose a method to optimize the statistical robustness of a Bell inequality. In the process, we describe the finite group composed of relabeling of parties, measurement settings and outcomes, and identify correspondences between the irreducible representations of this group and properties of outcome probability distributions such as normalization, signaling or having uniform marginals.

  8. Changes in Concurrent Risk of Warm and Dry Years under Impact of Climate Change

    NASA Astrophysics Data System (ADS)

    Sarhadi, A.; Wiper, M.; Touma, D. E.; Ausín, M. C.; Diffenbaugh, N. S.

    2017-12-01

    Anthropogenic global warming has changed the nature and the risk of extreme climate phenomena. The changing concurrence of multiple climatic extremes (warm and dry years) may intensify undesirable consequences for water resources, human and ecosystem health, and environmental equity. The present study assesses how global warming influences the probability that warm and dry years co-occur on a global scale. In the first step of the study, a designed multivariate Mann-Kendall trend analysis is used to detect the areas in which the concurrence of warm and dry years has increased, both in historical climate records and in climate models. The next step investigates the concurrent risk of the extremes under dynamic nonstationary conditions, using a fully generalized multivariate risk framework designed to evolve through time. In this methodology, Bayesian dynamic copulas are developed to model the time-varying dependence structure between the two different climate extremes (warm and dry years). The results reveal an increasing trend in the concurrence risk of warm and dry years, in agreement with the multivariate trend analysis of historical records and climate models. In addition to providing a novel quantification of the changing probability of compound extreme events, the results of this study can help decision makers develop short- and long-term strategies to prepare for climate stresses now and in the future.

  9. Confidence as Bayesian Probability: From Neural Origins to Behavior.

    PubMed

    Meyniel, Florent; Sigman, Mariano; Mainen, Zachary F

    2015-10-07

    Research on confidence spreads across several sub-fields of psychology and neuroscience. Here, we explore how a definition of confidence as Bayesian probability can unify these viewpoints. This computational view entails that there are distinct forms in which confidence is represented and used in the brain, including distributional confidence, pertaining to neural representations of probability distributions, and summary confidence, pertaining to scalar summaries of those distributions. Summary confidence is, normatively, derived or "read out" from distributional confidence. Neural implementations of readout will trade off optimality versus flexibility of routing across brain systems, allowing confidence to serve diverse cognitive functions. Copyright © 2015 Elsevier Inc. All rights reserved.

  10. Exact probability distribution functions for Parrondo's games

    NASA Astrophysics Data System (ADS)

    Zadourian, Rubina; Saakian, David B.; Klümper, Andreas

    2016-12-01

    We study the discrete time dynamics of Brownian ratchet models and Parrondo's games. Using the Fourier transform, we calculate the exact probability distribution functions for both the capital dependent and history dependent Parrondo's games. In certain cases we find strong oscillations near the maximum of the probability distribution with two limiting distributions for odd and even number of rounds of the game. Indications of such oscillations first appeared in the analysis of real financial data, but now we have found this phenomenon in model systems and a theoretical understanding of the phenomenon. The method of our work can be applied to Brownian ratchets, molecular motors, and portfolio optimization.

  11. Exact probability distribution functions for Parrondo's games.

    PubMed

    Zadourian, Rubina; Saakian, David B; Klümper, Andreas

    2016-12-01

    We study the discrete time dynamics of Brownian ratchet models and Parrondo's games. Using the Fourier transform, we calculate the exact probability distribution functions for both the capital dependent and history dependent Parrondo's games. In certain cases we find strong oscillations near the maximum of the probability distribution with two limiting distributions for odd and even number of rounds of the game. Indications of such oscillations first appeared in the analysis of real financial data, but now we have found this phenomenon in model systems and a theoretical understanding of the phenomenon. The method of our work can be applied to Brownian ratchets, molecular motors, and portfolio optimization.

  12. Statistical primer: propensity score matching and its alternatives.

    PubMed

    Benedetto, Umberto; Head, Stuart J; Angelini, Gianni D; Blackstone, Eugene H

    2018-06-01

    Propensity score (PS) methods offer certain advantages over more traditional regression methods to control for confounding by indication in observational studies. Although multivariable regression models adjust for confounders by modelling the relationship between covariates and outcome, the PS methods estimate the treatment effect by modelling the relationship between confounders and treatment assignment. Therefore, methods based on the PS are not limited by the number of events, and their use may be warranted when the number of confounders is large or the number of outcomes is small. The PS is the probability for a subject to receive a treatment conditional on a set of baseline characteristics (confounders). The PS is commonly estimated using logistic regression, and it is used to match patients with a similar distribution of confounders so that the difference in outcomes gives an unbiased estimate of the treatment effect. This review summarizes basic concepts of PS matching and provides guidance in implementing matching and other methods based on the PS, such as stratification, weighting and covariate adjustment.
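
    A minimal sketch of PS estimation and greedy nearest-neighbour matching, purely illustrative and without the caliper and balance diagnostics a real analysis needs:

    ```python
    # Estimate propensity scores with logistic regression, then greedily
    # match each treated subject to the nearest unused control on the PS.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(12)
    X = rng.normal(size=(400, 3))  # baseline confounders
    treat = (X @ np.array([0.8, -0.5, 0.3]) + rng.normal(size=400) > 0).astype(int)

    ps = LogisticRegression().fit(X, treat).predict_proba(X)[:, 1]

    treated = np.flatnonzero(treat == 1)
    controls = set(np.flatnonzero(treat == 0))
    pairs = []
    for i in treated:
        if not controls:
            break  # no unused controls left to match
        j = min(controls, key=lambda c: abs(ps[c] - ps[i]))
        controls.remove(j)
        pairs.append((i, j))
    print("matched pairs:", len(pairs))
    ```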

  13. Real-time realizations of the Bayesian Infrasonic Source Localization Method

    NASA Astrophysics Data System (ADS)

    Pinsky, V.; Arrowsmith, S.; Hofstetter, A.; Nippress, A.

    2015-12-01

    The Bayesian Infrasonic Source Localization method (BISL), introduced by Mordak et al. (2010) and upgraded by Marcillo et al. (2014), is intended for the accurate estimation of the origin of an atmospheric event at local, regional and global scales by seismic and infrasonic networks and arrays. The BISL is based on probabilistic models of the source-station infrasonic signal propagation time, picking time and azimuth estimate, merged with prior knowledge about the celerity distribution. It requires, at each hypothetical source location, integration of the product of the corresponding source-station likelihood functions multiplied by a prior probability density function of celerity over the multivariate parameter space. The present BISL realization is generally a time-consuming procedure based on numerical integration. The computational scheme proposed here simplifies the target function so that the integrals are taken exactly and are represented via standard functions. This makes the procedure much faster and realizable in real time without practical loss of accuracy. The procedure, executed as PYTHON-FORTRAN code, demonstrates high performance on a set of model and real data.

  14. Feature discrimination/identification based upon SAR return variations

    NASA Technical Reports Server (NTRS)

    Rasco, W. A., Sr.; Pietsch, R.

    1978-01-01

    The look-to-look variation statistics in the returns recorded in flight by a digital, real-time SAR system are analyzed. The determination that the variations in the look-to-look returns from different classes do carry information content unique to the classes was illustrated by a model based on four variants derived from the four-look in-flight SAR data under study. The model was limited to four classes of returns: mowed grass on an athletic field, rough unmowed grass and weeds on a large vacant field, young fruit trees in a large orchard, and metal mobile homes and storage buildings in a large mobile home park. The data population, in excess of 1000 returns, represented over 250 individual pixels from the four classes. The multivariate discriminant model operated on the set of returns for each pixel and assigned that pixel to one of the four classes, based on the target variants and the probability distribution functions of the four variants for each class.

  15. Method of identifying clusters representing statistical dependencies in multivariate data

    NASA Technical Reports Server (NTRS)

    Borucki, W. J.; Card, D. H.; Lyle, G. C.

    1975-01-01

    The approach is first to cluster the data and then to compute spatial boundaries for the resulting clusters. The next step is to compute, from a set of Monte Carlo samples obtained from scrambled data, estimates of the probabilities of obtaining at least as many points within the boundaries as were actually observed in the original data.

  16. Modeling an Outbreak of Anthrax

    ERIC Educational Resources Information Center

    Sturdivant, Rod; Watts, Krista

    2010-01-01

    This article presents material that has been used as a classroom activity in a calculus-based probability and statistics course. The application was used in the first few lessons of this course. Students had three previous semesters of math, including calculus (single and multivariable), differential equations, and a course in mathematical…

  17. A Composite Likelihood Inference in Latent Variable Models for Ordinal Longitudinal Responses

    ERIC Educational Resources Information Center

    Vasdekis, Vassilis G. S.; Cagnone, Silvia; Moustaki, Irini

    2012-01-01

    The paper proposes a composite likelihood estimation approach that uses bivariate instead of multivariate marginal probabilities for ordinal longitudinal responses using a latent variable model. The model considers time-dependent latent variables and item-specific random effects to be accountable for the interdependencies of the multivariate…

  18. What Can Quantum Optics Say about Computational Complexity Theory?

    NASA Astrophysics Data System (ADS)

    Rahimi-Keshari, Saleh; Lund, Austin P.; Ralph, Timothy C.

    2015-02-01

    Considering the problem of sampling from the output photon-counting probability distribution of a linear-optical network for input Gaussian states, we obtain results that are of interest from both quantum theory and the computational complexity theory point of view. We derive a general formula for calculating the output probabilities, and by considering input thermal states, we show that the output probabilities are proportional to permanents of positive-semidefinite Hermitian matrices. It is believed that approximating permanents of complex matrices in general is a #P-hard problem. However, we show that these permanents can be approximated with an algorithm in the BPPNP complexity class, as there exists an efficient classical algorithm for sampling from the output probability distribution. We further consider input squeezed-vacuum states and discuss the complexity of sampling from the probability distribution at the output.

  19. Vacuum quantum stress tensor fluctuations: A diagonalization approach

    NASA Astrophysics Data System (ADS)

    Schiappacasse, Enrico D.; Fewster, Christopher J.; Ford, L. H.

    2018-01-01

    Large vacuum fluctuations of a quantum stress tensor can be described by the asymptotic behavior of its probability distribution. Here we focus on stress tensor operators which have been averaged with a sampling function in time. The Minkowski vacuum state is not an eigenstate of the time-averaged operator, but can be expanded in terms of its eigenstates. We calculate the probability distribution and the cumulative probability distribution for obtaining a given value in a measurement of the time-averaged operator taken in the vacuum state. In these calculations, we study a specific operator that contributes to the stress-energy tensor of a massless scalar field in Minkowski spacetime, namely, the normal ordered square of the time derivative of the field. We analyze the rate of decrease of the tail of the probability distribution for different temporal sampling functions, such as compactly supported functions and the Lorentzian function. We find that the tails decrease relatively slowly, as exponentials of fractional powers, in agreement with previous work using the moments of the distribution. Our results lend additional support to the conclusion that large vacuum stress tensor fluctuations are more probable than large thermal fluctuations, and may have observable effects.

  20. Measurements of gas hydrate formation probability distributions on a quasi-free water droplet

    NASA Astrophysics Data System (ADS)

    Maeda, Nobuo

    2014-06-01

    A High Pressure Automated Lag Time Apparatus (HP-ALTA) can measure gas hydrate formation probability distributions from water in a glass sample cell. In an HP-ALTA gas hydrate formation originates near the edges of the sample cell and gas hydrate films subsequently grow across the water-guest gas interface. It would ideally be desirable to be able to measure gas hydrate formation probability distributions of a single water droplet or mist that is freely levitating in a guest gas, but this is technically challenging. The next best option is to let a water droplet sit on top of a denser, immiscible, inert, and wall-wetting hydrophobic liquid to avoid contact of a water droplet with the solid walls. Here we report the development of a second generation HP-ALTA which can measure gas hydrate formation probability distributions of a water droplet which sits on a perfluorocarbon oil in a container that is coated with 1H,1H,2H,2H-Perfluorodecyltriethoxysilane. It was found that the gas hydrate formation probability distributions of such a quasi-free water droplet were significantly lower than those of water in a glass sample cell.

  1. Dose response explorer: an integrated open-source tool for exploring and modelling radiotherapy dose volume outcome relationships

    NASA Astrophysics Data System (ADS)

    El Naqa, I.; Suneja, G.; Lindsay, P. E.; Hope, A. J.; Alaly, J. R.; Vicic, M.; Bradley, J. D.; Apte, A.; Deasy, J. O.

    2006-11-01

    Radiotherapy treatment outcome models are a complicated function of treatment, clinical and biological factors. Our objective is to provide clinicians and scientists with an accurate, flexible and user-friendly software tool to explore radiotherapy outcomes data and build statistical tumour control or normal tissue complications models. The software tool, called the dose response explorer system (DREES), is based on Matlab, and uses a named-field structure array data type. DREES/Matlab in combination with another open-source tool (CERR) provides an environment for analysing treatment outcomes. DREES provides many radiotherapy outcome modelling features, including (1) fitting of analytical normal tissue complication probability (NTCP) and tumour control probability (TCP) models, (2) combined modelling of multiple dose-volume variables (e.g., mean dose, max dose, etc) and clinical factors (age, gender, stage, etc) using multi-term regression modelling, (3) manual or automated selection of logistic or actuarial model variables using bootstrap statistical resampling, (4) estimation of uncertainty in model parameters, (5) performance assessment of univariate and multivariate analyses using Spearman's rank correlation and chi-square statistics, boxplots, nomograms, Kaplan-Meier survival plots, and receiver operating characteristics curves, and (6) graphical capabilities to visualize NTCP or TCP prediction versus selected variable models using various plots. DREES provides clinical researchers with a tool customized for radiotherapy outcome modelling. DREES is freely distributed. We expect to continue developing DREES based on user feedback.

  2. Remote-sensing data processing with the multivariate regression analysis method for iron mineral resource potential mapping: a case study in the Sarvian area, central Iran

    NASA Astrophysics Data System (ADS)

    Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran

    2018-03-01

    This paper uses multivariate regression to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran, as a method for mineral prospectivity mapping (MPM). The main aim of this paper is to apply multivariate regression analysis (as an MPM method) to map iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is a reliable approach for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the results of the probability value (p value), coefficient of determination value (R2) and adjusted determination coefficient (Radj2), the second regression model (which consisted of multiple UIVs) fitted better than the other models. The accuracy of the model was confirmed by the iron outcrop map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).
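
    A minimal sketch of the regression step described above, with synthetic band values standing in for the ASTER data and an assumed continuous iron-outcrop response; the fit is summarized by R2 and adjusted R2, the criteria cited in the abstract.

      # Sketch of a multivariate linear regression of an iron-outcrop indicator
      # on several spectral bands (UIVs). Synthetic values stand in for ASTER data.
      import numpy as np

      rng = np.random.default_rng(1)
      n_pixels, n_bands = 500, 5                       # 5 of 14 bands, for brevity
      bands = rng.uniform(0.0, 1.0, (n_pixels, n_bands))
      coef_true = np.array([0.8, -0.3, 0.5, 0.0, 0.2])
      iron = bands @ coef_true + rng.normal(0.0, 0.1, n_pixels)  # dependent variable

      X = np.column_stack([np.ones(n_pixels), bands])  # add intercept
      beta, res, rank, sv = np.linalg.lstsq(X, iron, rcond=None)

      # R^2 and adjusted R^2, the fit criteria mentioned above
      pred = X @ beta
      ss_res = np.sum((iron - pred) ** 2)
      ss_tot = np.sum((iron - iron.mean()) ** 2)
      r2 = 1.0 - ss_res / ss_tot
      r2_adj = 1.0 - (1.0 - r2) * (n_pixels - 1) / (n_pixels - n_bands - 1)
      print(f"R2 = {r2:.3f}, adjusted R2 = {r2_adj:.3f}")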

  3. Fragment size distribution in viscous bag breakup of a drop

    NASA Astrophysics Data System (ADS)

    Kulkarni, Varun; Bulusu, Kartik V.; Plesniak, Michael W.; Sojka, Paul E.

    2015-11-01

    In this study we examine the drop size distribution resulting from the fragmentation of a single drop in the presence of a continuous air jet. Specifically, we study the effect of Weber number, We, and Ohnesorge number, Oh, on the disintegration process. The regime of breakup considered is observed between 12 <= We <= 16 for Oh <= 0.1. Experiments are conducted using phase Doppler anemometry. Both the number and volume fragment size probability distributions are plotted. The volume probability distribution revealed a bi-modal behavior with two distinct peaks: one corresponding to the rim fragments and the other to the bag fragments. This behavior was suppressed in the number probability distribution. Additionally, we employ an in-house particle detection code to isolate the rim fragment size distribution from the total probability distributions. Our experiments showed that the bag fragments are smaller in diameter and larger in number, while the rim fragments are larger in diameter and smaller in number. Furthermore, with increasing We for a given Oh, we observe a large number of small-diameter drops and a small number of large-diameter drops. On the other hand, with increasing Oh for a fixed We the opposite is seen.

  4. Nuclear risk analysis of the Ulysses mission

    NASA Astrophysics Data System (ADS)

    Bartram, Bart W.; Vaughan, Frank R.; Englehart, Richard W.

    An account is given of the method used to quantify the risks accruing to the use of a radioisotope thermoelectric generator fueled by Pu-238 dioxide aboard the Space Shuttle-launched Ulysses mission. After using a Monte Carlo technique to develop probability distributions for the radiological consequences of a range of accident scenarios throughout the mission, factors affecting those consequences are identified in conjunction with their probability distributions. The functional relationship among all the factors is then established, and probability distributions for all factor effects are combined by means of a Monte Carlo technique.
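
    The combination step lends itself to a short sketch: each factor is drawn from its probability distribution and propagated through an assumed functional relationship by Monte Carlo. All distributions and the combining function below are hypothetical, not those of the Ulysses analysis.

      # Sketch of the Monte Carlo combination step: draw each risk factor from
      # its distribution and combine through an assumed functional relationship.
      import numpy as np

      rng = np.random.default_rng(2)
      n = 100_000
      release_fraction = rng.beta(2.0, 8.0, n)                  # fraction of fuel released
      source_term = rng.lognormal(mean=0.0, sigma=0.5, size=n)  # relative inventory
      dispersion = rng.lognormal(mean=-1.0, sigma=0.8, size=n)  # dilution factor

      # Hypothetical multiplicative relationship among the factor effects
      consequence = release_fraction * source_term * dispersion
      print("mean consequence:", consequence.mean())
      print("99th percentile:", np.percentile(consequence, 99.0))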

  5. Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace

    ERIC Educational Resources Information Center

    Culpepper, Steven Andrew; Park, Trevor

    2017-01-01

    A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…

  6. Evaluation of Meteorite Amino Acid Analysis Data Using Multivariate Techniques

    NASA Technical Reports Server (NTRS)

    McDonald, G.; Storrie-Lombardi, M.; Nealson, K.

    1999-01-01

    The amino acid distributions in the Murchison carbonaceous chondrite, Mars meteorite ALH84001, and ice from the Allan Hills region of Antarctica are shown, using a multivariate technique known as Principal Component Analysis (PCA), to be statistically distinct from the average amino acid composition of 101 terrestrial protein superfamilies.
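
    A minimal PCA sketch on composition-style data, with a random composition matrix standing in for the measured amino acid abundances; real input would be relative abundances per sample.

      # Sketch of Principal Component Analysis on amino acid composition vectors.
      import numpy as np

      rng = np.random.default_rng(3)
      n_samples, n_amino_acids = 20, 10
      X = rng.dirichlet(np.ones(n_amino_acids), size=n_samples)  # compositions

      Xc = X - X.mean(axis=0)                          # center each variable
      # PCA via singular value decomposition of the centered data
      U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
      scores = Xc @ Vt.T                               # projections onto PCs
      explained = s**2 / np.sum(s**2)
      print("variance explained by PC1, PC2:", explained[:2])
      print("first two PC scores of sample 0:", scores[0, :2])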

  7. Score distributions of gapped multiple sequence alignments down to the low-probability tail

    NASA Astrophysics Data System (ADS)

    Fieth, Pascal; Hartmann, Alexander K.

    2016-08-01

    Assessing the significance of alignment scores of optimally aligned DNA or amino acid sequences can be achieved via the knowledge of the score distribution of random sequences. But this requires obtaining the distribution in the biologically relevant high-scoring region, where the probabilities are exponentially small. For gapless local alignments of infinitely long sequences this distribution is known analytically to follow a Gumbel distribution. Distributions for gapped local alignments and global alignments of finite lengths can only be obtained numerically. To obtain results for the small-probability region, specific statistical mechanics-based rare-event algorithms can be applied. In previous studies, this was achieved for pairwise alignments. They showed that, contrary to results from previous simple sampling studies, strong deviations from the Gumbel distribution occur in the case of finite sequence lengths. Here we extend the studies to multiple sequence alignments with gaps, which are much more relevant for practical applications in molecular biology. We study the distributions of scores over a large range of the support, reaching probabilities as small as 10^-160, for global and local (sum-of-pair scores) multiple alignments. We find that even after suitable rescaling, eliminating the sequence-length dependence, the distributions for multiple alignments differ from the pairwise alignment case. Furthermore, we also show that the previously discussed Gaussian correction to the Gumbel distribution needs to be refined, also for the case of pairwise alignments.
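
    As context for the rare-event results, the sketch below fits a Gumbel distribution to simple-sampling scores. The score values are synthetic stand-ins, and simple sampling of this kind only resolves the bulk of the distribution, not the 10^-160 tail studied above.

      # Sketch of fitting a Gumbel distribution to alignment-score samples,
      # the simple-sampling baseline discussed in the abstract.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(10)
      # Stand-in for optimal alignment scores of random sequence pairs
      scores = stats.gumbel_r.rvs(loc=30.0, scale=4.0, size=5000, random_state=rng)

      loc, scale = stats.gumbel_r.fit(scores)
      tail = stats.gumbel_r.sf(55.0, loc=loc, scale=scale)
      print(f"fitted loc={loc:.2f}, scale={scale:.2f}")
      print(f"P(score >= 55) under Gumbel fit: {tail:.2e}")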

  8. Safe Onboard Guidance and Control Under Probabilistic Uncertainty

    NASA Technical Reports Server (NTRS)

    Blackmore, Lars James

    2011-01-01

    An algorithm was developed that determines the fuel-optimal spacecraft guidance trajectory that takes into account uncertainty, in order to guarantee that mission safety constraints are satisfied with the required probability. The algorithm uses convex optimization to solve for the optimal trajectory. Convex optimization is amenable to onboard solution due to its excellent convergence properties. The algorithm is novel because, unlike prior approaches, it does not require time-consuming evaluation of multivariate probability densities. Instead, it uses a new mathematical bounding approach to ensure that probability constraints are satisfied, and it is shown that the resulting optimization is convex. Empirical results show that the approach is many orders of magnitude less conservative than existing set conversion techniques, for a small penalty in computation time.

  9. Historical and future drought in Bangladesh using copula-based bivariate regional frequency analysis

    NASA Astrophysics Data System (ADS)

    Mortuza, Md Rubayet; Moges, Edom; Demissie, Yonas; Li, Hong-Yi

    2018-02-01

    The study aims at regional and probabilistic evaluation of bivariate drought characteristics to assess both the past and future drought duration and severity in Bangladesh. The procedures involve applying (1) standardized precipitation index to identify drought duration and severity, (2) regional frequency analysis to determine the appropriate marginal distributions for both duration and severity, (3) copula model to estimate the joint probability distribution of drought duration and severity, and (4) precipitation projections from multiple climate models to assess future drought trends. Since drought duration and severity in Bangladesh are often strongly correlated and do not follow the same marginal distributions, the joint and conditional return periods of droughts are characterized using the copula-based joint distribution. The country is divided into three homogeneous regions using Fuzzy clustering and multivariate discordancy and homogeneity measures. For given severity and duration values, the joint return periods for a drought to exceed both values are on average 45% larger, while those to exceed either value are 40% smaller, than the return periods from the univariate frequency analysis, which treats drought duration and severity independently. These results suggest that, compared to the bivariate drought frequency analysis, the standard univariate frequency analysis underestimates or overestimates the frequency and severity of droughts depending on how their duration and severity are related. Overall, more frequent and severe droughts are observed in the west side of the country. Future drought trends based on four climate models and two scenarios showed the possibility of less frequent drought in the future (2020-2100) than in the past (1961-2010).
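
    A hedged sketch of the copula step: given marginal CDF values for duration and severity and a Gumbel copula (one common choice; the study selects its copula and marginals from the data), the joint "and"/"or" return periods follow directly from the copula value. All numbers below are illustrative assumptions.

      # Sketch of copula-based joint return periods for drought duration and
      # severity; Gumbel copula and all parameter values are assumptions.
      import numpy as np

      def gumbel_copula(u, v, theta):
          # C(u, v) = exp(-[(-ln u)^theta + (-ln v)^theta]^(1/theta)), theta >= 1
          return np.exp(-(((-np.log(u))**theta + (-np.log(v))**theta))**(1.0/theta))

      mu = 1.5          # mean interarrival time of drought events (years), assumed
      F_d = 0.9         # marginal CDF of duration at the chosen duration value
      F_s = 0.9         # marginal CDF of severity at the chosen severity value
      theta = 2.0       # copula dependence parameter, assumed

      C = gumbel_copula(F_d, F_s, theta)
      T_and = mu / (1.0 - F_d - F_s + C)   # both duration and severity exceeded
      T_or = mu / (1.0 - C)                # either one exceeded
      T_uni = mu / (1.0 - F_d)             # univariate return period, for contrast
      print(f"T_and = {T_and:.1f} y, T_or = {T_or:.1f} y, univariate = {T_uni:.1f} y")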

  10. Correlation of spatial climate/weather maps and the advantages of using the Mahalanobis metric in predictions

    NASA Astrophysics Data System (ADS)

    Stephenson, D. B.

    1997-10-01

    The skill in predicting spatially varying weather/climate maps depends on the definition of the measure of similarity between the maps. Under the justifiable approximation that the anomaly maps are distributed multinormally, it is shown analytically that the choice of weighting metric, used in defining the anomaly correlation between spatial maps, can change the resulting probability distribution of the correlation coefficient. The estimate of the numbers of degrees of freedom based on the variance of the correlation distribution can vary from unity up to the number of grid points depending on the choice of weighting metric. The (pseudo-) inverse of the sample covariance matrix acts as a special choice for the metric in that it gives a correlation distribution which has minimal kurtosis and maximum dimension. Minimal kurtosis suggests that the average predictive skill might be improved due to the rarer occurrence of troublesome outlier patterns far from the mean state. Maximum dimension has a disadvantage for analogue prediction schemes in that it gives the minimum number of analogue states. This metric also has an advantage in that it allows one to powerfully test the null hypothesis of multinormality by examining the second and third moments of the correlation coefficient which were introduced by Mardia as invariant measures of multivariate kurtosis and skewness. For these reasons, it is suggested that this metric could be usefully employed in the prediction of weather/climate and in fingerprinting anthropogenic climate change. The ideas are illustrated using the bivariate example of the observed monthly mean sea-level pressures at Darwin and Tahiti from 1866 to 1995.
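
    The metric-dependent anomaly correlation can be sketched directly. Below, synthetic anomaly maps are compared under the Euclidean metric and under the Mahalanobis metric built from the pseudo-inverse of the sample covariance; the data are random stand-ins for observed fields.

      # Sketch of an anomaly pattern correlation under a choice of weighting
      # metric, including the Mahalanobis metric W = (sample covariance)^+.
      import numpy as np

      rng = np.random.default_rng(4)
      n_time, n_grid = 120, 30
      maps = rng.normal(size=(n_time, n_grid))         # anomaly maps (time x grid)
      cov = np.cov(maps, rowvar=False)
      W = np.linalg.pinv(cov)                          # Mahalanobis metric

      def weighted_corr(x, y, W):
          # anomaly correlation <x, y>_W / sqrt(<x, x>_W <y, y>_W)
          return (x @ W @ y) / np.sqrt((x @ W @ x) * (y @ W @ y))

      x, y = maps[0], maps[1]
      print("Euclidean metric:  ", weighted_corr(x, y, np.eye(n_grid)))
      print("Mahalanobis metric:", weighted_corr(x, y, W))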

  11. Site occupancy models with heterogeneous detection probabilities

    USGS Publications Warehouse

    Royle, J. Andrew

    2006-01-01

    Models for estimating the probability of occurrence of a species in the presence of imperfect detection are important in many ecological disciplines. In these 'site occupancy' models, the possibility of heterogeneity in detection probabilities among sites must be considered because variation in abundance (and other factors) among sampled sites induces variation in detection probability (p). In this article, I develop occurrence probability models that allow for heterogeneous detection probabilities by considering several common classes of mixture distributions for p. For any mixing distribution, the likelihood has the general form of a zero-inflated binomial mixture for which inference based upon integrated likelihood is straightforward. A recent paper by Link (2003, Biometrics 59, 1123-1130) demonstrates that in closed population models used for estimating population size, different classes of mixture distributions are indistinguishable from data, yet can produce very different inferences about population size. I demonstrate that this problem can also arise in models for estimating site occupancy in the presence of heterogeneous detection probabilities. The implications of this are discussed in the context of an application to avian survey data and the development of animal monitoring programs.
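
    A minimal sketch of the zero-inflated binomial mixture likelihood, assuming a beta mixing distribution for detection probability (one of several mixture classes such models can use) and synthetic survey data.

      # Sketch of maximum likelihood estimation for a zero-inflated binomial
      # mixture occupancy model with beta-distributed detection probability.
      import numpy as np
      from scipy.stats import betabinom
      from scipy.optimize import minimize

      rng = np.random.default_rng(5)
      n_sites, J = 150, 5                              # sites, visits per site
      psi_true, a_true, b_true = 0.6, 2.0, 3.0
      occupied = rng.binomial(1, psi_true, n_sites)
      p_site = rng.beta(a_true, b_true, n_sites)
      y = rng.binomial(J, p_site * occupied)           # detections per site

      def neg_log_lik(params):
          psi, a, b = params
          if not (0 < psi < 1 and a > 0 and b > 0):
              return np.inf
          lik = psi * betabinom.pmf(y, J, a, b)        # occupied component
          lik = lik + (1.0 - psi) * (y == 0)           # zero-inflation component
          return -np.sum(np.log(lik))

      fit = minimize(neg_log_lik, x0=[0.5, 1.0, 1.0], method="Nelder-Mead")
      print("psi, a, b:", fit.x)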

  12. Modeling highway travel time distribution with conditional probability models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oliveira Neto, Francisco Moraes; Chin, Shih-Miao; Hwang, Ho-Ling

    Under the sponsorship of the Federal Highway Administration's Office of Freight Management and Operations, the American Transportation Research Institute (ATRI) has developed performance measures through the Freight Performance Measures (FPM) initiative. Under this program, travel speed information is derived from data collected using wireless based global positioning systems. These telemetric data systems are subscribed and used by the trucking industry as an operations management tool. More than one telemetric operator submits their data dumps to ATRI on a regular basis. Each data transmission contains truck location, its travel time, and a clock time/date stamp. Data from the FPM program provides a unique opportunity for studying the upstream-downstream speed distributions at different locations, as well as different times of the day and days of the week. This research is focused on the stochastic nature of successive link travel speed data on the continental United States Interstate network. Specifically, a method to estimate route probability distributions of travel time is proposed. This method uses the concepts of convolution of probability distributions and bivariate, link-to-link, conditional probability to estimate the expected distributions for the route travel time. A major contribution of this study is the consideration of speed correlation between upstream and downstream contiguous Interstate segments through conditional probability. The established conditional probability distributions, between successive segments, can be used to provide travel time reliability measures. This study also suggests an adaptive method for calculating and updating route travel time distribution as new data or information is added. This methodology can be useful to estimate performance measures as required by the recent Moving Ahead for Progress in the 21st Century Act (MAP-21).
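
    The convolution idea can be sketched compactly. The example below convolves two discrete link travel time pmfs under an independence assumption; the study's link-to-link conditional probabilities would replace that assumption to capture upstream-downstream speed correlation. The pmf values are illustrative.

      # Sketch of route travel time via convolution of discrete link travel
      # time distributions (independence assumed here for brevity).
      import numpy as np

      # Discrete travel time pmfs for two successive links (minutes 0..4)
      link1 = np.array([0.0, 0.1, 0.4, 0.3, 0.2])      # pmf of link 1 travel time
      link2 = np.array([0.0, 0.2, 0.5, 0.2, 0.1])      # pmf of link 2 travel time

      route = np.convolve(link1, link2)                # pmf of the sum
      times = np.arange(route.size)
      print("route pmf:", np.round(route, 3))
      print("expected route time:", times @ route, "minutes")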

  13. Probability of success for phase III after exploratory biomarker analysis in phase II.

    PubMed

    Götte, Heiko; Kirchner, Marietta; Sailer, Martin Oliver

    2017-05-01

    The probability of success or average power describes the potential of a future trial by weighting the power with a probability distribution of the treatment effect. The treatment effect estimate from a previous trial can be used to define such a distribution. During the development of targeted therapies, it is common practice to look for predictive biomarkers. The consequence is that the trial population for phase III is often selected on the basis of the most extreme result from phase II biomarker subgroup analyses. In such a case, there is a tendency to overestimate the treatment effect. We investigate whether the overestimation of the treatment effect estimate from phase II is transformed into a positive bias for the probability of success for phase III. We simulate a phase II/III development program for targeted therapies. This simulation allows us to investigate selection probabilities and to compare the estimated with the true probability of success. We consider the estimated probability of success with and without subgroup selection. Depending on the true treatment effects, there is a negative bias without selection because of the weighting by the phase II distribution. In comparison, selection increases the estimated probability of success. Thus, selection does not lead to a bias in probability of success if underestimation due to the phase II distribution and overestimation due to selection cancel each other out. We recommend performing similar simulations in practice to obtain the necessary information about the risks and chances associated with such subgroup selection designs. Copyright © 2017 John Wiley & Sons, Ltd.
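
    A sketch of the probability-of-success computation: average the phase III power over a normal distribution for the treatment effect centered at the phase II estimate. All numerical values are illustrative assumptions, not the article's simulation settings.

      # Sketch of probability of success as power weighted by a normal
      # distribution of the treatment effect (phase II estimate and SE).
      import numpy as np
      from scipy.stats import norm
      from scipy.integrate import quad

      delta_hat, se2 = 0.3, 0.15    # phase II estimate and its standard error
      se3 = 0.10                    # standard error anticipated in phase III
      alpha = 0.025                 # one-sided significance level
      z_crit = norm.ppf(1.0 - alpha)

      def power(delta):
          # power of a one-sided z-test at true effect delta
          return norm.cdf(delta / se3 - z_crit)

      integrand = lambda d: power(d) * norm.pdf(d, loc=delta_hat, scale=se2)
      pos, _ = quad(integrand, delta_hat - 8*se2, delta_hat + 8*se2)
      print(f"conditional power at delta_hat: {power(delta_hat):.3f}")
      print(f"probability of success: {pos:.3f}")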

  14. General formulation of long-range degree correlations in complex networks

    NASA Astrophysics Data System (ADS)

    Fujiki, Yuka; Takaguchi, Taro; Yakubo, Kousuke

    2018-06-01

    We provide a general framework for analyzing degree correlations between nodes separated by more than one step (i.e., beyond nearest neighbors) in complex networks. One joint and four conditional probability distributions are introduced to fully describe long-range degree correlations with respect to degrees k and k' of two nodes and shortest path length l between them. We present general relations among these probability distributions and clarify the relevance to nearest-neighbor degree correlations. Unlike nearest-neighbor correlations, some of these probability distributions are meaningful only in finite-size networks. Furthermore, as a baseline to determine the existence of intrinsic long-range degree correlations in a network other than inevitable correlations caused by the finite-size effect, the functional forms of these probability distributions for random networks are analytically evaluated within a mean-field approximation. The utility of our argument is demonstrated by applying it to real-world networks.

  15. Stochastic analysis of particle movement over a dune bed

    USGS Publications Warehouse

    Lee, Baum K.; Jobson, Harvey E.

    1977-01-01

    Stochastic models are available that can be used to predict the transport and dispersion of bed-material sediment particles in an alluvial channel. These models are based on the proposition that the movement of a single bed-material sediment particle consists of a series of steps of random length separated by rest periods of random duration and, therefore, application of the models requires a knowledge of the probability distributions of the step lengths, the rest periods, the elevation of particle deposition, and the elevation of particle erosion. The procedure was tested by determining distributions from bed profiles formed in a large laboratory flume with a coarse sand as the bed material. The elevation of particle deposition and the elevation of particle erosion can be considered to be identically distributed, and their distribution can be described by either a 'truncated Gaussian' or a 'triangular' density function. The conditional probability distribution of the rest period given the elevation of particle deposition closely followed the two-parameter gamma distribution. The conditional probability distribution of the step length given the elevation of particle erosion and the elevation of particle deposition also closely followed the two-parameter gamma density function. For a given flow, the scale and shape parameters describing the gamma probability distributions can be expressed as functions of bed-elevation. (Woodard-USGS)

  16. Using type IV Pearson distribution to calculate the probabilities of underrun and overrun of lists of multiple cases.

    PubMed

    Wang, Jihan; Yang, Kai

    2014-07-01

    An efficient operating room needs little underutilised and little overutilised time to achieve optimal cost efficiency. The probabilities of underrun and overrun of lists of cases can be estimated by a well-defined duration distribution of the lists. To propose a method of predicting the probabilities of underrun and overrun of lists of cases using Type IV Pearson distribution to support case scheduling. Six years of data were collected. The first 5 years of data were used to fit distributions and estimate parameters. The data from the last year were used as testing data to validate the proposed methods. The percentiles of the duration distribution of lists of cases were calculated by Type IV Pearson distribution and t-distribution. Monte Carlo simulation was conducted to verify the accuracy of percentiles defined by the proposed methods. Operating rooms in John D. Dingell VA Medical Center, United States, from January 2005 to December 2011. Differences between the proportion of lists of cases that were completed within the percentiles of the proposed duration distribution of the lists and the corresponding percentiles. Compared with the t-distribution, the proposed new distribution is 8.31% (0.38) more accurate on average and 14.16% (0.19) more accurate in calculating the probabilities at the 10th and 90th percentiles of the distribution, which is a major concern of operating room schedulers. The absolute deviations between the percentiles defined by Type IV Pearson distribution and those from Monte Carlo simulation varied from 0.20 min (0.01) to 0.43 min (0.03). Operating room schedulers can rely on the most recent 10 cases with the same combination of surgeon and procedure(s) for distribution parameter estimation to plan lists of cases. Values are mean (SEM). The proposed Type IV Pearson distribution is more accurate than the t-distribution to estimate the probabilities of underrun and overrun of lists of cases. However, as not all the individual case durations followed log-normal distributions, there was some deviation from the true duration distribution of the lists.

  17. Forecasts of non-Gaussian parameter spaces using Box-Cox transformations

    NASA Astrophysics Data System (ADS)

    Joachimi, B.; Taylor, A. N.

    2011-09-01

    Forecasts of statistical constraints on model parameters using the Fisher matrix abound in many fields of astrophysics. The Fisher matrix formalism involves the assumption of Gaussianity in parameter space and hence fails to predict complex features of posterior probability distributions. Combining the standard Fisher matrix with Box-Cox transformations, we propose a novel method that accurately predicts arbitrary posterior shapes. The Box-Cox transformations are applied to parameter space to render it approximately multivariate Gaussian, performing the Fisher matrix calculation on the transformed parameters. We demonstrate that, after the Box-Cox parameters have been determined from an initial likelihood evaluation, the method correctly predicts changes in the posterior when varying various parameters of the experimental setup and the data analysis, with marginally higher computational cost than a standard Fisher matrix calculation. We apply the Box-Cox-Fisher formalism to forecast cosmological parameter constraints by future weak gravitational lensing surveys. The characteristic non-linear degeneracy between matter density parameter and normalization of matter density fluctuations is reproduced for several cases, and the capabilities of breaking this degeneracy by weak-lensing three-point statistics is investigated. Possible applications of Box-Cox transformations of posterior distributions are discussed, including the prospects for performing statistical data analysis steps in the transformed Gaussianized parameter space.
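
    A minimal sketch of the Gaussianization step, using scipy's Box-Cox fit on a skewed sample that stands in for a posterior; the Fisher-matrix machinery itself is omitted.

      # Sketch of Gaussianizing a skewed sample with a Box-Cox transformation
      # before applying a Gaussian (Fisher-matrix-like) approximation.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(6)
      sample = rng.lognormal(mean=0.0, sigma=0.7, size=5000)  # skewed "posterior"

      transformed, lam = stats.boxcox(sample)          # ML estimate of lambda
      print(f"Box-Cox lambda: {lam:.3f}")
      print("skewness before:", stats.skew(sample).round(3))
      print("skewness after: ", stats.skew(transformed).round(3))

      # A Gaussian approximation is now more defensible in the transformed space
      mu, sigma = transformed.mean(), transformed.std(ddof=1)
      print(f"Gaussian fit in transformed space: mu={mu:.3f}, sigma={sigma:.3f}")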

  18. Generalized t-statistic for two-group classification.

    PubMed

    Komori, Osamu; Eguchi, Shinto; Copas, John B

    2015-06-01

    In the classic discriminant model of two multivariate normal distributions with equal variance matrices, the linear discriminant function is optimal both in terms of the log likelihood ratio and in terms of maximizing the standardized difference (the t-statistic) between the means of the two distributions. In a typical case-control study, normality may be sensible for the control sample but heterogeneity and uncertainty in diagnosis may suggest that a more flexible model is needed for the cases. We generalize the t-statistic approach by finding the linear function which maximizes a standardized difference but with data from one of the groups (the cases) filtered by a possibly nonlinear function U. We study conditions for consistency of the method and find the function U which is optimal in the sense of asymptotic efficiency. Optimality may also extend to other measures of discriminatory efficiency such as the area under the receiver operating characteristic curve. The optimal function U depends on a scalar probability density function which can be estimated non-parametrically using a standard numerical algorithm. A lasso-like version for variable selection is implemented by adding L1-regularization to the generalized t-statistic. Two microarray data sets in the study of asthma and various cancers are used as motivating examples. © 2014, The International Biometric Society.

  19. Parameterizing microphysical effects on variances and covariances of moisture and heat content using a multivariate probability density function: a study with CLUBB (tag MVCS)

    DOE PAGES

    Griffin, Brian M.; Larson, Vincent E.

    2016-11-25

    Microphysical processes, such as the formation, growth, and evaporation of precipitation, interact with variability and covariances (e.g., fluxes) in moisture and heat content. For instance, evaporation of rain may produce cold pools, which in turn may trigger fresh convection and precipitation. These effects are usually omitted or else crudely parameterized at subgrid scales in weather and climate models. A more formal approach is pursued here, based on predictive, horizontally averaged equations for the variances, covariances, and fluxes of moisture and heat content. These higher-order moment equations contain microphysical source terms. The microphysics terms can be integrated analytically, given a suitably simple warm-rain microphysics scheme and an approximate assumption about the multivariate distribution of cloud-related and precipitation-related variables. Performing the integrations provides exact expressions within an idealized context. A large-eddy simulation (LES) of a shallow precipitating cumulus case is performed here, and it indicates that the microphysical effects on (co)variances and fluxes can be large. In some budgets and altitude ranges, they are dominant terms. The analytic expressions for the integrals are implemented in a single-column, higher-order closure model. Interactive single-column simulations agree qualitatively with the LES. The analytic integrations form a parameterization of microphysical effects in their own right, and they also serve as benchmark solutions that can be compared to non-analytic integration methods.

  20. A tool for simulating collision probabilities of animals with marine renewable energy devices.

    PubMed

    Schmitt, Pál; Culloch, Ross; Lieber, Lilian; Molander, Sverker; Hammar, Linus; Kregting, Louise

    2017-01-01

    The mathematical problem of establishing a collision probability distribution is often not trivial. The shape and motion of the animal as well as of the device must be evaluated in a four-dimensional space (3D motion over time). Earlier work on wind and tidal turbines was limited to a simplified two-dimensional representation, which cannot be applied to many new structures. We present a numerical algorithm to obtain such probability distributions using transient, three-dimensional numerical simulations. The method is demonstrated using a sub-surface tidal kite as an example. Necessary pre- and post-processing of the data created by the model is explained, and numerical details, potential issues, and limitations in the application of the resulting probability distributions are highlighted.

  1. Lognormal Approximations of Fault Tree Uncertainty Distributions.

    PubMed

    El-Shanawany, Ashraf Ben; Ardron, Keith H; Walker, Simon P

    2018-01-26

    Fault trees are used in reliability modeling to create logical models of fault combinations that can lead to undesirable events. The output of a fault tree analysis (the top event probability) is expressed in terms of the failure probabilities of basic events that are input to the model. Typically, the basic event probabilities are not known exactly, but are modeled as probability distributions; therefore, the top event probability is also represented as an uncertainty distribution. Monte Carlo methods are generally used for evaluating the uncertainty distribution, but such calculations are computationally intensive and do not readily reveal the dominant contributors to the uncertainty. In this article, a closed-form approximation for the fault tree top event uncertainty distribution is developed, which is applicable when the uncertainties in the basic events of the model are lognormally distributed. The results of the approximate method are compared with results from two sampling-based methods: namely, the Monte Carlo method and the Wilks method based on order statistics. It is shown that the closed-form expression can provide a reasonable approximation to results obtained by Monte Carlo sampling, without incurring the computational expense. The Wilks method is found to be a useful means of providing an upper bound for the percentiles of the uncertainty distribution while being computationally inexpensive compared with full Monte Carlo sampling. The lognormal approximation method and Wilks's method appear to be attractive, practical alternatives for the evaluation of uncertainty in the output of fault trees and similar multilinear models. © 2018 Society for Risk Analysis.
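
    The closed-form idea can be illustrated by moment-matching a lognormal to Monte Carlo output for a toy multilinear top event (top = A OR (B AND C), in rare-event approximation). This is a sketch of the general approach, not the article's derivation; all parameter values are assumptions.

      # Sketch comparing Monte Carlo propagation through a small fault tree
      # with a lognormal fitted by matching the mean and variance.
      import numpy as np

      rng = np.random.default_rng(7)
      n = 200_000
      # Basic event probabilities modeled as lognormal uncertainty distributions
      pA = rng.lognormal(np.log(1e-3), 0.5, n)
      pB = rng.lognormal(np.log(1e-2), 0.7, n)
      pC = rng.lognormal(np.log(5e-2), 0.6, n)

      top = pA + pB * pC                               # multilinear top event

      # Moment-matched lognormal approximation to the top event distribution
      m, v = top.mean(), top.var()
      sigma2 = np.log(1.0 + v / m**2)
      mu = np.log(m) - 0.5 * sigma2
      approx_p95 = np.exp(mu + 1.645 * np.sqrt(sigma2))
      print("MC 95th percentile:       ", np.percentile(top, 95.0))
      print("lognormal 95th percentile:", approx_p95)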

  2. Quantum key distribution without the wavefunction

    NASA Astrophysics Data System (ADS)

    Niestegge, Gerd

    A well-known feature of quantum mechanics is the secure exchange of secret bit strings which can then be used as keys to encrypt messages transmitted over any classical communication channel. It is demonstrated that this quantum key distribution admits a much more general and abstract treatment than commonly thought. The results include some generalizations of the Hilbert space version of quantum key distribution, but are based upon a general nonclassical extension of conditional probability. A special state-independent conditional probability is identified as the origin of the superior security of quantum key distribution; this is a purely algebraic property of the quantum logic and represents the transition probability between the outcomes of two consecutive quantum measurements.

  3. The complexity of divisibility.

    PubMed

    Bausch, Johannes; Cubitt, Toby

    2016-09-01

    We address two sets of long-standing open questions in linear algebra and probability theory, from a computational complexity perspective: stochastic matrix divisibility, and divisibility and decomposability of probability distributions. We prove that finite divisibility of stochastic matrices is an NP-complete problem, and extend this result to nonnegative matrices, and completely-positive trace-preserving maps, i.e. the quantum analogue of stochastic matrices. We further prove a complexity hierarchy for the divisibility and decomposability of probability distributions, showing that finite distribution divisibility is in P, but decomposability is NP-hard. For the former, we give an explicit polynomial-time algorithm. All results on distributions extend to weak-membership formulations, proving that the complexity of these problems is robust to perturbations.

  4. Microbial Performance of Food Safety Control and Assurance Activities in a Fresh Produce Processing Sector Measured Using a Microbial Assessment Scheme and Statistical Modeling.

    PubMed

    Njage, Patrick Murigu Kamau; Sawe, Chemutai Tonui; Onyango, Cecilia Moraa; Habib, I; Njagi, Edmund Njeru; Aerts, Marc; Molenberghs, Geert

    2017-01-01

    Current approaches such as inspections, audits, and end product testing cannot detect the distribution and dynamics of microbial contamination. Despite the implementation of current food safety management systems, foodborne outbreaks linked to fresh produce continue to be reported. A microbial assessment scheme and statistical modeling were used to systematically assess the microbial performance of core control and assurance activities in five Kenyan fresh produce processing and export companies. Generalized linear mixed models and correlated random-effects joint models for multivariate clustered data followed by empirical Bayes estimates enabled the analysis of the probability of contamination across critical sampling locations (CSLs) and factories as a random effect. Salmonella spp. and Listeria monocytogenes were not detected in the final products. However, none of the processors attained the maximum safety level for environmental samples. Escherichia coli was detected in five of the six CSLs, including the final product. Among the processing-environment samples, the hand or glove swabs of personnel revealed a higher level of predicted contamination with E. coli, and 80% of the factories were E. coli positive at this CSL. End products showed higher predicted probabilities of having the lowest level of food safety compared with raw materials. The final products were E. coli positive despite the raw materials being E. coli negative for 60% of the processors. There was a higher probability of contamination with coliforms in water at the inlet than in the final rinse water. Four (80%) of the five assessed processors had poor to unacceptable counts of Enterobacteriaceae on processing surfaces. Personnel-, equipment-, and product-related hygiene measures to improve the performance of preventive and intervention measures are recommended.

  5. Probability distributions of hydraulic conductivity for the hydrogeologic units of the Death Valley regional ground-water flow system, Nevada and California

    USGS Publications Warehouse

    Belcher, Wayne R.; Sweetkind, Donald S.; Elliott, Peggy E.

    2002-01-01

    The use of geologic information such as lithology and rock properties is important to constrain conceptual and numerical hydrogeologic models. This geologic information is difficult to apply explicitly to numerical modeling and analyses because it tends to be qualitative rather than quantitative. This study uses a compilation of hydraulic-conductivity measurements to derive estimates of the probability distributions for several hydrogeologic units within the Death Valley regional ground-water flow system, a geologically and hydrologically complex region underlain by basin-fill sediments, volcanic, intrusive, sedimentary, and metamorphic rocks. Probability distributions of hydraulic conductivity for general rock types have been studied previously; however, this study provides more detailed definition of hydrogeologic units based on lithostratigraphy, lithology, alteration, and fracturing and compares the probability distributions to the aquifer test data. Results suggest that these probability distributions can be used for studies involving, for example, numerical flow modeling, recharge, evapotranspiration, and rainfall runoff, both for the hydrogeologic units in the region and for similar rock types elsewhere. Within the study area, fracturing appears to have the greatest influence on the hydraulic conductivity of carbonate bedrock hydrogeologic units. Similar to earlier studies, we find that alteration and welding in the Tertiary volcanic rocks greatly influence hydraulic conductivity. As alteration increases, hydraulic conductivity tends to decrease. Increasing degrees of welding appear to increase hydraulic conductivity because welding increases the brittleness of the volcanic rocks, thus increasing the amount of fracturing.

  6. Conducting Privacy-Preserving Multivariable Propensity Score Analysis When Patient Covariate Information Is Stored in Separate Locations.

    PubMed

    Bohn, Justin; Eddings, Wesley; Schneeweiss, Sebastian

    2017-03-15

    Distributed networks of health-care data sources are increasingly being utilized to conduct pharmacoepidemiologic database studies. Such networks may contain data that are not physically pooled but instead are distributed horizontally (separate patients within each data source) or vertically (separate measures within each data source) in order to preserve patient privacy. While multivariable methods for the analysis of horizontally distributed data are frequently employed, few practical approaches have been put forth to deal with vertically distributed health-care databases. In this paper, we propose 2 propensity score-based approaches to vertically distributed data analysis and test their performance using 5 example studies. We found that these approaches produced point estimates close to what could be achieved without partitioning. We further found a performance benefit (i.e., lower mean squared error) for sequentially passing a propensity score through each data domain (called the "sequential approach") as compared with fitting separate domain-specific propensity scores (called the "parallel approach"). These results were validated in a small simulation study. This proof-of-concept study suggests a new multivariable analysis approach to vertically distributed health-care databases that is practical, preserves patient privacy, and warrants further investigation for use in clinical research applications that rely on health-care databases. © The Author 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  7. Output Feedback Stabilization for a Class of Multi-Variable Bilinear Stochastic Systems with Stochastic Coupling Attenuation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Qichun; Zhou, Jinglin; Wang, Hong

    In this paper, stochastic coupling attenuation is investigated for a class of multi-variable bilinear stochastic systems and a novel output feedback m-block backstepping controller with linear estimator is designed, where gradient descent optimization is used to tune the design parameters of the controller. It has been shown that the trajectories of the closed-loop stochastic systems are bounded in the probability sense and the stochastic coupling of the system outputs can be effectively attenuated by the proposed control algorithm. Moreover, the stability of the stochastic systems is analyzed and the effectiveness of the proposed method has been demonstrated using a simulated example.

  8. Bayesian alternative to the ISO-GUM's use of the Welch Satterthwaite formula

    NASA Astrophysics Data System (ADS)

    Kacker, Raghu N.

    2006-02-01

    In certain disciplines, uncertainty is traditionally expressed as an interval about an estimate for the value of the measurand. Development of such uncertainty intervals with a stated coverage probability based on the International Organization for Standardization (ISO) Guide to the Expression of Uncertainty in Measurement (GUM) requires a description of the probability distribution for the value of the measurand. The ISO-GUM propagates the estimates and their associated standard uncertainties for various input quantities through a linear approximation of the measurement equation to determine an estimate and its associated standard uncertainty for the value of the measurand. This procedure does not yield a probability distribution for the value of the measurand. The ISO-GUM suggests that under certain conditions motivated by the central limit theorem the distribution for the value of the measurand may be approximated by a scaled-and-shifted t-distribution with effective degrees of freedom obtained from the Welch-Satterthwaite (W-S) formula. The approximate t-distribution may then be used to develop an uncertainty interval with a stated coverage probability for the value of the measurand. We propose an approximate normal distribution based on a Bayesian uncertainty as an alternative to the t-distribution based on the W-S formula. A benefit of the approximate normal distribution based on a Bayesian uncertainty is that it greatly simplifies the expression of uncertainty by eliminating altogether the need for calculating effective degrees of freedom from the W-S formula. In the special case where the measurand is the difference between two means, each evaluated from statistical analyses of independent normally distributed measurements with unknown and possibly unequal variances, the probability distribution for the value of the measurand is known to be a Behrens-Fisher distribution. We compare the performance of the approximate normal distribution based on a Bayesian uncertainty and the approximate t-distribution based on the W-S formula with respect to the Behrens-Fisher distribution. The approximate normal distribution is simpler and better in this case. A thorough investigation of the relative performance of the two approximate distributions would require comparison for a range of measurement equations by numerical methods.
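
    The W-S formula itself is compact enough to sketch. The example below computes the combined standard uncertainty and the effective degrees of freedom for the difference of two means with unequal variances, using illustrative sample values.

      # Sketch of the Welch-Satterthwaite effective degrees of freedom used by
      # the ISO-GUM for the difference between two independent means.
      import numpy as np

      # Sample standard deviations and sizes (illustrative values)
      s1, n1 = 0.8, 5
      s2, n2 = 1.5, 7

      u1_sq = s1**2 / n1                               # variance of mean 1
      u2_sq = s2**2 / n2                               # variance of mean 2
      u_c = np.sqrt(u1_sq + u2_sq)                     # combined standard uncertainty

      # W-S formula: nu_eff = u_c^4 / (u1^4/nu1 + u2^4/nu2), nu_i = n_i - 1
      nu_eff = (u1_sq + u2_sq)**2 / (u1_sq**2/(n1 - 1) + u2_sq**2/(n2 - 1))
      print(f"combined uncertainty: {u_c:.3f}")
      print(f"effective degrees of freedom: {nu_eff:.2f}")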

  9. Does Litter Size Variation Affect Models of Terrestrial Carnivore Extinction Risk and Management?

    PubMed Central

    Devenish-Nelson, Eleanor S.; Stephens, Philip A.; Harris, Stephen; Soulsbury, Carl; Richards, Shane A.

    2013-01-01

    Background: Individual variation in both survival and reproduction has the potential to influence extinction risk. Especially for rare or threatened species, reliable population models should adequately incorporate demographic uncertainty. Here, we focus on an important form of demographic stochasticity: variation in litter sizes. We use terrestrial carnivores as an example taxon, as they are frequently threatened or of economic importance. Since data on intraspecific litter size variation are often sparse, it is unclear what probability distribution should be used to describe the pattern of litter size variation for multiparous carnivores. Methodology/Principal Findings: We used litter size data on 32 terrestrial carnivore species to test the fit of 12 probability distributions. The influence of these distributions on quasi-extinction probabilities and the probability of successful disease control was then examined for three canid species – the island fox Urocyon littoralis, the red fox Vulpes vulpes, and the African wild dog Lycaon pictus. Best fitting probability distributions differed among the carnivores examined. However, the discretised normal distribution provided the best fit for the majority of species, because variation among litter-sizes was often small. Importantly, however, the outcomes of demographic models were generally robust to the distribution used. Conclusion/Significance: These results provide reassurance for those using demographic modelling for the management of less studied carnivores in which litter size variation is estimated using data from species with similar reproductive attributes. PMID:23469140

  10. Does litter size variation affect models of terrestrial carnivore extinction risk and management?

    PubMed

    Devenish-Nelson, Eleanor S; Stephens, Philip A; Harris, Stephen; Soulsbury, Carl; Richards, Shane A

    2013-01-01

    Individual variation in both survival and reproduction has the potential to influence extinction risk. Especially for rare or threatened species, reliable population models should adequately incorporate demographic uncertainty. Here, we focus on an important form of demographic stochasticity: variation in litter sizes. We use terrestrial carnivores as an example taxon, as they are frequently threatened or of economic importance. Since data on intraspecific litter size variation are often sparse, it is unclear what probability distribution should be used to describe the pattern of litter size variation for multiparous carnivores. We used litter size data on 32 terrestrial carnivore species to test the fit of 12 probability distributions. The influence of these distributions on quasi-extinction probabilities and the probability of successful disease control was then examined for three canid species - the island fox Urocyon littoralis, the red fox Vulpes vulpes, and the African wild dog Lycaon pictus. Best fitting probability distributions differed among the carnivores examined. However, the discretised normal distribution provided the best fit for the majority of species, because variation among litter-sizes was often small. Importantly, however, the outcomes of demographic models were generally robust to the distribution used. These results provide reassurance for those using demographic modelling for the management of less studied carnivores in which litter size variation is estimated using data from species with similar reproductive attributes.

  11. Landslide susceptibility map: from research to application

    NASA Astrophysics Data System (ADS)

    Fiorucci, Federica; Reichenbach, Paola; Ardizzone, Francesca; Rossi, Mauro; Felicioni, Giulia; Antonini, Guendalina

    2014-05-01

    A susceptibility map is an essential tool in environmental planning, used to evaluate landslide hazard and risk and to support correct and responsible management of the territory. Landslide susceptibility is the likelihood of a landslide occurring in an area on the basis of local terrain conditions. It can be expressed as the probability that any given region will be affected by landslides, i.e. an estimate of "where" landslides are likely to occur. In this work we present two examples of landslide susceptibility maps prepared for the Umbria Region and for the Perugia Municipality. These two maps were realized following official requests from the Regional and Municipal governments to the Research Institute for the Hydrogeological Protection (CNR-IRPI). The susceptibility map prepared for the Umbria Region represents the development of previous agreements focused on preparing: i) a landslide inventory map that was included in the Urban Territorial Planning (PUT) and ii) a series of maps for the Regional Plan for Multi-risk Prevention. The activities carried out for the Umbria Region were focused on defining and applying methods and techniques for landslide susceptibility zonation. Susceptibility maps were prepared exploiting a multivariate statistical model (linear discriminant analysis) for the five Civil Protection Alert Zones defined in the regional territory. The five resulting maps were tested and validated using the spatial distribution of recent landslide events that occurred in the region. The susceptibility map for the Perugia Municipality was prepared to be integrated as one of the cartographic products in the Municipal development plan (PRG - Piano Regolatore Generale) as required by the existing legislation. At the strategic level, one of the main objectives of the PRG is to establish a framework of knowledge and legal aspects for the management of geo-hydrological risk. At the national level, most of the susceptibility maps prepared for the PRG were, and still are, obtained by qualitatively classifying the territory according to slope classes. For the Perugia Municipality the susceptibility map was obtained by combining results of statistical multivariate models and a landslide density map. In particular, in the first phase a susceptibility zonation was prepared using different single and combined probabilistic multivariate techniques. The zonation was then combined and compared with the landslide density map in order to reclassify the false negatives (portions of the territory classified by the model as stable but affected by slope failures). The resulting semi-quantitative map was classified into five susceptibility classes. For each class a set of technical regulations was established to manage the territory.

  12. Neighbor-Dependent Ramachandran Probability Distributions of Amino Acids Developed from a Hierarchical Dirichlet Process Model

    PubMed Central

    Mitra, Rajib; Jordan, Michael I.; Dunbrack, Roland L.

    2010-01-01

    Distributions of the backbone dihedral angles of proteins have been studied for over 40 years. While many statistical analyses have been presented, only a handful of probability densities are publicly available for use in structure validation and structure prediction methods. The available distributions differ in a number of important ways, which determine their usefulness for various purposes. These include: 1) input data size and criteria for structure inclusion (resolution, R-factor, etc.); 2) filtering of suspect conformations and outliers using B-factors or other features; 3) secondary structure of input data (e.g., whether helix and sheet are included; whether beta turns are included); 4) the method used for determining probability densities ranging from simple histograms to modern nonparametric density estimation; and 5) whether they include nearest neighbor effects on the distribution of conformations in different regions of the Ramachandran map. In this work, Ramachandran probability distributions are presented for residues in protein loops from a high-resolution data set with filtering based on calculated electron densities. Distributions for all 20 amino acids (with cis and trans proline treated separately) have been determined, as well as 420 left-neighbor and 420 right-neighbor dependent distributions. The neighbor-independent and neighbor-dependent probability densities have been accurately estimated using Bayesian nonparametric statistical analysis based on the Dirichlet process. In particular, we used hierarchical Dirichlet process priors, which allow sharing of information between densities for a particular residue type and different neighbor residue types. The resulting distributions are tested in a loop modeling benchmark with the program Rosetta, and are shown to improve protein loop conformation prediction significantly. The distributions are available at http://dunbrack.fccc.edu/hdp. PMID:20442867

  13. Probabilistic analysis of preload in the abutment screw of a dental implant complex.

    PubMed

    Guda, Teja; Ross, Thomas A; Lang, Lisa A; Millwater, Harry R

    2008-09-01

    Screw loosening is a problem for a percentage of implants. A probabilistic analysis to determine the cumulative probability distribution of the preload, the probability of obtaining an optimal preload, and the probabilistic sensitivities identifying important variables is lacking. The purpose of this study was to examine the inherent variability of material properties, surface interactions, and applied torque in an implant system to determine the probability of obtaining desired preload values and to identify the significant variables that affect the preload. Using software programs, an abutment screw was subjected to a tightening torque and the preload was determined from finite element (FE) analysis. The FE model was integrated with probabilistic analysis software. Two probabilistic analysis methods (advanced mean value and Monte Carlo sampling) were applied to determine the cumulative distribution function (CDF) of preload. The coefficient of friction, elastic moduli, Poisson's ratios, and applied torque were modeled as random variables and defined by probability distributions. Separate probability distributions were determined for the coefficient of friction in well-lubricated and dry environments. The probabilistic analyses were performed and the cumulative distribution of preload was determined for each environment. A distinct difference was seen between the preload probability distributions generated in a dry environment (normal distribution, mean (SD): 347 (61.9) N) compared to a well-lubricated environment (normal distribution, mean (SD): 616 (92.2) N). The probability of obtaining a preload value within the target range was approximately 54% for the well-lubricated environment and only 0.02% for the dry environment. The preload is predominately affected by the applied torque and coefficient of friction between the screw threads and implant bore at lower and middle values of the preload CDF, and by the applied torque and the elastic modulus of the abutment screw at high values of the preload CDF. Lubrication at the threaded surfaces between the abutment screw and implant bore affects the preload developed in the implant complex. For the well-lubricated surfaces, only approximately 50% of implants will have preload values within the generally accepted range. This probability can be improved by applying a higher torque than normally recommended or a more closely controlled torque than typically achieved. It is also suggested that materials with higher elastic moduli be used in the manufacture of the abutment screw to achieve a higher preload.

  14. Comparative analysis through probability distributions of a data set

    NASA Astrophysics Data System (ADS)

    Cristea, Gabriel; Constantinescu, Dan Mihai

    2018-02-01

    In practice, probability distributions are applied in such diverse fields as risk analysis, reliability engineering, chemical engineering, hydrology, image processing, physics, market research, business and economic research, customer support, medicine, sociology, demography etc. This article highlights important aspects of fitting probability distributions to data and applying the analysis results to make informed decisions. There are a number of statistical methods available which can help us to select the best fitting model. Some of the graphs display both input data and fitted distributions at the same time, as probability density and cumulative distribution. The goodness of fit tests can be used to determine whether a certain distribution is a good fit. The main idea is to measure the "distance" between the data and the tested distribution, and compare that distance to some threshold values. Calculating the goodness of fit statistics also enables us to order the fitted distributions according to how well they fit the data. This particular feature is very helpful for comparing the fitted models. The paper presents a comparison of the most commonly used goodness of fit tests: Kolmogorov-Smirnov, Anderson-Darling, and Chi-Squared. A large set of data is analyzed and conclusions are drawn by visualizing the data, comparing multiple fitted distributions and selecting the best model. These graphs should be viewed as an addition to the goodness of fit tests.
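
    A short sketch of the distribution-ranking workflow with scipy, using the Kolmogorov-Smirnov distance after maximum likelihood fits; note the classical caveat that KS p-values are optimistic when parameters are estimated from the same data. The data set here is synthetic.

      # Sketch of ranking candidate distributions by goodness-of-fit statistics.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(8)
      data = rng.gamma(shape=2.0, scale=3.0, size=1000)  # example data set

      candidates = {"gamma": stats.gamma, "lognorm": stats.lognorm,
                    "norm": stats.norm}
      results = {}
      for name, dist in candidates.items():
          params = dist.fit(data)                       # ML parameter estimates
          ks_stat, p_value = stats.kstest(data, name, args=params)
          results[name] = (ks_stat, p_value)

      # Smaller KS distance means a better fit
      for name, (d, p) in sorted(results.items(), key=lambda kv: kv[1][0]):
          print(f"{name:8s} KS distance = {d:.4f}, p = {p:.3f}")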

  15. Impact of temporal probability in 4D dose calculation for lung tumors.

    PubMed

    Rouabhi, Ouided; Ma, Mingyu; Bayouth, John; Xia, Junyi

    2015-11-08

    The purpose of this study was to evaluate the dosimetric uncertainty in 4D dose calculation using three temporal probability distributions: uniform distribution, sinusoidal distribution, and patient-specific distribution derived from the patient respiratory trace. Temporal probability, defined as the fraction of time a patient spends in each respiratory amplitude, was evaluated in nine lung cancer patients. Four-dimensional computed tomography (4D CT), along with deformable image registration, was used to compute 4D dose incorporating the patient's respiratory motion. First, the dose of each of 10 phase CTs was computed using the same planning parameters as those used in 3D treatment planning based on the breath-hold CT. Next, deformable image registration was used to deform the dose of each phase CT to the breath-hold CT using the deformation map between the phase CT and the breath-hold CT. Finally, the 4D dose was computed by summing the deformed phase doses using their corresponding temporal probabilities. In this study, 4D dose calculated from the patient-specific temporal probability distribution was used as the ground truth. The dosimetric evaluation matrix included: 1) 3D gamma analysis, 2) mean tumor dose (MTD), 3) mean lung dose (MLD), and 4) lung V20. For seven out of nine patients, both uniform and sinusoidal temporal probability dose distributions were found to have an average gamma passing rate > 95% for both the lung and PTV regions. Compared with 4D dose calculated using the patient respiratory trace, doses using uniform and sinusoidal distribution showed a percentage difference on average of -0.1% ± 0.6% and -0.2% ± 0.4% in MTD, -0.2% ± 1.9% and -0.2% ± 1.3% in MLD, 0.09% ± 2.8% and -0.07% ± 1.8% in lung V20, -0.1% ± 2.0% and 0.08% ± 1.34% in lung V10, 0.47% ± 1.8% and 0.19% ± 1.3% in lung V5, respectively. We concluded that four-dimensional dose computed using either a uniform or sinusoidal temporal probability distribution can approximate four-dimensional dose computed using the patient-specific respiratory trace.
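
    The accumulation step reduces to a probability-weighted sum of deformed phase doses. The sketch below uses toy dose arrays in place of dose grids; the sinusoidal weights shown are one simple construction and may differ from the paper's exact definition.

      # Sketch of 4D dose accumulation: sum the phase doses, already deformed
      # to the reference anatomy, weighted by temporal probabilities.
      import numpy as np

      rng = np.random.default_rng(9)
      n_phases, grid = 10, (4, 4)                      # 10-phase 4D CT, tiny grid
      deformed_dose = rng.uniform(1.8, 2.2, (n_phases,) + grid)  # Gy per phase

      # Temporal probabilities: uniform vs. an assumed sinusoidal weighting
      w_uniform = np.full(n_phases, 1.0 / n_phases)
      phase = np.linspace(0.0, 2.0 * np.pi, n_phases, endpoint=False)
      w_sin = np.abs(np.sin(phase / 2.0))
      w_sin /= w_sin.sum()                             # normalize to sum to 1

      dose_uniform = np.tensordot(w_uniform, deformed_dose, axes=1)
      dose_sin = np.tensordot(w_sin, deformed_dose, axes=1)
      print("mean dose (uniform weights):   ", dose_uniform.mean())
      print("mean dose (sinusoidal weights):", dose_sin.mean())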

  16. Scaling symmetry, renormalization, and time series modeling: the case of financial assets dynamics.

    PubMed

    Zamparo, Marco; Baldovin, Fulvio; Caraglio, Michele; Stella, Attilio L

    2013-12-01

    We present and discuss a stochastic model of financial assets dynamics based on the idea of an inverse renormalization group strategy. With this strategy we construct the multivariate distributions of elementary returns based on the scaling with time of the probability density of their aggregates. In its simplest version the model is the product of an endogenous autoregressive component and a random rescaling factor designed to embody also exogenous influences. Mathematical properties like increments' stationarity and ergodicity can be proven. Thanks to the relatively low number of parameters, model calibration can be conveniently based on a method of moments, as exemplified in the case of historical data of the S&P500 index. The calibrated model accounts very well for many stylized facts, like volatility clustering, power-law decay of the volatility autocorrelation function, and multiscaling with time of the aggregated return distribution. In agreement with empirical evidence in finance, the dynamics is not invariant under time reversal, and, with suitable generalizations, skewness of the return distribution and leverage effects can be included. The analytical tractability of the model opens interesting perspectives for applications, for instance, in terms of obtaining closed formulas for derivative pricing. Further important features are the possibility of making contact, in certain limits, with autoregressive models widely used in finance and the possibility of partially resolving the long- and short-memory components of the volatility, with consistent results when applied to historical series.

  18. Positive mental health and well-being among a third level student population.

    PubMed

    Davoren, Martin P; Fitzgerald, Eimear; Shiely, Frances; Perry, Ivan J

    2013-01-01

    Much research on the health and well-being of third level students is focused on poor mental health leading to a dearth of information on positive mental health and well-being. Recently, the Warwick Edinburgh Mental Well-being scale (WEMWBS) was developed as a measurement of positive mental health and well-being. The aim of this research is to investigate the distribution and determinants of positive mental health and well-being in a large, broadly representative sample of third level students using WEMWBS. Undergraduate students from one large third level institution were sampled using probability proportional to size sampling. Questionnaires were distributed to students attending lectures in the randomly selected degrees. A total of 2,332 self-completed questionnaires were obtained, yielding a response rate of 51% based on students registered to relevant modules and 84% based on attendance. One-way ANOVAs and multivariate logistic regression were utilised to investigate factors associated with positive mental health and well-being. The sample was predominantly female (62.66%), in first year (46.9%) and living in their parents' house (42.4%) or in a rented house or flat (40.8%). In multivariate analysis adjusted for age and stratified by gender, no significant differences in WEMWBS score were observed by area of study, alcohol, smoking or drug use. WEMWBS scores were higher among male students with low levels of physical activity (p=0.04). Men and women reporting one or more sexual partners (p<0.001) were also more likely to report above average mental health and well-being. This is the first study to examine positive mental health and well-being scores in a third level student sample using WEMWBS. The findings suggest that students with a relatively adverse health and lifestyle profile have higher than average mental health and well-being. To confirm these results, this work needs to be replicated across other third level institutions.

  19. Probability of detecting atrazine/desethyl-atrazine and elevated concentrations of nitrate plus nitrate as nitrogen in ground water in the Idaho part of the western Snake River Plain

    USGS Publications Warehouse

    Donato, Mary M.

    2000-01-01

    As ground water continues to provide an ever-growing proportion of Idaho's drinking water, concerns about the quality of that resource are increasing. Pesticides (most commonly, atrazine/desethyl-atrazine, hereafter referred to as atrazine) and nitrite plus nitrate as nitrogen (hereafter referred to as nitrate) have been detected in many aquifers in the State. To provide a sound hydrogeologic basis for atrazine and nitrate management in southern Idaho—the largest region of land and water use in the State—the U.S. Geological Survey produced maps showing the probability of detecting these contaminants in ground water in the upper Snake River Basin (published in a 1998 report) and the western Snake River Plain (published in this report). The atrazine probability map for the western Snake River Plain was constructed by overlaying ground-water quality data with hydrogeologic and anthropogenic data in a geographic information system (GIS). A data set was produced in which each well had corresponding information on land use, geology, precipitation, soil characteristics, regional depth to ground water, well depth, water level, and atrazine use. These data were analyzed by logistic regression using a statistical software package. Several preliminary multivariate models were developed and those that best predicted the detection of atrazine were selected. The multivariate models then were entered into a GIS and the probability maps were produced. Land use, precipitation, soil hydrologic group, and well depth were significantly correlated with atrazine detections in the western Snake River Plain. These variables also were important in the 1998 probability study of the upper Snake River Basin. The effectiveness of the probability models for atrazine might be improved if more detailed data were available for atrazine application. A preliminary atrazine probability map for the entire Snake River Plain in Idaho, based on a data set representing that region, also was produced. In areas where this map overlaps the 1998 map of the upper Snake River Basin, the two maps show broadly similar probabilities of detecting atrazine. Logistic regression also was used to develop a preliminary statistical model that predicts the probability of detecting elevated nitrate in the western Snake River Plain. A nitrate probability map was produced from this model. Results showed that elevated nitrate concentrations were correlated with land use, soil organic content, well depth, and water level. Detailed information on nitrate input, specifically fertilizer application, might have improved the effectiveness of this model.
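
    The statistical core of this mapping procedure is ordinary logistic regression on well-level covariates. A minimal sketch (scikit-learn assumed; the covariates and detections below are synthetic stand-ins for the land-use, precipitation, soil, and well-depth variables named above, not USGS data):

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(2)
    n = 300
    # Synthetic well records: [agricultural land fraction, precipitation (mm/yr),
    # soil hydrologic group (coded 1-4), well depth (m)].
    X = np.column_stack([
        rng.random(n),
        rng.normal(300.0, 60.0, n),
        rng.integers(1, 5, n).astype(float),
        rng.normal(50.0, 15.0, n),
    ])
    # Synthetic detections, made more likely by agricultural land use and shallow wells.
    logit = -2.0 + 3.0 * X[:, 0] + 0.004 * X[:, 1] - 0.01 * X[:, 3]
    y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

    model = LogisticRegression(max_iter=1000).fit(X, y)
    p_detect = model.predict_proba(X)[:, 1]   # per-well detection probabilities for the map
    print(p_detect[:5].round(3))
    ```

    In the GIS step, the fitted coefficients are applied to every grid cell's covariates, and the resulting probabilities are rendered as the map.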

  20. Creating Hierarchical Pores by Controlled Linker Thermolysis in Multivariate Metal-Organic Frameworks.

    PubMed

    Feng, Liang; Yuan, Shuai; Zhang, Liang-Liang; Tan, Kui; Li, Jia-Luo; Kirchon, Angelo; Liu, Ling-Mei; Zhang, Peng; Han, Yu; Chabal, Yves J; Zhou, Hong-Cai

    2018-02-14

    Sufficient pore size, appropriate stability, and hierarchical porosity are three prerequisites for open frameworks designed for drug delivery, enzyme immobilization, and catalysis involving large molecules. Herein, we report a powerful and general strategy, linker thermolysis, to construct ultrastable hierarchically porous metal-organic frameworks (HP-MOFs) with tunable pore size distribution. Linker instability, usually an undesirable trait of MOFs, was exploited to create mesopores by generating crystal defects throughout a microporous MOF crystal via thermolysis. The crystallinity and stability of HP-MOFs remain after thermolabile linkers are selectively removed from multivariate metal-organic frameworks (MTV-MOFs) through a decarboxylation process. A domain-based linker spatial distribution was found to be critical for creating hierarchical pores inside MTV-MOFs. Furthermore, linker thermolysis promotes the formation of ultrasmall metal oxide nanoparticles immobilized in an open framework that exhibits high catalytic activity for Lewis acid-catalyzed reactions. Most importantly, this work provides fresh insights into the connection between linker apportionment and vacancy distribution, which may shed light on probing the disordered linker apportionment in multivariate systems, a long-standing challenge in the study of MTV-MOFs.

  1. Transient Properties of Probability Distribution for a Markov Process with Size-dependent Additive Noise

    NASA Astrophysics Data System (ADS)

    Yamada, Yuhei; Yamazaki, Yoshihiro

    2018-04-01

    This study considered a stochastic model for cluster growth in a Markov process with a cluster size dependent additive noise. According to this model, the probability distribution of the cluster size transiently becomes an exponential or a log-normal distribution depending on the initial condition of the growth. In this letter, a master equation is obtained for this model, and derivation of the distributions is discussed.

  2. q-Gaussian distributions and multiplicative stochastic processes for analysis of multiple financial time series

    NASA Astrophysics Data System (ADS)

    Sato, Aki-Hiro

    2010-12-01

    This study considers q-Gaussian distributions and stochastic differential equations with both multiplicative and additive noises. In the M-dimensional case a q-Gaussian distribution can be theoretically derived as a stationary probability distribution of the multiplicative stochastic differential equation with mutually independent multiplicative and additive noises. Using the proposed stochastic differential equation, a method to evaluate a default probability under a given risk buffer is presented.
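
    As a concrete illustration of this model class, here is a one-dimensional Euler-Maruyama simulation of dx = -γx dt + σ_m x dW₁ + σ_a dW₂ with independent noises (the coefficients are illustrative choices, not taken from the paper); the stationary law of such a process is heavy-tailed, as a q-Gaussian should be:

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    gamma_, sig_m, sig_a = 1.0, 0.7, 0.5       # illustrative coefficients, not calibrated
    dt, n_steps, n_paths = 1e-3, 20_000, 2_000
    sqrt_dt = np.sqrt(dt)

    x = np.zeros(n_paths)
    for _ in range(n_steps):
        dw1 = rng.normal(0.0, sqrt_dt, n_paths)   # multiplicative noise increment
        dw2 = rng.normal(0.0, sqrt_dt, n_paths)   # independent additive noise increment
        x += -gamma_ * x * dt + sig_m * x * dw1 + sig_a * dw2

    # A q-Gaussian stationary law shows up as positive excess kurtosis (heavy tails).
    excess_kurtosis = ((x - x.mean()) ** 4).mean() / x.var() ** 2 - 3.0
    print(f"std = {x.std():.3f}, excess kurtosis = {excess_kurtosis:.2f}")
    ```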

  3. Net present value probability distributions from decline curve reserves estimates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Simpson, D.E.; Huffman, C.H.; Thompson, R.S.

    1995-12-31

    This paper demonstrates how reserves probability distributions can be used to develop net present value (NPV) distributions. NPV probability distributions were developed from the rate and reserves distributions presented in SPE 28333. This real data study used practicing engineers' evaluations of production histories. Two approaches were examined to quantify portfolio risk. The first approach, the NPV Relative Risk Plot, compares the mean NPV with the NPV relative risk ratio for the portfolio. The relative risk ratio is the NPV standard deviation (σ) divided by the mean (μ) NPV. The second approach, a Risk-Return Plot, is a plot of the mean (μ) discounted cash flow rate of return (DCFROR) versus the standard deviation (σ) of the DCFROR distribution. This plot provides a risk-return relationship for comparing various portfolios. These methods may help evaluate property acquisition and divestiture alternatives and assess the relative risk of a suite of wells or fields for bank loans.
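
    Both plots reduce to a mean and a standard deviation computed over a simulated distribution. A minimal sketch with hypothetical NPV draws (not the SPE 28333 data):

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    # Hypothetical NPV draws for a portfolio (e.g., propagated from reserves
    # uncertainty), in millions of dollars.
    npv = rng.lognormal(mean=2.0, sigma=0.5, size=10_000) - 5.0

    mu, sigma = npv.mean(), npv.std()
    relative_risk = sigma / mu                 # the NPV relative risk ratio (sigma/mu)
    print(f"mean NPV = {mu:.2f}, sigma = {sigma:.2f}, relative risk = {relative_risk:.2f}")
    ```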

  4. Optimal random search for a single hidden target.

    PubMed

    Snider, Joseph

    2011-01-01

    A single target is hidden at a location chosen from a predetermined probability distribution. Then, a searcher must find a second probability distribution from which random search points are sampled such that the target is found in the minimum number of trials. Here it will be shown that if the searcher must get very close to the target to find it, then the best search distribution is proportional to the square root of the target distribution regardless of dimension. For a Gaussian target distribution, the optimum search distribution is approximately a Gaussian with a standard deviation that varies inversely with how close the searcher must be to the target to find it. For a network where the searcher randomly samples nodes and looks for the fixed target along edges, the optimum is either to sample a node with probability proportional to the square root of the out-degree plus 1 or not to do so at all.
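
    The square-root rule is easy to check numerically: for a Gaussian target N(0, σ²), the square root of the density is proportional to an N(0, 2σ²) density, so in the small-radius limit a search spread of σ√2 should minimize the expected number of trials. A small Monte Carlo sketch (the hit radius and sample sizes are arbitrary choices):

    ```python
    import numpy as np

    rng = np.random.default_rng(5)
    sigma, eps, n_targets = 1.0, 0.02, 1_000   # target spread, hit radius (both arbitrary)

    def mean_trials(search_std: float) -> float:
        """Average number of search points drawn until one lands within eps of the target."""
        total = 0
        for _ in range(n_targets):
            target = rng.normal(0.0, sigma)
            n = 1
            while abs(rng.normal(0.0, search_std) - target) >= eps:
                n += 1
            total += n
        return total / n_targets

    # sqrt(density of N(0, sigma^2)) is proportional to an N(0, 2 sigma^2) density,
    # so a search std of sigma * sqrt(2) should need the fewest trials on average.
    for s in (sigma, sigma * np.sqrt(2.0), 2.0 * sigma):
        print(f"search std {s:.2f}: ~{mean_trials(s):.0f} trials")
    ```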

  5. A Probability Based Framework for Testing the Missing Data Mechanism

    ERIC Educational Resources Information Center

    Lin, Johnny Cheng-Han

    2013-01-01

    Many methods exist for imputing missing data but fewer methods have been proposed to test the missing data mechanism. Little (1988) introduced a multivariate chi-square test for the missing completely at random data mechanism (MCAR) that compares observed means for each pattern with expectation-maximization (EM) estimated means. As an alternative,…

  6. On Arithmetic-Geometric-Mean Polynomials

    ERIC Educational Resources Information Center

    Griffiths, Martin; MacHale, Des

    2017-01-01

    We study here an aspect of an infinite set "P" of multivariate polynomials, the elements of which are associated with the arithmetic-geometric-mean inequality. In particular, we show in this article that there exist infinite subsets of "P" for which every element may be expressed as a finite sum of squares of real…

  7. Age of Parental Concern, Diagnosis, and Service Initiation among Children with Autism Spectrum Disorder

    ERIC Educational Resources Information Center

    Zablotsky, Benjamin; Colpe, Lisa J.; Pringle, Beverly A.; Kogan, Michael D.; Rice, Catherine; Blumberg, Stephen J.

    2017-01-01

    Children with autism spectrum disorder (ASD) require substantial support to address the core symptoms of ASD and co-occurring behavioral/developmental conditions. This study explores the early diagnostic experiences of school-aged children with ASD using survey data from a large probability-based national sample. Multivariate linear regressions…

  8. Copula-based analysis of rhythm

    NASA Astrophysics Data System (ADS)

    García, J. E.; González-López, V. A.; Viola, M. L. Lanfredi

    2016-06-01

    In this paper we establish stochastic profiles of the rhythm for three languages: English, Japanese and Spanish. We model the increase or decrease of the acoustical energy, collected into three bands coming from the acoustic signal. The number of parameters needed to specify a discrete multivariate Markov chain grows exponentially with the order and dimension of the chain. In this case the size of the database is not large enough for a consistent estimation of the model. We apply a strategy to estimate a multivariate process with an order greater than the order achieved using standard procedures. The new strategy consists of obtaining a partition of the state space which is constructed from a combination of the partitions corresponding to the three marginal processes, one for each band of energy, and the partition coming from the multivariate Markov chain. Then, all the partitions are linked using a copula, in order to estimate the transition probabilities.

  9. Effect of abdominopelvic abscess drain size on drainage time and probability of occlusion.

    PubMed

    Rotman, Jessica A; Getrajdman, George I; Maybody, Majid; Erinjeri, Joseph P; Yarmohammadi, Hooman; Sofocleous, Constantinos T; Solomon, Stephen B; Boas, F Edward

    2017-04-01

    The purpose of this study is to determine whether larger abdominopelvic abscess drains reduce the time required for abscess resolution or the probability of tube occlusion. 144 consecutive patients who underwent abscess drainage at a single institution were reviewed retrospectively. Larger initial drain size did not reduce drainage time, drain occlusion, or drain exchanges (P > .05). Subgroup analysis did not find any type of collection that benefitted from larger drains. A multivariate model predicting drainage time showed that large collections (>200 mL) required 16 days longer drainage time than small collections (<50 mL). Collections with a fistula to bowel required 17 days longer drainage time than collections without a fistula. Initial drain size and the viscosity of the fluid in the collection had no significant effect on drainage time in the multivariate model. 8 F drains are adequate for initial drainage of most serous and serosanguineous collections. 10 F drains are adequate for initial drainage of most purulent or bloody collections. Copyright © 2016 Elsevier Inc. All rights reserved.

  10. Multi-scale Characterization and Modeling of Surface Slope Probability Distribution for ~20-km Diameter Lunar Craters

    NASA Astrophysics Data System (ADS)

    Mahanti, P.; Robinson, M. S.; Boyd, A. K.

    2013-12-01

    Craters ~20-km diameter and above significantly shaped the lunar landscape. The statistical nature of the slope distribution on their walls and floors dominates the overall slope distribution statistics for the lunar surface. Slope statistics are inherently useful for characterizing the current topography of the surface, determining accurate photometric and surface scattering properties, and in defining lunar surface trafficability [1-4]. Earlier experimental studies on the statistical nature of lunar surface slopes were restricted either by resolution limits (Apollo era photogrammetric studies) or by model error considerations (photoclinometric and radar scattering studies) where the true nature of the slope probability distribution was not discernible at baselines smaller than a kilometer [2,3,5]. Accordingly, historical modeling of lunar surface slope probability distributions for applications such as scattering theory development or rover traversability assessment is more general in nature (use of simple statistical models such as the Gaussian distribution [1,2,5,6]). With the advent of high resolution, high precision topographic models of the Moon [7,8], slopes in lunar craters can now be obtained at baselines as low as 6 meters, allowing unprecedented multi-scale (multiple baselines) modeling possibilities for slope probability distributions. Topographic analysis (Lunar Reconnaissance Orbiter Camera (LROC) Narrow Angle Camera (NAC) 2-m digital elevation models (DEM)) of ~20-km diameter Copernican lunar craters revealed generally steep slopes on interior walls (30° to 36°, locally exceeding 40°) over 15-meter baselines [9]. In this work, we extend the analysis from a probability distribution modeling point of view with NAC DEMs to characterize the slope statistics for the floors and walls of the same ~20-km Copernican lunar craters. The difference in slope standard deviations between the Gaussian approximation and the actual distribution (2-meter sampling) was computed over multiple scales. This slope analysis showed that local slope distributions are non-Gaussian for both crater walls and floors. Over larger baselines (~100 meters), crater wall slope probability distributions approximate Gaussian distributions better, but have long distribution tails. Crater floor probability distributions, however, were always asymmetric (for the baseline scales analyzed) and less affected by baseline scale variations. Accordingly, our results suggest that use of long-tailed probability distributions (like Cauchy) and a baseline-dependent multi-scale model can be more effective in describing the slope statistics of lunar topography. References: [1] Moore, H. (1971), JGR, 75(11). [2] Marcus, A. H. (1969), JGR, 74(22). [3] R. J. Pike (1970), U.S. Geological Survey Working Paper. [4] N. C. Costes, J. E. Farmer and E. B. George (1972), NASA Technical Report TR R-401. [5] M. N. Parker and G. L. Tyler (1973), Radio Science, 8(3), 177-184. [6] Alekseev, V. A. et al. (1968), Soviet Astronomy, Vol. 11, p. 860. [7] Burns et al. (2012), Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XXXIX-B4, 483-488. [8] Smith et al. (2010), GRL 37, L18204, DOI: 10.1029/2010GL043751. [9] Wagner R., Robinson, M., Speyerer E., Mahanti, P., LPSC 2013, #2924.
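
    The closing recommendation, to prefer long-tailed families such as the Cauchy over a Gaussian for local slope statistics, can be illustrated with SciPy on any heavy-tailed sample (the synthetic slopes below are a stand-in, not LROC NAC data):

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    # Synthetic stand-in for a local slope sample (degrees): heavy-tailed around ~30.
    slopes = 30.0 + 4.0 * rng.standard_t(df=2, size=5_000)

    for name in ("norm", "cauchy"):
        dist = getattr(stats, name)
        params = dist.fit(slopes)
        ks = stats.kstest(slopes, name, args=params).statistic
        print(f"{name:7s} KS distance = {ks:.4f}")   # the long-tailed family should win
    ```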

  11. Development and application of an empirical probability distribution for the prediction error of re-entry body maximum dynamic pressure

    NASA Technical Reports Server (NTRS)

    Lanzi, R. James; Vincent, Brett T.

    1993-01-01

    The relationship between actual and predicted re-entry maximum dynamic pressure is characterized using a probability density function and a cumulative distribution function derived from sounding rocket flight data. This paper explores the properties of this distribution and demonstrates applications of this data with observed sounding rocket re-entry body damage characteristics to assess probabilities of sustaining various levels of heating damage. The results from this paper effectively bridge the gap existing in sounding rocket reentry analysis between the known damage level/flight environment relationships and the predicted flight environment.

  12. Probability and the changing shape of response distributions for orientation.

    PubMed

    Anderson, Britt

    2014-11-18

    Spatial attention and feature-based attention are regarded as two independent mechanisms for biasing the processing of sensory stimuli. Feature attention is held to be a spatially invariant mechanism that advantages a single feature per sensory dimension. In contrast to the prediction of location independence, I found that participants were able to report the orientation of a briefly presented visual grating better for targets defined by high probability conjunctions of features and locations even when orientations and locations were individually uniform. The advantage for high-probability conjunctions was accompanied by changes in the shape of the response distributions. High-probability conjunctions had error distributions that were not normally distributed but demonstrated increased kurtosis. The increase in kurtosis could be explained as a change in the variances of the component tuning functions that comprise a population mixture. By changing the mixture distribution of orientation-tuned neurons, it is possible to change the shape of the discrimination function. This prompts the suggestion that attention may not "increase" the quality of perceptual processing in an absolute sense but rather prioritizes some stimuli over others. This results in an increased number of highly accurate responses to probable targets and, simultaneously, an increase in the number of very inaccurate responses. © 2014 ARVO.

  13. Joint probabilities and quantum cognition

    NASA Astrophysics Data System (ADS)

    de Barros, J. Acacio

    2012-12-01

    In this paper we discuss the existence of joint probability distributions for quantumlike response computations in the brain. We do so by focusing on a contextual neural-oscillator model shown to reproduce the main features of behavioral stimulus-response theory. We then exhibit a simple example of contextual random variables not having a joint probability distribution, and describe how such variables can be obtained from neural oscillators, but not from a quantum observable algebra.

  14. Properties of the probability distribution associated with the largest event in an earthquake cluster and their implications to foreshocks.

    PubMed

    Zhuang, Jiancang; Ogata, Yosihiko

    2006-04-01

    The space-time epidemic-type aftershock sequence model is a stochastic branching process in which earthquake activity is classified into background and clustering components and each earthquake triggers other earthquakes independently according to certain rules. This paper gives the probability distributions associated with the largest event in a cluster and their properties for all three cases when the process is subcritical, critical, and supercritical. One of the direct uses of these probability distributions is to evaluate the probability of an earthquake being a foreshock, and the magnitude distributions of foreshock and nonforeshock earthquakes. To verify these theoretical results, the Japan Meteorological Agency earthquake catalog is analyzed. The proportion of events that have one or more larger descendants among all events is found to be as high as about 15%. When the differences between background events and triggered events in the behavior of triggering children are considered, a background event has a probability of about 8% of being a foreshock. This probability decreases as the magnitude of the background event increases. These results, obtained from a complicated clustering model, where the characteristics of background events and triggered events are different, are consistent with the results obtained in [Ogata, Geophys. J. Int. 127, 17 (1996)] by using the conventional single-linked cluster declustering method.

  15. The impact of covariance misspecification in multivariate Gaussian mixtures on estimation and inference: an application to longitudinal modeling.

    PubMed

    Heggeseth, Brianna C; Jewell, Nicholas P

    2013-07-20

    Multivariate Gaussian mixtures are a class of models that provide a flexible parametric approach for the representation of heterogeneous multivariate outcomes. When the outcome is a vector of repeated measurements taken on the same subject, there is often inherent dependence between observations. However, a common covariance assumption is conditional independence; that is, given the mixture component label, the outcomes for subjects are independent. In this paper, we study, through asymptotic bias calculations and simulation, the impact of covariance misspecification in multivariate Gaussian mixtures. Although maximum likelihood estimators of regression and mixing probability parameters are not consistent under misspecification, they have little asymptotic bias when mixture components are well separated or if the assumed correlation is close to the truth even when the covariance is misspecified. We also present a robust standard error estimator and show that it outperforms conventional estimators in simulations and can indicate that the model is misspecified. Body mass index data from a national longitudinal study are used to demonstrate the effects of misspecification on potential inferences made in practice. Copyright © 2013 John Wiley & Sons, Ltd.
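
    The effect studied here can be reproduced in miniature by fitting a mixture under a deliberately misspecified diagonal covariance (the conditional-independence assumption) and comparing the estimated mixing probabilities with a correctly specified full-covariance fit; a sketch with scikit-learn on synthetic correlated outcomes:

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(7)
    # Two components of correlated bivariate outcomes (e.g., repeated measures), 70/30 mix.
    cov = np.array([[1.0, 0.8], [0.8, 1.0]])      # strong within-subject correlation
    x = np.vstack([
        rng.multivariate_normal([0.0, 0.0], cov, 700),
        rng.multivariate_normal([2.0, 2.0], cov, 300),
    ])

    for cov_type in ("full", "diag"):             # "diag" imposes conditional independence
        gm = GaussianMixture(n_components=2, covariance_type=cov_type, random_state=0).fit(x)
        print(cov_type, "estimated mixing weights:", gm.weights_.round(3))
    ```

    With well-separated components, the diagonal fit's weights stay close to the truth, mirroring the paper's asymptotic-bias findings.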

  16. Probability of detecting atrazine/desethyl-atrazine and elevated concentrations of nitrate (NO2+NO3-N) in ground water in the Idaho part of the upper Snake River basin

    USGS Publications Warehouse

    Rupert, Michael G.

    1998-01-01

    Draft Federal regulations may require that each State develop a State Pesticide Management Plan for the herbicides atrazine, alachlor, cyanazine, metolachlor, and simazine. This study developed maps that the Idaho State Department of Agriculture might use to predict the probability of detecting atrazine and desethyl-atrazine (a breakdown product of atrazine) in ground water in the Idaho part of the upper Snake River Basin. These maps can be incorporated in the State Pesticide Management Plan and help provide a sound hydrogeologic basis for atrazine management in the study area. Maps showing the probability of detecting atrazine/desethyl-atrazine in ground water were developed as follows: (1) Ground-water monitoring data were overlaid with hydrogeologic and anthropogenic data using a geographic information system to produce a data set in which each well had corresponding data on atrazine use, depth to ground water, geology, land use, precipitation, soils, and well depth. These data then were downloaded to a statistical software package for analysis by logistic regression. (2) Individual (univariate) relations between atrazine/desethyl-atrazine in ground water and atrazine use, depth to ground water, geology, land use, precipitation, soils, and well depth data were evaluated to identify those independent variables significantly related to atrazine/desethyl-atrazine detections. (3) Several preliminary multivariate models with various combinations of independent variables were constructed. (4) The multivariate models which best predicted the presence of atrazine/desethyl-atrazine in ground water were selected. (5) The multivariate models were entered into the geographic information system and the probability maps were constructed. Two models which best predicted the presence of atrazine/desethyl-atrazine in ground water were selected; one with and one without atrazine use. Correlations of the predicted probabilities of atrazine/desethyl-atrazine in ground water with the percent of actual detections were good; r-squared values were 0.91 and 0.96, respectively. Models were verified using a second set of ground-water quality data. Verification showed that wells with water containing atrazine/desethyl-atrazine had significantly higher probability ratings than wells with water containing no atrazine/desethyl-atrazine (p < 0.002). Logistic regression also was used to develop a preliminary model to predict the probability of nitrite plus nitrate as nitrogen concentrations greater than background levels of 2 milligrams per liter. A direct comparison between the atrazine/desethyl-atrazine and nitrite plus nitrate as nitrogen probability maps was possible because the same ground-water monitoring, hydrogeologic, and anthropogenic data were used to develop both maps. Land use, precipitation, soil hydrologic group, and well depth were significantly related with atrazine/desethyl-atrazine detections. Depth to water, land use, and soil drainage were significantly related with elevated nitrite plus nitrate as nitrogen concentrations. The differences between atrazine/desethyl-atrazine and nitrite plus nitrate as nitrogen relations were attributed to differences in chemical behavior of these compounds in the environment and possibly to differences in the extent of use and rates of their application.

  17. Does Breast Cancer Drive the Building of Survival Probability Models among States? An Assessment of Goodness of Fit for Patient Data from SEER Registries

    PubMed

    Khan, Hafiz; Saxena, Anshul; Perisetti, Abhilash; Rafiq, Aamrin; Gabbidon, Kemesha; Mende, Sarah; Lyuksyutova, Maria; Quesada, Kandi; Blakely, Summre; Torres, Tiffany; Afesse, Mahlet

    2016-12-01

    Background: Breast cancer is a worldwide public health concern and is the most prevalent type of cancer in women in the United States. This study concerned the best fit of statistical probability models on the basis of survival times for nine state cancer registries: California, Connecticut, Georgia, Hawaii, Iowa, Michigan, New Mexico, Utah, and Washington. Materials and Methods: A probability random sampling method was applied to select and extract records of 2,000 breast cancer patients from the Surveillance Epidemiology and End Results (SEER) database for each of the nine state cancer registries used in this study. EasyFit software was utilized to identify the best probability models by using goodness of fit tests, and to estimate parameters for various statistical probability distributions that fit survival data. Results: Statistical analysis for the summary of statistics is reported for each of the states for the years 1973 to 2012. Kolmogorov-Smirnov, Anderson-Darling, and Chi-squared goodness of fit test values were used for survival data, the highest values of goodness of fit statistics being considered indicative of the best fit survival model for each state. Conclusions: It was found that California, Connecticut, Georgia, Iowa, New Mexico, and Washington followed the Burr probability distribution, while the Dagum probability distribution gave the best fit for Michigan and Utah, and Hawaii followed the Gamma probability distribution. These findings highlight differences between states through selected sociodemographic variables and also demonstrate probability modeling differences in breast cancer survival times. The results of this study can be used to guide healthcare providers and researchers for further investigations into social and environmental factors in order to reduce the occurrence of and mortality due to breast cancer. Creative Commons Attribution License
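
    The same selection procedure can be run with SciPy instead of EasyFit; in SciPy, burr12 is the Burr type XII family and burr is Burr type III, which corresponds to the Dagum family. The survival times below are synthetic, not SEER records:

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(8)
    survival_months = rng.gamma(shape=1.5, scale=40.0, size=2_000)  # synthetic survival times

    # burr12 = Burr type XII; burr = Burr type III (Dagum family); gamma as for Hawaii.
    for name in ("burr12", "burr", "gamma"):
        dist = getattr(stats, name)
        params = dist.fit(survival_months, floc=0)   # pin the location at zero for lifetimes
        ks = stats.kstest(survival_months, name, args=params)
        print(f"{name:7s} KS={ks.statistic:.4f} p={ks.pvalue:.3f}")
    ```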

  18. Role of the site of synaptic competition and the balance of learning forces for Hebbian encoding of probabilistic Markov sequences

    PubMed Central

    Bouchard, Kristofer E.; Ganguli, Surya; Brainard, Michael S.

    2015-01-01

    The majority of distinct sensory and motor events occur as temporally ordered sequences with rich probabilistic structure. Sequences can be characterized by the probability of transitioning from the current state to upcoming states (forward probability), as well as the probability of having transitioned to the current state from previous states (backward probability). Despite the prevalence of probabilistic sequencing of both sensory and motor events, the Hebbian mechanisms that mold synapses to reflect the statistics of experienced probabilistic sequences are not well understood. Here, we show through analytic calculations and numerical simulations that Hebbian plasticity (correlation, covariance, and STDP) with pre-synaptic competition can develop synaptic weights equal to the conditional forward transition probabilities present in the input sequence. In contrast, post-synaptic competition can develop synaptic weights proportional to the conditional backward probabilities of the same input sequence. We demonstrate that to stably reflect the conditional probability of a neuron's inputs and outputs, local Hebbian plasticity requires balance between competitive learning forces that promote synaptic differentiation and homogenizing learning forces that promote synaptic stabilization. The balance between these forces dictates a prior over the distribution of learned synaptic weights, strongly influencing both the rate at which structure emerges and the entropy of the final distribution of synaptic weights. Together, these results demonstrate a simple correspondence between the biophysical organization of neurons, the site of synaptic competition, and the temporal flow of information encoded in synaptic weights by Hebbian plasticity while highlighting the utility of balancing learning forces to accurately encode probability distributions, and prior expectations over such probability distributions. PMID:26257637

  19. Multivariate meta-analysis: a robust approach based on the theory of U-statistic.

    PubMed

    Ma, Yan; Mazumdar, Madhu

    2011-10-30

    Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analysis with a small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform equally well to REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods is illustrated by their application to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing significance of between-study heterogeneity and for extending the work to the meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.

  20. Integrated GIS and multivariate statistical analysis for regional scale assessment of heavy metal soil contamination: A critical review.

    PubMed

    Hou, Deyi; O'Connor, David; Nathanail, Paul; Tian, Li; Ma, Yan

    2017-12-01

    Heavy metal soil contamination is associated with potential toxicity to humans or ecotoxicity. Scholars have increasingly used a combination of geographical information science (GIS) with geostatistical and multivariate statistical analysis techniques to examine the spatial distribution of heavy metals in soils at a regional scale. A review of such studies showed that most soil sampling programs were based on grid patterns and composite sampling methodologies. Many programs intended to characterize various soil types and land use types. The most often used sampling depth intervals were 0-0.10 m, or 0-0.20 m, below surface; and the sampling densities used ranged from 0.0004 to 6.1 samples per km², with a median of 0.4 samples per km². The most widely used spatial interpolators were inverse distance weighted interpolation and ordinary kriging; and the most often used multivariate statistical analysis techniques were principal component analysis and cluster analysis. The review also identified several determining and correlating factors in heavy metal distribution in soils, including soil type, soil pH, soil organic matter, land use type, Fe, Al, and heavy metal concentrations. The major natural and anthropogenic sources of heavy metals were found to derive from lithogenic origin, roadway and transportation, atmospheric deposition, wastewater and runoff from industrial and mining facilities, fertilizer application, livestock manure, and sewage sludge. This review argues that the full potential of integrated GIS and multivariate statistical analysis for assessing heavy metal distribution in soils on a regional scale has not yet been fully realized. It is proposed that future research be conducted to map multivariate results in GIS to pinpoint specific anthropogenic sources, to analyze temporal trends in addition to spatial patterns, to optimize modeling parameters, and to expand the use of different multivariate analysis tools beyond principal component analysis (PCA) and cluster analysis (CA). Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. A method to deconvolve stellar rotational velocities II. The probability distribution function via Tikhonov regularization

    NASA Astrophysics Data System (ADS)

    Christen, Alejandra; Escarate, Pedro; Curé, Michel; Rial, Diego F.; Cassetti, Julia

    2016-10-01

    Aims: Knowing the distribution of stellar rotational velocities is essential for understanding stellar evolution. Because we measure the projected rotational speed v sin I, we need to solve an ill-posed problem given by a Fredholm integral of the first kind to recover the "true" rotational velocity distribution. Methods: After discretization of the Fredholm integral we apply the Tikhonov regularization method to obtain directly the probability distribution function for stellar rotational velocities. We propose a simple and straightforward procedure to determine the Tikhonov parameter. We applied Monte Carlo simulations to prove that the Tikhonov method is a consistent estimator and asymptotically unbiased. Results: This method is applied to a sample of cluster stars. We obtain confidence intervals using a bootstrap method. Our results are in close agreement with those obtained using the Lucy method for recovering the probability density distribution of rotational velocities. Furthermore, Lucy estimation lies inside our confidence interval. Conclusions: Tikhonov regularization is a highly robust method that deconvolves the rotational velocity probability density function from a sample of v sin I data directly without the need for any convergence criteria.
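
    After discretization the problem is a linear system Ax ≈ b, and the Tikhonov estimate has the closed form x(λ) = (AᵀA + λI)⁻¹Aᵀb. A generic sketch (the kernel, noise level, and regularization parameter below are chosen only for illustration, not by the paper's parameter-selection procedure):

    ```python
    import numpy as np

    rng = np.random.default_rng(9)
    n = 100
    s = np.linspace(0.0, 1.0, n)

    # Discretized Fredholm integral of the first kind: b = A @ x_true + noise.
    A = np.exp(-((s[:, None] - s[None, :]) ** 2) / 0.005) / n   # illustrative smoothing kernel
    x_true = np.exp(-((s - 0.4) ** 2) / 0.01)                   # "true" density to recover
    b = A @ x_true + rng.normal(0.0, 1e-4, n)

    lam = 1e-6                                                   # Tikhonov parameter (illustrative)
    x_hat = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)  # regularized least-squares solution
    print(f"reconstruction RMS error: {np.sqrt(np.mean((x_hat - x_true) ** 2)):.3f}")
    ```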

  2. Bayesian transformation cure frailty models with multivariate failure time data.

    PubMed

    Yin, Guosheng

    2008-12-10

    We propose a class of transformation cure frailty models to accommodate a survival fraction in multivariate failure time data. Established through a general power transformation, this family of cure frailty models includes the proportional hazards and the proportional odds modeling structures as two special cases. Within the Bayesian paradigm, we obtain the joint posterior distribution and the corresponding full conditional distributions of the model parameters for the implementation of Gibbs sampling. Model selection is based on the conditional predictive ordinate statistic and deviance information criterion. As an illustration, we apply the proposed method to a real data set from dentistry.

  3. Investigating future climate change impacts on drought patterns over the Euro-Mediterranean area based on a probabilistic approach

    NASA Astrophysics Data System (ADS)

    Bonaccorso, Brunella; Peres, David Johnny; Cancelliere, Antonino

    2017-04-01

    As extensively documented by the IPCC assessment reports, impacts from recent climate-related extremes, such as heat waves, droughts, floods, cyclones and wildfires, reveal significant vulnerability of many environmental and anthropic systems to climate change. Compared to other extreme weather events, droughts evolve slowly in time. Based on this feature, effective drought preparedness and mitigation strategies could be implemented by decision makers, if appropriate tools, able to anticipate drought evolution in time and space, were available. Climate models' projections combined with probabilistic tools for drought characterization could help in understanding the time evolution of drought hazard in the future. Within the delineated context, the aim of the present study is to investigate potential scenarios of space-time variability of drought occurrences over Europe, by comparing the return periods of design drought events for different future time intervals. More specifically, annual precipitation data from Regional Climate Models (RCMs) of the Med-CORDEX initiative, covering the Euro-Mediterranean area (Northern Africa and Southern and Central Europe) at a grid resolution of about 50 km, are used to assess drought characteristics for three future periods (i.e. 2011-2040, 2041-2070 and 2071-2100), and compared to those in the baseline period (1971-2000). Specifically, three precipitation RCM datasets - produced by the CMMC (Euro-Mediterranean Center on Climate Change, IT), the LMD (Laboratoire de Météorologie Dynamique, FR) and the GUF (Goethe University Frankfurt, DE) - for two Representative Concentration Pathways, RCP4.5 (intermediate) and RCP8.5 (high emissions), are considered for multi-year drought identification and characterization. First, the goodness of fit of several probability distributions to the considered precipitation gridded dataset is examined cell by cell by the Lilliefors test, and the best distribution is chosen for each cell based on the lowest value of the test statistic. Then, the marginal and multivariate probability distributions of drought characteristics (duration and accumulated deficit) are derived as functions of the parameters of the probability distribution of precipitation and the threshold level selected to identify droughts as negative runs. Finally, the return periods of design drought events are computed as the expected value of the interarrival time between consecutive critical droughts, and the possible spatial patterns are investigated. In general, results confirm an increasing occurrence of severe drought episodes in several regions of the investigated area in the future, although some discordances arise with respect to the different projections over the considered future periods. Apparently, Central Eastern regions of the Mediterranean are likely to become more drought prone, as low values of return periods are obtained.

  4. "SABER": A new software tool for radiotherapy treatment plan evaluation.

    PubMed

    Zhao, Bo; Joiner, Michael C; Orton, Colin G; Burmeister, Jay

    2010-11-01

    Both spatial and biological information are necessary in order to perform true optimization of a treatment plan and for predicting clinical outcome. The goal of this work is to develop an enhanced treatment plan evaluation tool which incorporates biological parameters and retains spatial dose information. A software system is developed which provides biological plan evaluation with a novel combination of features. It incorporates hyper-radiosensitivity using the induced-repair model and applies the new concept of dose convolution filter (DCF) to simulate dose wash-out effects due to cell migration, bystander effect, and/or tissue motion during treatment. Further, the concept of spatial DVH (sDVH) is introduced to evaluate and potentially optimize the spatial dose distribution in the target volume. Finally, generalized equivalent uniform dose is derived from both the physical dose distribution (gEUD) and the distribution of equivalent dose in 2 Gy fractions (gEUD2) and the software provides three separate models for calculation of tumor control probability (TCP), normal tissue complication probability (NTCP), and probability of uncomplicated tumor control (P+). TCP, NTCP, and P+ are provided as a function of prescribed dose and multivariable TCP, NTCP, and P+ plots are provided to illustrate the dependence on individual parameters used to calculate these quantities. Ten plans from two clinical treatment sites are selected to test the three calculation models provided by this software. By retaining both spatial and biological information about the dose distribution, the software is able to distinguish features of radiotherapy treatment plans not discernible using commercial systems. Plans that have similar DVHs may have different spatial and biological characteristics and the application of novel tools such as sDVH and DCF within the software may substantially change the apparent plan quality or predicted plan metrics such as TCP and NTCP. For the cases examined, both the calculation method and the application of DCF can change the ranking order of competing plans. The voxel-by-voxel TCP model makes it feasible to incorporate spatial variations of clonogen densities (n), radiosensitivities (SF2), and fractionation sensitivities (alpha/beta) as those data become available. The new software incorporates both spatial and biological information into the treatment planning process. The application of multiple methods for the incorporation of biological and spatial information has demonstrated that the order of application of biological models can change the order of plan ranking. Thus, the results of plan evaluation and optimization are dependent not only on the models used but also on the order in which they are applied. This software can help the planner choose more biologically optimal treatment plans and potentially predict treatment outcome more accurately.
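
    Among the quantities listed, the generalized equivalent uniform dose has a compact standard definition over equal-volume voxels, gEUD = (Σᵢ vᵢ dᵢᵃ)^(1/a), where vᵢ are fractional volumes and a is a tissue-specific parameter. A minimal sketch (hypothetical voxel doses, not SABER output):

    ```python
    import numpy as np

    def geud(dose: np.ndarray, a: float) -> float:
        """gEUD over equal-volume voxels: (mean(d_i ** a)) ** (1 / a).
        a = 1 gives the mean dose; large positive a approaches the maximum dose
        (serial organs); large negative a approaches the minimum dose (tumors)."""
        return float(np.mean(dose ** a) ** (1.0 / a))

    dose = np.array([60.0, 58.0, 61.0, 40.0, 59.5])   # hypothetical voxel doses (Gy)
    for a in (-10.0, 1.0, 10.0):
        print(f"a = {a:+.0f}: gEUD = {geud(dose, a):.2f} Gy")
    ```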

  5. Quantitative assessment of building fire risk to life safety.

    PubMed

    Guanquan, Chu; Jinhua, Sun

    2008-06-01

    This article presents a quantitative risk assessment framework for evaluating fire risk to life safety. Fire risk is divided into two parts: probability and corresponding consequence of every fire scenario. The time-dependent event tree technique is used to analyze probable fire scenarios based on the effect of fire protection systems on fire spread and smoke movement. To obtain the variation of occurrence probability with time, Markov chain is combined with a time-dependent event tree for stochastic analysis on the occurrence probability of fire scenarios. To obtain consequences of every fire scenario, some uncertainties are considered in the risk analysis process. When calculating the onset time to untenable conditions, a range of fires are designed based on different fire growth rates, after which uncertainty of onset time to untenable conditions can be characterized by probability distribution. When calculating occupant evacuation time, occupant premovement time is considered as a probability distribution. Consequences of a fire scenario can be evaluated according to probability distribution of evacuation time and onset time of untenable conditions. Then, fire risk to life safety can be evaluated based on occurrence probability and consequences of every fire scenario. To express the risk assessment method in detail, a commercial building is presented as a case study. A discussion compares the assessment result of the case study with fire statistics.

  6. Competing risk models in reliability systems, an exponential distribution model with Bayesian analysis approach

    NASA Astrophysics Data System (ADS)

    Iskandar, I.

    2018-03-01

    The exponential distribution is the most widely used distribution in reliability analysis. It is very suitable for representing lifetimes in many applications and has a simple statistical form, its defining characteristic being a constant hazard rate. The exponential distribution is the special case of the Weibull distributions with shape parameter equal to one. In this paper we introduce the basic notions that constitute an exponential competing risks model in reliability analysis using a Bayesian approach and present the corresponding analytic methods. The cases are limited to models with independent causes of failure, and a non-informative prior distribution is used in our analysis. We describe the likelihood function, followed by the posterior distribution and the point, interval, hazard function, and reliability estimates. The net probability of failure if only one specific risk is present, the crude probability of failure due to a specific risk in the presence of other causes, and partial crude probabilities are also included.
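
    For independent exponential causes with rates λⱼ, these quantities have closed forms: overall reliability exp(-Λt) with Λ = Σⱼ λⱼ, net probability of failure from cause j alone 1 - exp(-λⱼt), and crude probability that cause j strikes first by time t equal to (λⱼ/Λ)(1 - exp(-Λt)). A sketch with illustrative rates (not taken from the paper):

    ```python
    import numpy as np

    lam = np.array([0.010, 0.030, 0.005])   # illustrative failure rates per hour, one per cause
    t = 50.0                                # mission time (hours)
    Lam = lam.sum()

    reliability = np.exp(-Lam * t)                    # probability no cause has struck by t
    net = 1.0 - np.exp(-lam * t)                      # failure prob. if only that risk acted
    crude = (lam / Lam) * (1.0 - np.exp(-Lam * t))    # prob. cause j strikes first, by time t
    print("R(t) =", reliability.round(4))
    print("net probabilities:  ", net.round(4))
    print("crude probabilities:", crude.round(4))
    ```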

  7. Multivariate Distributions in Reliability Theory and Life Testing.

    DTIC Science & Technology

    1981-04-01

    Downton Distribution: This distribution is a special case of a classical bivariate gamma distribution due to Wicksell and to Kibble. See Krishnaiah and ... Krishnamoorthy and Parthasarathy (1951) (see also Krishnaiah and Rao (1961) and Krishnaiah (1977)) and also within the framework of the Arnold classes. A ... for these distributions and their properties is Johnson and Kotz (1972). Krishnaiah (1977) has specifically discussed multivariate gamma

  8. Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol lowering drugs

    PubMed Central

    Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin

    2013-01-01

    In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436

  9. Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol-lowering drugs.

    PubMed

    Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G; Shah, Arvind K; Lin, Jianxin

    2013-10-15

    In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the deviance information criterion is used to select the best transformation model. Because the model is quite complex, we develop a novel Monte Carlo Markov chain sampling scheme to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol-lowering drugs where the goal is to jointly model the three-dimensional response consisting of low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), and triglycerides (TG) (LDL-C, HDL-C, TG). Because the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately; however, a multivariate approach would be more appropriate because these variables are correlated with each other. We carry out a detailed analysis of these data by using the proposed methodology. Copyright © 2013 John Wiley & Sons, Ltd.
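
    The univariate building block here is the Box-Cox transform, y(λ) = (y^λ - 1)/λ for λ ≠ 0 and log y at λ = 0. SciPy can apply it and estimate λ by maximum likelihood; a sketch on synthetic skewed, triglyceride-like values (not the trial data analyzed by the authors):

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(10)
    tg = rng.lognormal(mean=5.0, sigma=0.5, size=1_000)   # synthetic, TG-like skewed values

    transformed, lam_hat = stats.boxcox(tg)               # lambda chosen by maximum likelihood
    print(f"lambda = {lam_hat:.3f}")
    print(f"skewness before: {stats.skew(tg):.2f}, after: {stats.skew(transformed):.2f}")
    ```

    The paper's extension is to give each component of the multivariate response its own λ and to place priors on them within the MCMC scheme.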

  10. On probability-possibility transformations

    NASA Technical Reports Server (NTRS)

    Klir, George J.; Parviz, Behzad

    1992-01-01

    Several probability-possibility transformations are compared in terms of the closeness of preserving second-order properties. The comparison is based on experimental results obtained by computer simulation. Two second-order properties are involved in this study: noninteraction of two distributions and projections of a joint distribution.

  11. Theoretical size distribution of fossil taxa: analysis of a null model.

    PubMed

    Reed, William J; Hughes, Barry D

    2007-03-22

    This article deals with the theoretical size distribution (of number of sub-taxa) of a fossil taxon arising from a simple null model of macroevolution. New species arise through speciations occurring independently and at random at a fixed probability rate, while extinctions either occur independently and at random (background extinctions) or cataclysmically. In addition new genera are assumed to arise through speciations of a very radical nature, again assumed to occur independently and at random at a fixed probability rate. The size distributions of the pioneering genus (following a cataclysm) and of derived genera are determined. Also the distribution of the number of genera is considered along with a comparison of the probability of a monospecific genus with that of a monogeneric family.

  12. Newton/Poisson-Distribution Program

    NASA Technical Reports Server (NTRS)

    Bowerman, Paul N.; Scheuer, Ernest M.

    1990-01-01

    NEWTPOIS is one of two computer programs that make calculations involving cumulative Poisson distributions; NEWTPOIS (NPO-17715) and CUMPOIS (NPO-17714) can be used independently of one another. NEWTPOIS determines the Poisson parameter for a given cumulative probability, from which one obtains percentiles for gamma distributions with integer shape parameters and percentiles for χ² distributions with even degrees of freedom. It is used by statisticians and others concerned with probabilities of independent events occurring over specific units of time, area, or volume. The program is written in C.
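
    The core computation can be sketched in a few lines: Newton's method applied to the Poisson CDF, using the fact that the derivative of the CDF with respect to the mean is minus the PMF. This is a hedged reconstruction of the idea, not the original C source.

    ```python
    from scipy import stats

    def poisson_parameter(n: int, p: float, tol: float = 1e-12) -> float:
        """Solve stats.poisson.cdf(n, mu) == p for mu by Newton's method."""
        mu = n + 1.0  # a crude start near the answer keeps Newton stable
        for _ in range(100):
            f = stats.poisson.cdf(n, mu) - p
            mu += f / stats.poisson.pmf(n, mu)  # dF/dmu = -pmf(n, mu)
            if abs(f) < tol:
                break
        return mu

    # The result doubles as a gamma percentile: if P(X <= n; mu) = p, then
    # mu is the (1 - p) quantile of a Gamma(n + 1) distribution.
    print(poisson_parameter(n=5, p=0.9), stats.gamma.ppf(0.1, a=6))
    ```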

  13. Development of multivariate NTCP models for radiation-induced hypothyroidism: a comparative analysis.

    PubMed

    Cella, Laura; Liuzzi, Raffaele; Conson, Manuel; D'Avino, Vittoria; Salvatore, Marco; Pacelli, Roberto

    2012-12-27

    Hypothyroidism is a frequent late side effect of radiation therapy of the cervical region. The purpose of this work is to develop multivariate normal tissue complication probability (NTCP) models for radiation-induced hypothyroidism (RHT) and to compare them with existing NTCP models for RHT. Fifty-three patients treated with sequential chemo-radiotherapy for Hodgkin's lymphoma (HL) were retrospectively reviewed for RHT events. Clinical information along with thyroid gland dose distribution parameters were collected and their correlation to RHT was analyzed by Spearman's rank correlation coefficient (Rs). Multivariate logistic regression with resampling (bootstrapping) was applied to select model order and parameters for NTCP modeling. Model performance was evaluated through the area under the receiver operating characteristic curve (AUC). Models were tested against external published data on RHT and compared with other published NTCP models. When the thyroid volume exceeding X Gy is expressed as a percentage (Vx(%)), a two-variable NTCP model including V30(%) and gender proved to be the optimal predictive model for RHT (Rs = 0.615, p < 0.001; AUC = 0.87). Conversely, when the absolute thyroid volume exceeding X Gy (Vx(cc)) was analyzed, an NTCP model based on three variables including V30(cc), thyroid gland volume and gender was selected as the most predictive model (Rs = 0.630, p < 0.001; AUC = 0.85). The three-variable model performs better when tested on an external cohort characterized by large inter-individual variation in thyroid volumes (AUC = 0.914, 95% CI 0.760-0.984). A comparable performance was found between our model and that proposed in the literature based on thyroid gland mean dose and volume (p = 0.264). The absolute volume of thyroid gland exceeding 30 Gy in combination with thyroid gland volume and gender provides an NTCP model for RHT with improved prediction capability not only within our patient population but also in an external cohort.
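
    The modelling recipe (a logistic NTCP model selected with bootstrap resampling and judged by AUC) can be illustrated on synthetic data. The sketch below assumes hypothetical V30(%) and gender predictors and invented coefficients; it is not the authors' fitted model.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(1)
    n = 53
    v30 = rng.uniform(0, 100, n)                  # % thyroid volume above 30 Gy
    female = rng.integers(0, 2, n)                # gender indicator
    logit = -4.0 + 0.04 * v30 + 1.2 * female      # assumed "true" model
    y = rng.random(n) < 1 / (1 + np.exp(-logit))  # RHT events

    X = np.column_stack([v30, female])
    aucs = []
    for _ in range(200):                          # bootstrap the AUC
        idx = rng.integers(0, n, n)
        oob = np.setdiff1d(np.arange(n), idx)     # out-of-bag validation set
        if len(np.unique(y[idx])) < 2 or len(np.unique(y[oob])) < 2:
            continue
        model = LogisticRegression().fit(X[idx], y[idx])
        aucs.append(roc_auc_score(y[oob], model.predict_proba(X[oob])[:, 1]))
    print(f"out-of-bag AUC: {np.mean(aucs):.2f} +/- {np.std(aucs):.2f}")
    ```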

  14. Probability theory for 3-layer remote sensing radiative transfer model: univariate case.

    PubMed

    Ben-David, Avishai; Davidson, Charles E

    2012-04-23

    A probability model for a 3-layer radiative transfer model (foreground layer, cloud layer, background layer, and an external source at the end of the line of sight) has been developed. The 3-layer model is fundamentally important as the primary physical model in passive infrared remote sensing. The probability model is described by the Johnson family of distributions that are used as a fit for theoretically computed moments of the radiative transfer model. From the Johnson family we use the SU distribution, which can address a wide range of skewness and kurtosis values (in addition to addressing the first two moments, mean and variance). In the limit, SU can also describe lognormal and normal distributions. With the probability model one can evaluate the potential for detecting a target (vapor cloud layer), the probability of observing thermal contrast, and performance (receiver operating characteristic curves) in clutter-noise-limited scenarios. This is (to our knowledge) the first probability model for the 3-layer remote sensing geometry that treats all parameters as random variables and includes higher-order statistics. © 2012 Optical Society of America
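
    The fitting idea can be sketched with scipy's johnsonsu, here fitted by maximum likelihood to skewed synthetic data rather than to the paper's analytically computed radiative-transfer moments; the four fitted parameters reproduce mean, variance, skewness, and kurtosis reasonably well.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    radiance = rng.gamma(shape=2.0, scale=1.5, size=10_000)  # skewed stand-in

    a, b, loc, scale = stats.johnsonsu.fit(radiance)
    fitted = stats.johnsonsu(a, b, loc=loc, scale=scale)

    for label, sample_stat, model_stat in [
        ("mean", radiance.mean(), fitted.mean()),
        ("variance", radiance.var(), fitted.var()),
        ("skewness", stats.skew(radiance), fitted.stats(moments="s")),
        ("excess kurtosis", stats.kurtosis(radiance), fitted.stats(moments="k")),
    ]:
        print(f"{label}: sample={sample_stat:.3f}, SU fit={float(model_stat):.3f}")
    ```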

  15. Sampling probability distributions of lesions in mammograms

    NASA Astrophysics Data System (ADS)

    Looney, P.; Warren, L. M.; Dance, D. R.; Young, K. C.

    2015-03-01

    One approach to image perception studies in mammography using virtual clinical trials involves the insertion of simulated lesions into normal mammograms. To facilitate this, a method has been developed that allows for sampling of lesion positions across the cranio-caudal and medio-lateral radiographic projections in accordance with measured distributions of real lesion locations. 6825 mammograms from our mammography image database were segmented to find the breast outline. The outlines were averaged and smoothed to produce an average outline for each laterality and radiographic projection. Lesions in 3304 mammograms with malignant findings were mapped on to a standardised breast image corresponding to the average breast outline using piecewise affine transforms. A four-dimensional probability distribution function was found from the lesion locations in the cranio-caudal and medio-lateral radiographic projections for calcification and noncalcification lesions. Lesion locations sampled from this probability distribution function were mapped on to individual mammograms using a piecewise affine transform which transforms the average outline to the outline of the breast in the mammogram. The four-dimensional probability distribution function was validated by comparing it to the two-dimensional distributions found by considering each radiographic projection and laterality independently. The correlation of the locations of the lesions sampled from the four-dimensional probability distribution function across radiographic projections was shown to match the correlation of the original mapped lesion locations. The current system has been implemented as a web service using the Python Django framework; the server performs the sampling and mapping and returns the results in JavaScript Object Notation (JSON) format.
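
    The sampling step can be sketched as draws from an empirical multidimensional histogram. The coordinates below are synthetic stand-ins for the mapped lesion locations (two correlated projections, two coordinates each).

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    cc = rng.normal(0.5, 0.12, (3304, 2))        # (x, y) in one projection
    ml = cc + rng.normal(0.0, 0.05, (3304, 2))   # correlated second projection
    coords = np.hstack([cc, ml])                 # shape (n, 4)

    hist, edges = np.histogramdd(coords, bins=12)
    pdf = hist.ravel() / hist.sum()              # flattened 4-D PDF

    flat = rng.choice(pdf.size, size=1000, p=pdf)   # sample occupied bins
    idx = np.unravel_index(flat, hist.shape)        # back to 4-D bin indices
    samples = np.column_stack([                     # uniform jitter within bins
        e[i] + rng.random(1000) * (e[i + 1] - e[i]) for e, i in zip(edges, idx)
    ])
    print(samples.shape)                         # (1000, 4) sampled locations
    ```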

  16. Multivariate quadrature for representing cloud condensation nuclei activity of aerosol populations

    DOE PAGES

    Fierce, Laura; McGraw, Robert L.

    2017-07-26

    Sparse representations of atmospheric aerosols are needed for efficient regional- and global-scale chemical transport models. Here we introduce a new framework for representing aerosol distributions, based on the quadrature method of moments. Given a set of moment constraints, we show how linear programming, combined with an entropy-inspired cost function, can be used to construct optimized quadrature representations of aerosol distributions. The sparse representations derived from this approach accurately reproduce cloud condensation nuclei (CCN) activity for realistically complex distributions simulated by a particle-resolved model. Additionally, the linear programming techniques described in this study can be used to bound key aerosol properties, such as the number concentration of CCN. Unlike commonly used sparse representations, such as modal and sectional schemes, the maximum-entropy approach described here is not constrained to pre-determined size bins or assumed distribution shapes. This study is a first step toward a particle-based aerosol scheme that will track multivariate aerosol distributions with sufficient computational efficiency for large-scale simulations.
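
    The optimization at the heart of the method can be sketched as a linear program: choose nonnegative weights on candidate abscissas so that the leading moments are reproduced exactly, with the solver's vertex solution supplying sparsity. The sketch below uses a placeholder linear cost in place of the paper's entropy-inspired cost, and all distribution parameters are invented.

    ```python
    import numpy as np
    from scipy.optimize import linprog
    from scipy.stats import lognorm

    diam = np.linspace(0.01, 1.0, 200)       # candidate abscissas (microns)
    pdf = lognorm.pdf(diam, s=0.5, scale=0.1)
    pdf /= pdf.sum()                         # discrete "true" distribution

    orders = np.arange(6)
    V = diam[None, :] ** orders[:, None]     # moment operator (Vandermonde)
    m = V @ pdf                              # target moments

    res = linprog(c=np.ones_like(diam),      # placeholder cost
                  A_eq=V, b_eq=m, bounds=(0, None), method="highs")
    w = res.x
    print("support size:", np.count_nonzero(w > 1e-12))  # sparse (~6 points)
    print("max moment error:", np.max(np.abs(V @ w - m)))
    ```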

  17. Multivariate quadrature for representing cloud condensation nuclei activity of aerosol populations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fierce, Laura; McGraw, Robert L.

    Sparse representations of atmospheric aerosols are needed for efficient regional- and global-scale chemical transport models. Here we introduce a new framework for representing aerosol distributions, based on the quadrature method of moments. Given a set of moment constraints, we show how linear programming, combined with an entropy-inspired cost function, can be used to construct optimized quadrature representations of aerosol distributions. The sparse representations derived from this approach accurately reproduce cloud condensation nuclei (CCN) activity for realistically complex distributions simulated by a particle-resolved model. Additionally, the linear programming techniques described in this study can be used to bound key aerosol properties, such as the number concentration of CCN. Unlike commonly used sparse representations, such as modal and sectional schemes, the maximum-entropy approach described here is not constrained to pre-determined size bins or assumed distribution shapes. This study is a first step toward a particle-based aerosol scheme that will track multivariate aerosol distributions with sufficient computational efficiency for large-scale simulations.

  18. How to model a negligible probability under the WTO sanitary and phytosanitary agreement?

    PubMed

    Powell, Mark R

    2013-06-01

    Since the 1997 EC–Hormones decision, World Trade Organization (WTO) Dispute Settlement Panels have wrestled with the question of what constitutes a negligible risk under the Sanitary and Phytosanitary Agreement. More recently, the 2010 WTO Australia–Apples Panel focused considerable attention on the appropriate quantitative model for a negligible probability in a risk assessment. The 2006 Australian Import Risk Analysis for Apples from New Zealand translated narrative probability statements into quantitative ranges. The uncertainty about a "negligible" probability was characterized as a uniform distribution with a minimum value of zero and a maximum value of 10⁻⁶. The Australia–Apples Panel found that the use of this distribution would tend to overestimate the likelihood of "negligible" events and indicated that a triangular distribution with a most probable value of zero and a maximum value of 10⁻⁶ would correct the bias. The Panel observed that the midpoint of the uniform distribution is 5 × 10⁻⁷ but did not consider that the triangular distribution has an expected value of 3.3 × 10⁻⁷. Therefore, if this triangular distribution is the appropriate correction, the magnitude of the bias found by the Panel appears modest. The Panel's detailed critique of the Australian risk assessment, and the conclusions of the WTO Appellate Body about the materiality of flaws found by the Panel, may have important implications for the standard of review for risk assessments under the WTO SPS Agreement. © 2012 Society for Risk Analysis.
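
    The Panel's arithmetic is easy to verify by simulation under the stated assumptions:

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    uniform = rng.uniform(0.0, 1e-6, 1_000_000)
    triangular = rng.triangular(0.0, 0.0, 1e-6, 1_000_000)  # mode at zero
    print(f"uniform mean:    {uniform.mean():.2e}")     # ~5.0e-07
    print(f"triangular mean: {triangular.mean():.2e}")  # ~3.3e-07
    ```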

  19. Fourier Method for Calculating Fission Chain Neutron Multiplicity Distributions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chambers, David H.; Chandrasekaran, Hema; Walston, Sean E.

    Here, a new way of utilizing the fast Fourier transform is developed to compute the probability distribution for a fission chain to create n neutrons. We then extend this technique to compute the probability distributions for detecting n neutrons. Lastly, our technique can be used for fission chains initiated by either a single neutron inducing a fission or by the spontaneous fission of another isotope.

  20. Fourier Method for Calculating Fission Chain Neutron Multiplicity Distributions

    DOE PAGES

    Chambers, David H.; Chandrasekaran, Hema; Walston, Sean E.

    2017-03-27

    Here, a new way of utilizing the fast Fourier transform is developed to compute the probability distribution for a fission chain to create n neutrons. We then extend this technique to compute the probability distributions for detecting n neutrons. Lastly, our technique can be used for fission chains initiated by either a single neutron inducing a fission or by the spontaneous fission of another isotope.
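
    A hedged illustration of the FFT idea, not the authors' full algorithm: the total neutron count from a fixed number of independent fissions is a convolution power of the per-fission multiplicity distribution, which can be computed by raising the distribution's DFT to that power. The multiplicity values below are hypothetical.

    ```python
    import numpy as np

    nu = np.array([0.03, 0.16, 0.32, 0.30, 0.14, 0.05])  # hypothetical P(nu = n)
    k = 10                                 # number of independent fissions
    N = 64                                 # FFT length > k * (len(nu) - 1)

    probs = np.fft.ifft(np.fft.fft(nu, N) ** k).real     # k-fold convolution
    probs[probs < 1e-15] = 0.0             # clean up round-off
    print(probs.sum())                     # ~1.0
    print(probs.argmax(), "is the most probable total neutron count")
    ```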

  1. Teaching Uncertainties

    ERIC Educational Resources Information Center

    Duerdoth, Ian

    2009-01-01

    The subject of uncertainties (sometimes called errors) is traditionally taught (to first-year science undergraduates) towards the end of a course on statistics that defines probability as the limit of many trials, and discusses probability distribution functions and the Gaussian distribution. We show how to introduce students to the concepts of…

  2. Robust approaches to quantification of margin and uncertainty for sparse data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hund, Lauren; Schroeder, Benjamin B.; Rumsey, Kelin

    Characterizing the tails of probability distributions plays a key role in quantification of margins and uncertainties (QMU), where the goal is characterization of low probability, high consequence events based on continuous measures of performance. When data are collected using physical experimentation, probability distributions are typically fit using statistical methods based on the collected data, and these parametric distributional assumptions are often used to extrapolate about the extreme tail behavior of the underlying probability distribution. In this project, we characterize the risk associated with such tail extrapolation. Specifically, we conducted a scaling study to demonstrate the large magnitude of the risk; then, we developed new methods for communicating risk associated with tail extrapolation from unvalidated statistical models; lastly, we proposed a Bayesian data-integration framework to mitigate tail extrapolation risk through integrating additional information. We conclude that decision-making using QMU is a complex process that cannot be achieved using statistical analyses alone.

  3. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Hang, E-mail: hangchen@mit.edu; Thill, Peter; Cao, Jianshu

    In biochemical systems, intrinsic noise may drive the system to switch from one stable state to another. We investigate how kinetic switching between stable states in a bistable network is influenced by dynamic disorder, i.e., fluctuations in the rate coefficients. Using the geometric minimum action method, we first investigate the optimal transition paths and the corresponding minimum actions based on a genetic toggle switch model in which reaction coefficients draw from a discrete probability distribution. For the continuous probability distribution of the rate coefficient, we then consider two models of dynamic disorder in which reaction coefficients undergo different stochastic processes with the same stationary distribution. In one, the kinetic parameters follow a discrete Markov process and in the other they follow continuous Langevin dynamics. We find that regulation of the parameters modulating the dynamic disorder, as has been demonstrated to occur through allosteric control in bistable networks in the immune system, can be crucial in shaping the statistics of optimal transition paths, transition probabilities, and the stationary probability distribution of the network.

  4. Detecting background changes in environments with dynamic foreground by separating probability distribution function mixtures using Pearson's method of moments

    NASA Astrophysics Data System (ADS)

    Jenkins, Colleen; Jordan, Jay; Carlson, Jeff

    2007-02-01

    This paper presents parameter estimation techniques useful for detecting background changes in a video sequence with extreme foreground activity. A specific application of interest is automated detection of the covert placement of threats (e.g., a briefcase bomb) inside crowded public facilities. We propose that a histogram of pixel intensity acquired from a fixed-mounted camera over time for a series of images will be a mixture of two Gaussian functions: the foreground and background probability distribution functions. We use Pearson's method of moments to separate the two probability distribution functions. The background function can then be "remembered" and changes in the background can be detected. Subsequent comparisons of background estimates are used to detect changes. Changes are flagged to alert security forces to the presence and location of potential threats. Results are presented that indicate the significant potential for robust parameter estimation techniques as applied to video surveillance.
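
    The separation step can be sketched with expectation-maximization (sklearn's GaussianMixture) as an accessible stand-in for Pearson's method of moments; the intensities below are synthetic.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(5)
    background = rng.normal(60, 5, 8000)    # static scene intensities
    foreground = rng.normal(120, 25, 2000)  # transient crowd activity
    intensities = np.concatenate([background, foreground]).reshape(-1, 1)

    gmm = GaussianMixture(n_components=2, random_state=0).fit(intensities)
    for mu, var, w in zip(gmm.means_.ravel(), gmm.covariances_.ravel(),
                          gmm.weights_):
        print(f"component: mean={mu:.1f}, sd={np.sqrt(var):.1f}, weight={w:.2f}")
    # The narrower, heavier component is "remembered" as the background model;
    # a shift in its mean across time windows flags a background change.
    ```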

  5. Acute convexity subarachnoid haemorrhage and cortical superficial siderosis in probable cerebral amyloid angiopathy without lobar haemorrhage.

    PubMed

    Charidimou, Andreas; Boulouis, Grégoire; Fotiadis, Panagiotis; Xiong, Li; Ayres, Alison M; Schwab, Kristin M; Gurol, Mahmut Edip; Rosand, Jonathan; Greenberg, Steve M; Viswanathan, Anand

    2018-04-01

    Acute non-traumatic convexity subarachnoid haemorrhage (cSAH) is increasingly recognised in cerebral amyloid angiopathy (CAA). We investigated: (a) the overlap between acute cSAH and cortical superficial siderosis, a new CAA haemorrhagic imaging signature, and (b) whether acute cSAH presents with particular clinical symptoms in patients with probable CAA without lobar intracerebral haemorrhage. MRI scans of 130 consecutive patients meeting modified Boston criteria for probable CAA were analysed for cortical superficial siderosis (focal, ≤3 sulci; disseminated, ≥4 sulci), and key small vessel disease markers. We compared clinical, imaging and cortical superficial siderosis topographical mapping data between subjects with versus without acute cSAH, using multivariable logistic regression. We included 33 patients with probable CAA presenting with acute cSAH and 97 without cSAH at presentation. Patients with acute cSAH more commonly presented with transient focal neurological episodes (76% vs 34%; p<0.0001) compared with patients with CAA without cSAH. Patients with acute cSAH also more often presented clinically with transient focal neurological episodes compared with cortical superficial siderosis-positive, but cSAH-negative subjects with CAA (76% vs 30%; p<0.0001). Cortical superficial siderosis prevalence (but no other CAA severity markers) was higher among patients with cSAH versus those without, especially disseminated cortical superficial siderosis (49% vs 19%; p<0.0001). In multivariable logistic regression, cortical superficial siderosis burden (OR 5.53; 95% CI 2.82 to 10.8, p<0.0001) and transient focal neurological episodes (OR 11.7; 95% CI 2.70 to 50.6, p=0.001) were independently associated with acute cSAH. This probable CAA cohort provides additional evidence for distinct disease phenotypes, determined by the presence of cSAH and cortical superficial siderosis. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  6. Comparison of Different Risk Perception Measures in Predicting Seasonal Influenza Vaccination among Healthy Chinese Adults in Hong Kong: A Prospective Longitudinal Study

    PubMed Central

    Liao, Qiuyan; Wong, Wing Sze; Fielding, Richard

    2013-01-01

    Background Risk perception is a reported predictor of vaccination uptake, but which measures of risk perception best predict influenza vaccination uptake remains unclear. Methodology During the main influenza seasons (between January and March) of 2009 (Wave 1) and 2010 (Wave 2), 505 Chinese students and employees from a Hong Kong university completed an online survey. Multivariate logistic regression models were fitted to assess how well different risk perception measures in Wave 1 predicted vaccination uptake against seasonal influenza in Wave 2. Principal Findings The results of the multivariate logistic regression models showed that feeling at risk (β = 0.25, p = 0.021) was a better predictor than probability judgment, while probability judgment (β = 0.25, p = 0.029) was better than beliefs about risk in predicting subsequent influenza vaccination uptake. Beliefs about risk and feeling at risk seemed to predict the same aspect of subsequent vaccination uptake because their associations with vaccination uptake became insignificant when paired into the logistic regression model. Similarly, to compare the four scales for assessing probability judgment in predicting vaccination uptake, the 7-point verbal scale remained a significant and stronger predictor for vaccination uptake when paired with the other three scales; the 6-point verbal scale was a significant and stronger predictor when paired with the percentage scale or the 2-point verbal scale; and the percentage scale was a significant and stronger predictor only when paired with the 2-point verbal scale. Conclusions/Significance Beliefs about risk and feeling at risk are not well differentiated by Hong Kong Chinese people. Feeling at risk, an affective-cognitive dimension of risk perception, predicts subsequent vaccination uptake better than do probability judgments. Among the four scales for assessing risk probability judgment, the 7-point verbal scale offered the best predictive power for subsequent vaccination uptake. PMID:23894292

  7. Comparison of different risk perception measures in predicting seasonal influenza vaccination among healthy Chinese adults in Hong Kong: a prospective longitudinal study.

    PubMed

    Liao, Qiuyan; Wong, Wing Sze; Fielding, Richard

    2013-01-01

    Risk perception is a reported predictor of vaccination uptake, but which measures of risk perception best predict influenza vaccination uptake remains unclear. During the main influenza seasons (between January and March) of 2009 (Wave 1) and 2010 (Wave 2), 505 Chinese students and employees from a Hong Kong university completed an online survey. Multivariate logistic regression models were fitted to assess how well different risk perception measures in Wave 1 predicted vaccination uptake against seasonal influenza in Wave 2. The results of the multivariate logistic regression models showed that feeling at risk (β = 0.25, p = 0.021) was a better predictor than probability judgment, while probability judgment (β = 0.25, p = 0.029) was better than beliefs about risk in predicting subsequent influenza vaccination uptake. Beliefs about risk and feeling at risk seemed to predict the same aspect of subsequent vaccination uptake because their associations with vaccination uptake became insignificant when paired into the logistic regression model. Similarly, to compare the four scales for assessing probability judgment in predicting vaccination uptake, the 7-point verbal scale remained a significant and stronger predictor for vaccination uptake when paired with the other three scales; the 6-point verbal scale was a significant and stronger predictor when paired with the percentage scale or the 2-point verbal scale; and the percentage scale was a significant and stronger predictor only when paired with the 2-point verbal scale. Beliefs about risk and feeling at risk are not well differentiated by Hong Kong Chinese people. Feeling at risk, an affective-cognitive dimension of risk perception, predicts subsequent vaccination uptake better than do probability judgments. Among the four scales for assessing risk probability judgment, the 7-point verbal scale offered the best predictive power for subsequent vaccination uptake.

  8. Predicting poor outcome following hepatectomy: analysis of 2313 hepatectomies in the NSQIP database

    PubMed Central

    Aloia, Thomas A; Fahy, Bridget N; Fischer, Craig P; Jones, Stephen L; Duchini, Andrea; Galati, Joseph; Gaber, A Osama; Ghobrial, R Mark; Bass, Barbara L

    2009-01-01

    Background: For the past two decades multiple series have documented that liver resection has become safer. The purpose of this study was to determine the current status of hepatic resection in the USA by analysing the multi-institutional experience within the National Surgical Quality Improvement Program (NSQIP) dataset. Methods: Of the 363 897 cases in the 2005–2007 NSQIP Participant Use File, 2313 elective open hepatectomy cases were identified (1344 partial, 230 left, 510 right and 229 extended hepatectomies). A total of 57 perioperative risk factors and 28 postoperative complications were compared. To determine the applicability of NSQIP general risk models to hepatic surgery, the prognostic value of standard multivariate analysis was compared with the NSQIP general surgery aggregate risk indices (expected probability of morbidity [morbprob], expected probability of mortality [mortprob]). Results: The median age of patients listed in the database was 60 years; sex distributions were equivalent; 78% were White; 65% of patients had an ASA score of 3 or 4, and the most prevalent co-morbidity was hypertension (46%). A total of 41% of patients had disseminated cancer, 19% of whom had received chemotherapy within 30 days of surgery. The overall 30-day mortality rate was 2.5% (57/2313) and the 30-day major morbidity rate was 19.6% (453/2313). Multivariate analysis identified nine risk factors associated with major morbidity and two risk factors associated with mortality. In contrast, the morbprob and mortprob statistics did not predict outcomes accurately. For those patients who developed major morbidity, the median length of stay was longer (10 vs. 6 days; P = 0.001) and the mortality rate was higher (11.3% vs. 0.3%; P = 0.001). Conclusions: Analysis of the NSQIP experience with hepatectomy indicates that the current mortality and major morbidity rate benchmarks are 2.5% and 19.6%, respectively. Poor outcomes were associated with nutritional status, liver function and the extent of hepatectomy. The NSQIP general surgery morbprob and mortprob values were relatively poor predictors of post-hepatectomy observed morbidity, indicating the need for specialty-specific NSQIP modelling. PMID:19816616

  9. Multivariate Generalizations of Student's t-Distribution. ONR Technical Report. [Biometric Lab Report No. 90-3.

    ERIC Educational Resources Information Center

    Gibbons, Robert D.; And Others

    In the process of developing a conditionally-dependent item response theory (IRT) model, the problem arose of modeling an underlying multivariate normal (MVN) response process with general correlation among the items. Without the assumption of conditional independence, for which the underlying MVN cdf takes on comparatively simple forms and can be…

  10. Bias and Precision of Measures of Association for a Fixed-Effect Multivariate Analysis of Variance Model

    ERIC Educational Resources Information Center

    Kim, Soyoung; Olejnik, Stephen

    2005-01-01

    The sampling distributions of five popular measures of association with and without two bias adjusting methods were examined for the single factor fixed-effects multivariate analysis of variance model. The number of groups, sample sizes, number of outcomes, and the strength of association were manipulated. The results indicate that all five…

  11. Siblings' premarital childbearing and the timing of first sex in three major cities of Cote d'Ivoire.

    PubMed

    Diop-Sidibe, Nafissatou

    2005-06-01

    The association between youths' sexual and reproductive attitudes and behaviors and those of their peers and parents has been documented; however, information on siblings' influence is scarce, especially for developing countries. Data on 1,395 female and 1,242 male survey respondents aged 15-24 from three cities in Côte d'Ivoire were analyzed. Life-table analysis was conducted to examine respondents' probability of remaining sexually inexperienced according to siblings' history of premarital childbearing. Cox multivariate regressions were used to estimate respondents' relative risks of sexual debut by age 17 and by age 24. At any age between 15 and 24 years, the life-table probability of remaining sexually inexperienced was typically lower among persons who had at least one sibling with a premarital birth than among those who had no such sibling. In general, among those with at least one sibling who had had a premarital birth, the probability was lower if the sibling or siblings and the respondent were of the same gender rather than opposite genders, and the probability was lowest among those who had a brother and a sister with a history of premarital childbearing. In the multivariate analysis for males, having one or more brothers only, or having at least one brother and at least one sister, with a history of premarital childbearing was associated with increased relative risks of being sexually experienced by ages 17 and 24. No such association was found for females. Programs that seek to reduce premarital sexual activity among young people should develop strategies that take into account the potential influence of siblings.

  12. Prevalence of esophageal disorders in patients with recurrent chest pain.

    PubMed

    Manterola, C; Barroso, M S; Losada, H; Muñoz, S; Vial, M

    2004-01-01

    The objective of this study is to determine the prevalence of esophageal disorders (ED) associated with recurrent chest pain (RCP) and the utility of esophageal functional tests (EFT) in the study of these patients. The cross-sectional study was conducted at Hospital Clínico de La Frontera, Chile. One hundred and twenty-three patients with RCP were studied using esophageal manometry, edrophonium stimulation and 24-h pH monitoring. EFT performance was considered acceptable when the tests were capable of identifying an ED. To state the probability that RCP had an esophageal origin, patients were classified according to whether their pain had a probable, possible or unlikely esophageal origin. The prevalence of ED was determined according to diagnoses obtained after applying EFT, and a multivariate analysis was performed to examine the association between the esophageal origin of RCP and ED. Rates of correct diagnosis of 65.9%, 56.9% and 31.7% were verified for 24-h pH monitoring, esophageal manometry and edrophonium stimulation, respectively. In 38.2% of patients with RCP, the pain was probably of esophageal origin, in 42.3% there was a possible esophageal origin and in 19.5% an unlikely esophageal origin. A 44.7% prevalence of gastro-esophageal reflux disease (GERD), 26.8% of GERD with secondary esophageal motor dysfunction and 8.9% of pure esophageal motor dysfunction were verified. The multivariate analysis allowed us to verify the association between the probability of esophageal origin of RCP, the variables RCP duration, esophagitis and dysphagia coexistence (P = 0.037, P = 0.030 and P = 0.024, respectively), and a statistically significant association between ED and dysphagia coexistence (P = 0.028). A high prevalence of ED was identified in patients with RCP.

  13. Inorganic geochemistry of surface sediments of the Ebro shelf and slope, northwestern Mediterranean

    USGS Publications Warehouse

    Gardner, J.V.; Dean, W.E.; Alonso, B.

    1990-01-01

    Distributions of major, minor, and trace elements in surface sediment of the continental shelf and upper slope of the northeastern Spanish continental margin reflect the influences of discharge from the Ebro River and changes in eustatic sea levels. Multivariate factor analysis of sediment geochemistry was used to identify five groupings of samples (factors) on the shelf and slope. The first factor is an aluminosilicate factor that represents detrital clastic material. The second factor is a highly variable amount of excess SiO2 and probably represents a quartz residuum originating from winnowing of relict detrital sediments. A carbonate factor (Factor 3) has no positive correlation with other geochemical parameters but is associated with the sand-size fraction. The carbonate in these sediments consists of a mixture of biogenic calcite and angular to subangular detrital grains. Organic carbon is associated with the aluminosilicate factor (Factor 1) but also factors out by itself (Factor 4); this suggests that there may be two sources of organic matter, terrestrial and marine. The fifth factor comprises upper slope sediments that contain high concentrations of manganese. The most likely explanation for these high manganese concentrations is precipitation of Mn oxyhydroxides at the interface between Mn-rich, oxygen-deficient, intermediate waters and oxygenated surface waters. During eustatic low sea levels of the glacial Pleistocene, the Ebro Delta built across the outer continental shelf and deposited sediment with fairly high contents of organic carbon and continental components. The period of marine transgression from eustatic low (glacial) to eustatic high (interglacial) sea levels was characterized by erosion of the outer shelf delta and surficial shelf sediments and the transport of sediment across the slope within numerous canyons. Once eustatic high sea level was reached, delta progradation resumed on the inner shelf. Today, coarse-grained sediment (silt and sand) is transported to the continental shelf by the Ebro River and is distributed along the inner shelf by currents generated by dominant northeasterly winds. Clay-size material is deposited on the mid- and outer-shelf. However, erosion and delta progradation during the last glacial period, and fine-grained Holocene sedimentation, have probably produced a distribution of sediment on a diachronous surface. © 1990.

  14. Classification and assessment of retrieved electron density maps in coherent X-ray diffraction imaging using multivariate analysis.

    PubMed

    Sekiguchi, Yuki; Oroguchi, Tomotaka; Nakasako, Masayoshi

    2016-01-01

    Coherent X-ray diffraction imaging (CXDI) is one of the techniques used to visualize structures of non-crystalline particles of micrometer to submicrometer size from materials and biological science. In the structural analysis of CXDI, the electron density map of a sample particle can theoretically be reconstructed from a diffraction pattern by using phase-retrieval (PR) algorithms. However, in practice, the reconstruction is difficult because diffraction patterns are affected by Poisson noise and missing data in small-angle regions due to the beam stop and the saturation of detector pixels. In contrast to X-ray protein crystallography, in which the phases of diffracted waves are experimentally estimated, phase retrieval in CXDI relies entirely on the computational procedure driven by the PR algorithms. Thus, objective criteria and methods to assess the accuracy of retrieved electron density maps are necessary in addition to conventional parameters monitoring the convergence of PR calculations. Here, a data analysis scheme, named ASURA, is proposed that selects the most probable electron density maps from a set of maps retrieved from 1000 different random seeds for a diffraction pattern. Each electron density map composed of J pixels is expressed as a point in a J-dimensional space. Principal component analysis is applied to describe characteristics in the distribution of the maps in the J-dimensional space. When the distribution is characterized by a small number of principal components, the distribution is classified using the k-means clustering method. The classified maps are evaluated by several parameters to assess the quality of the maps. Using the proposed scheme, structural analysis of a diffraction pattern from a non-crystalline particle is conducted in two stages: estimation of the overall shape and determination of the fine structure inside the support shape. In each stage, the most accurate and probable density maps are objectively selected. The validity of the proposed scheme is examined by application to diffraction data that were obtained from an aggregate of metal particles and a biological specimen at the XFEL facility SACLA using custom-made diffraction apparatus.
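
    The selection pipeline (each map a point in R^J, PCA for a low-dimensional description, k-means to classify, keep the dominant cluster) can be sketched on synthetic maps:

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(6)
    J = 64 * 64
    true_map = rng.random(J)
    # 1000 retrievals: most converge near the true map, some to a wrong mode.
    good = true_map + 0.05 * rng.standard_normal((800, J))
    bad = rng.random(J) + 0.05 * rng.standard_normal((200, J))
    maps = np.vstack([good, bad])

    scores = PCA(n_components=5).fit_transform(maps)  # principal components
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scores)
    best = np.bincount(labels).argmax()               # most probable cluster
    consensus = maps[labels == best].mean(axis=0)
    print("cluster sizes:", np.bincount(labels))
    print("consensus error:", np.abs(consensus - true_map).mean())
    ```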

  15. Simulating Univariate and Multivariate Burr Type III and Type XII Distributions through the Method of L-Moments

    ERIC Educational Resources Information Center

    Pant, Mohan Dev

    2011-01-01

    The Burr families (Type III and Type XII) of distributions are traditionally used in the context of statistical modeling and for simulating non-normal distributions with moment-based parameters (e.g., skew and kurtosis). In educational and psychological studies, the Burr families of distributions can be used to simulate extremely asymmetrical and…

  16. p-adic stochastic hidden variable model

    NASA Astrophysics Data System (ADS)

    Khrennikov, Andrew

    1998-03-01

    We propose a stochastic hidden-variable model in which the hidden variables have a p-adic probability distribution ρ(λ) and, at the same time, the conditional probabilistic distributions P(U,λ), U=A,A',B,B', are ordinary probabilities defined on the basis of the Kolmogorov measure-theoretical axiomatics. A frequency definition of p-adic probability is quite similar to the ordinary frequency definition of probability: p-adic frequency probability is defined as the limit of relative frequencies ν_n, but in the p-adic metric. We study a model with p-adic stochastics on the level of the hidden-variable description. But, of course, responses of macroapparatuses have to be described by ordinary stochastics. Thus our model describes a mixture of p-adic stochastics of the microworld and ordinary stochastics of macroapparatuses. In this model probabilities for physical observables are the ordinary probabilities. At the same time Bell's inequality is violated.

  17. Stationary properties of maximum-entropy random walks.

    PubMed

    Dixit, Purushottam D

    2015-10-01

    Maximum-entropy (ME) inference of state probabilities using state-dependent constraints is popular in the study of complex systems. In stochastic systems, how state space topology and path-dependent constraints affect ME-inferred state probabilities remains unknown. To that end, we derive the transition probabilities and the stationary distribution of a maximum path entropy Markov process subject to state- and path-dependent constraints. A main finding is that the stationary distribution over states differs significantly from the Boltzmann distribution and reflects a competition between path multiplicity and imposed constraints. We illustrate our results with particle diffusion on a two-dimensional landscape. Connections with the path integral approach to diffusion are discussed.
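
    For the unconstrained case, the maximum-entropy transition probabilities over paths on a graph have a closed form in the dominant eigenpair of the adjacency matrix, P_ij = A_ij ψ_j / (λ ψ_i), with stationary law π_i ∝ ψ_i². The sketch below constructs and checks this standard maximum-entropy random walk baseline; it is not the paper's constrained derivation.

    ```python
    import numpy as np

    A = np.array([[0, 1, 1, 0],        # adjacency of a small undirected graph
                  [1, 0, 1, 1],
                  [1, 1, 0, 1],
                  [0, 1, 1, 0]], dtype=float)

    vals, vecs = np.linalg.eigh(A)
    lam, psi = vals[-1], np.abs(vecs[:, -1])   # Perron eigenpair

    P = A * psi[None, :] / (lam * psi[:, None])
    pi = psi**2 / np.sum(psi**2)

    print(P.sum(axis=1))            # rows sum to 1: a valid Markov kernel
    print(np.allclose(pi @ P, pi))  # pi is the stationary distribution
    ```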

  18. Estimation and prediction of maximum daily rainfall at Sagar Island using best fit probability models

    NASA Astrophysics Data System (ADS)

    Mandal, S.; Choudhury, B. U.

    2015-07-01

    Sagar Island, set on the continental shelf of the Bay of Bengal, is one of the deltas most vulnerable to extreme rainfall-driven climatic hazards. Information on the probability of occurrence of maximum daily rainfall will be useful in devising risk management for sustaining the rainfed agrarian economy vis-a-vis food and livelihood security. Using six probability distribution models and long-term (1982-2010) daily rainfall data, we studied the probability of occurrence of annual, seasonal and monthly maximum daily rainfall (MDR) in the island. To select the best-fit distribution models for the annual, seasonal and monthly time series based on maximum rank with minimum value of test statistics, three statistical goodness-of-fit tests were employed, viz. the Kolmogorov-Smirnov test (K-S), the Anderson-Darling test (A²) and the chi-square test (χ²). The best-fit probability distribution was identified from the highest overall score obtained from the three goodness-of-fit tests. Results revealed that the normal probability distribution was best fitted for annual, post-monsoon and summer season MDR, while the Lognormal, Weibull and Pearson 5 distributions were best fitted for the pre-monsoon, monsoon and winter seasons, respectively. The estimated annual MDR were 50, 69, 86, 106 and 114 mm for return periods of 2, 5, 10, 20 and 25 years, respectively. The probabilities of an annual MDR of >50, >100, >150, >200 and >250 mm were estimated as 99, 85, 40, 12 and 3% levels of exceedance, respectively. The monsoon, summer and winter seasons exhibited comparatively higher probabilities (78 to 85%) for MDR of >100 mm and moderate probabilities (37 to 46%) for >150 mm. For different recurrence intervals, the percent probability of MDR varied widely across intra- and inter-annual periods. In the island, rainfall anomaly can pose a climatic threat to the sustainability of agricultural production and thus needs adequate adaptation and mitigation measures.
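
    The fit-and-rank step can be sketched with scipy: fit several candidate distributions and compare Kolmogorov-Smirnov statistics (the paper also scores Anderson-Darling and chi-square; note that fitting and testing on the same sample biases the p-values). The rainfall series below is synthetic.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    mdr = rng.gumbel(loc=60, scale=20, size=29)    # 29 years of annual MDR (mm)

    for name in ("norm", "lognorm", "weibull_min", "gumbel_r"):
        dist = getattr(stats, name)
        params = dist.fit(mdr)
        ks = stats.kstest(mdr, name, args=params)
        print(f"{name:12s} KS D={ks.statistic:.3f}  p={ks.pvalue:.3f}")

    # Return level from the chosen fit, e.g. the 25-year maximum daily rainfall:
    params = stats.gumbel_r.fit(mdr)
    print("25-year MDR:", stats.gumbel_r.ppf(1 - 1 / 25, *params))
    ```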

  19. Quasi-probabilities in conditioned quantum measurement and a geometric/statistical interpretation of Aharonov's weak value

    NASA Astrophysics Data System (ADS)

    Lee, Jaeha; Tsutsui, Izumi

    2017-05-01

    We show that the joint behavior of an arbitrary pair of (generally noncommuting) quantum observables can be described by quasi-probabilities, which are an extended version of the standard probabilities used for describing the outcome of measurement for a single observable. The physical situations that require these quasi-probabilities arise when one considers quantum measurement of an observable conditioned by some other variable, with the notable example being the weak measurement employed to obtain Aharonov's weak value. Specifically, we present a general prescription for the construction of quasi-joint probability (QJP) distributions associated with a given combination of observables. These QJP distributions are introduced in two complementary approaches: one from a bottom-up, strictly operational construction realized by examining the mathematical framework of the conditioned measurement scheme, and the other from a top-down viewpoint realized by applying the results of the spectral theorem for normal operators and their Fourier transforms. It is then revealed that, for a pair of simultaneously measurable observables, the QJP distribution reduces to the unique standard joint probability distribution of the pair, whereas for a noncommuting pair there exists an inherent indefiniteness in the choice of such QJP distributions, admitting a multitude of candidates that may equally be used for describing the joint behavior of the pair. In the course of our argument, we find that the QJP distributions furnish the space of operators in the underlying Hilbert space with their characteristic geometric structures such that the orthogonal projections and inner products of observables can be given statistical interpretations as, respectively, “conditionings” and “correlations”. The weak value A_w for an observable A is then given a geometric/statistical interpretation as either the orthogonal projection of A onto the subspace generated by another observable B, or equivalently, as the conditioning of A given B with respect to the QJP distribution under consideration.

  20. Multivariate Epi-splines and Evolving Function Identification Problems

    DTIC Science & Technology

    2015-04-15

    such extrinsic information as well as observed function and subgradient values often evolve in applications, we establish conditions under which the...previous study [30] dealt with compact intervals of ℝ. Splines are intimately tied to optimization problems through their variational theory pioneered...approximation. Motivated by applications in curve fitting, regression, probability density estimation, variogram computation, financial curve construction.

  1. Accounting for Sampling Error in Genetic Eigenvalues Using Random Matrix Theory.

    PubMed

    Sztepanacz, Jacqueline L; Blows, Mark W

    2017-07-01

    The distribution of genetic variance in multivariate phenotypes is characterized by the empirical spectral distribution of the eigenvalues of the genetic covariance matrix. Empirical estimates of genetic eigenvalues from random effects linear models are known to be overdispersed by sampling error, where large eigenvalues are biased upward, and small eigenvalues are biased downward. The overdispersion of the leading eigenvalues of sample covariance matrices has been demonstrated to conform to the Tracy-Widom (TW) distribution. Here we show that genetic eigenvalues estimated using restricted maximum likelihood (REML) in a multivariate random effects model with an unconstrained genetic covariance structure will also conform to the TW distribution after empirical scaling and centering. However, where estimation procedures using either REML or MCMC impose boundary constraints, the resulting genetic eigenvalues tend not to be TW distributed. We show how using confidence intervals from sampling distributions of genetic eigenvalues without reference to the TW distribution is insufficient protection against mistaking sampling error for genetic variance, particularly when eigenvalues are small. By scaling such sampling distributions to the appropriate TW distribution, the critical value of the TW statistic can be used to determine whether the magnitude of a genetic eigenvalue exceeds the sampling error for each eigenvalue in the spectral distribution of a given genetic covariance matrix. Copyright © 2017 by the Genetics Society of America.

  2. Two Universality Properties Associated with the Monkey Model of Zipf's Law

    NASA Astrophysics Data System (ADS)

    Perline, Richard; Perline, Ron

    2016-03-01

    The distribution of word probabilities in the monkey model of Zipf's law is associated with two universality properties: (1) the power law exponent converges strongly to $-1$ as the alphabet size increases and the letter probabilities are specified as the spacings from a random division of the unit interval for any distribution with a bounded density function on $[0,1]$; and (2), on a logarithmic scale the version of the model with a finite word length cutoff and unequal letter probabilities is approximately normally distributed in the part of the distribution away from the tails. The first property is proved using a remarkably general limit theorem for the logarithm of sample spacings from Shao and Hahn, and the second property follows from Anscombe's central limit theorem for a random number of i.i.d. random variables. The finite word length model leads to a hybrid Zipf-lognormal mixture distribution closely related to work in other areas.

  3. Computer routines for probability distributions, random numbers, and related functions

    USGS Publications Warehouse

    Kirby, W.

    1983-01-01

    Use of previously coded and tested subroutines simplifies and speeds up program development and testing. This report presents routines that can be used to calculate various probability distributions and other functions of importance in statistical hydrology. The routines are designed as general-purpose Fortran subroutines and functions to be called from user-written main programs. The probability distributions provided include the beta, chi-square, gamma, Gaussian (normal), Pearson Type III (tables and approximation), and Weibull. Also provided are the distributions of the Grubbs-Beck outlier test, Kolmogorov's and Smirnov's D, Student's t, noncentral t (approximate), and Snedecor F. Other mathematical functions include the Bessel function I₀, gamma and log-gamma functions, error functions, and exponential integral. Auxiliary services include sorting and printer-plotting. Random number generators for uniform and normal numbers are provided and may be used with some of the above routines to generate numbers from other distributions. (USGS)

  4. Computer routines for probability distributions, random numbers, and related functions

    USGS Publications Warehouse

    Kirby, W.H.

    1980-01-01

    Use of previously coded and tested subroutines simplifies and speeds up program development and testing. This report presents routines that can be used to calculate various probability distributions and other functions of importance in statistical hydrology. The routines are designed as general-purpose Fortran subroutines and functions to be called from user-written main programs. The probability distributions provided include the beta, chi-square, gamma, Gaussian (normal), Pearson Type III (tables and approximation), and Weibull. Also provided are the distributions of the Grubbs-Beck outlier test, Kolmogorov's and Smirnov's D, Student's t, noncentral t (approximate), and Snedecor F tests. Other mathematical functions include the Bessel function I₀, gamma and log-gamma functions, error functions and exponential integral. Auxiliary services include sorting and printer plotting. Random number generators for uniform and normal numbers are provided and may be used with some of the above routines to generate numbers from other distributions. (USGS)

  5. Universal laws of human society's income distribution

    NASA Astrophysics Data System (ADS)

    Tao, Yong

    2015-10-01

    General equilibrium equations in economics play the same role as many-body Newtonian equations in physics. Accordingly, each solution of the general equilibrium equations can be regarded as a possible microstate of the economic system. Since Arrow's Impossibility Theorem and Rawls' principle of social fairness provide powerful support for the hypothesis of equal probability, the principle of maximum entropy is available in a just and equilibrium economy, so that an income distribution will occur spontaneously (with the largest probability). Remarkably, some scholars have observed such an income distribution in some democratic countries, e.g., the USA. This result implies that the hypothesis of equal probability may be suitable only for some "fair" systems (economic or physical systems). In this sense, non-equilibrium systems may be "unfair", so that the hypothesis of equal probability is unavailable.

  6. Polynomial chaos representation of databases on manifolds

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Soize, C., E-mail: christian.soize@univ-paris-est.fr; Ghanem, R., E-mail: ghanem@usc.edu

    2017-04-15

    Characterizing the polynomial chaos expansion (PCE) of a vector-valued random variable with probability distribution concentrated on a manifold is a relevant problem in data-driven settings. The probability distribution of such random vectors is multimodal in general, leading to potentially very slow convergence of the PCE. In this paper, we build on a recent development for estimating and sampling from probabilities concentrated on a diffusion manifold. The proposed methodology constructs a PCE of the random vector together with an associated generator that samples from the target probability distribution which is estimated from data concentrated in the neighborhood of the manifold. The method is robust and remains efficient for high dimension and large datasets. The resulting polynomial chaos construction on manifolds permits the adaptation of many uncertainty quantification and statistical tools to emerging questions motivated by data-driven queries.

  7. Gravitational lensing, time delay, and gamma-ray bursts

    NASA Technical Reports Server (NTRS)

    Mao, Shude

    1992-01-01

    The probability distributions of time delay in gravitational lensing by point masses and isolated galaxies (modeled as singular isothermal spheres) are studied. For point lenses (all with the same mass) the probability distribution is broad, with a peak at Δt of about 50 s; for singular isothermal spheres, the probability distribution is a rapidly decreasing function with increasing time delay, with a median Δt of about 1/h month, and its behavior depends sensitively on the luminosity function of galaxies. The present simplified calculation is particularly relevant to the gamma-ray bursts if they are of cosmological origin. The frequency of 'recurrent' bursts due to gravitational lensing by galaxies is probably between 0.05 and 0.4 percent. Gravitational lensing can be used as a test of the cosmological origin of gamma-ray bursts.

  8. A general formula for computing maximum proportion correct scores in various psychophysical paradigms with arbitrary probability distributions of stimulus observations.

    PubMed

    Dai, Huanping; Micheyl, Christophe

    2015-05-01

    Proportion correct (Pc) is a fundamental measure of task performance in psychophysics. The maximum Pc score that can be achieved by an optimal (maximum-likelihood) observer in a given task is of both theoretical and practical importance, because it sets an upper limit on human performance. Within the framework of signal detection theory, analytical solutions for computing the maximum Pc score have been established for several common experimental paradigms under the assumption of Gaussian additive internal noise. However, as the scope of applications of psychophysical signal detection theory expands, the need is growing for psychophysicists to compute maximum Pc scores for situations involving non-Gaussian (internal or stimulus-induced) noise. In this article, we provide a general formula for computing the maximum Pc in various psychophysical experimental paradigms for arbitrary probability distributions of sensory activity. Moreover, easy-to-use MATLAB code implementing the formula is provided. Practical applications of the formula are illustrated, and its accuracy is evaluated, for two paradigms and two types of probability distributions (uniform and Gaussian). The results demonstrate that Pc scores computed using the formula remain accurate even for continuous probability distributions, as long as the conversion from continuous probability density functions to discrete probability mass functions is supported by a sufficiently high sampling resolution. We hope that the exposition in this article, and the freely available MATLAB code, facilitates calculations of maximum performance for a wider range of experimental situations, as well as explorations of the impact of different assumptions concerning internal-noise distributions on maximum performance in psychophysical experiments.
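
    The quantity in question is easy to estimate by Monte Carlo (the paper's contribution is a closed-form formula, which this sketch does not reproduce): in two-alternative forced choice, the maximum-likelihood observer picks the interval with the larger likelihood ratio, so the maximum Pc is the probability that a signal draw beats a noise draw on that statistic, for any pair of distributions.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(8)
    signal = stats.norm(1.0, 1.0)        # "signal" observation distribution
    noise = stats.uniform(-2.0, 4.0)     # "noise" observations on [-2, 2]

    def likelihood_ratio(x):
        fn = noise.pdf(x)
        fs = signal.pdf(x)
        # Outside the noise support the ratio is effectively infinite.
        return np.where(fn > 0, fs / np.maximum(fn, 1e-300), np.inf)

    xs = signal.rvs(200_000, random_state=rng)
    xn = noise.rvs(200_000, random_state=rng)
    ls, ln = likelihood_ratio(xs), likelihood_ratio(xn)
    pc_max = np.mean(ls > ln) + 0.5 * np.mean(ls == ln)  # ties split at chance
    print(f"maximum proportion correct ~ {pc_max:.3f}")
    ```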

  9. Atom counting in HAADF STEM using a statistical model-based approach: methodology, possibilities, and inherent limitations.

    PubMed

    De Backer, A; Martinez, G T; Rosenauer, A; Van Aert, S

    2013-11-01

    In the present paper, a statistical model-based method to count the number of atoms of monotype crystalline nanostructures from high resolution high-angle annular dark-field (HAADF) scanning transmission electron microscopy (STEM) images is discussed in detail together with a thorough study on the possibilities and inherent limitations. In order to count the number of atoms, it is assumed that the total scattered intensity scales with the number of atoms per atom column. These intensities are quantitatively determined using model-based statistical parameter estimation theory. The distribution describing the probability that intensity values are generated by atomic columns containing a specific number of atoms is inferred on the basis of the experimental scattered intensities. Finally, the number of atoms per atom column is quantified using this estimated probability distribution. The number of atom columns available in the observed STEM image, the number of components in the estimated probability distribution, the width of the components of the probability distribution, and the typical shape of a criterion to assess the number of components in the probability distribution directly affect the accuracy and precision with which the number of atoms in a particular atom column can be estimated. It is shown that single atom sensitivity is feasible taking the latter aspects into consideration. © 2013 Elsevier B.V. All rights reserved.

  10. Theoretical size distribution of fossil taxa: analysis of a null model

    PubMed Central

    Reed, William J; Hughes, Barry D

    2007-01-01

    Background: This article deals with the theoretical size distribution (of number of sub-taxa) of a fossil taxon arising from a simple null model of macroevolution. Model: New species arise through speciations occurring independently and at random at a fixed probability rate, while extinctions either occur independently and at random (background extinctions) or cataclysmically. In addition, new genera are assumed to arise through speciations of a very radical nature, again assumed to occur independently and at random at a fixed probability rate. Conclusion: The size distributions of the pioneering genus (following a cataclysm) and of derived genera are determined. Also, the distribution of the number of genera is considered, along with a comparison of the probability of a monospecific genus with that of a monogeneric family. PMID:17376249

  11. Maximum entropy approach to statistical inference for an ocean acoustic waveguide.

    PubMed

    Knobles, D P; Sagers, J D; Koch, R A

    2012-02-01

    A conditional probability distribution suitable for estimating the statistical properties of ocean seabed parameter values inferred from acoustic measurements is derived from a maximum entropy principle. The specification of the expectation value for an error function constrains the maximization of an entropy functional. This constraint determines the sensitivity factor (β) to the error function of the resulting probability distribution, which is a canonical form that provides a conservative estimate of the uncertainty of the parameter values. From the conditional distribution, marginal distributions for individual parameters can be determined by integration over the other parameters. The approach is an alternative route to the posterior probability distribution that avoids an intermediary determination of the likelihood function followed by an application of Bayes' rule. In this paper the expectation value that specifies the constraint is determined from the values of the error function for model solutions obtained from a small number of data samples. The method is applied to ocean acoustic measurements taken on the New Jersey continental shelf. The marginal probability distributions for the values of the sound speed ratio at the surface of the seabed and the source levels of a towed source are examined for different geoacoustic model representations. © 2012 Acoustical Society of America
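
    In a discretized caricature of this construction (function names and the search bracket are ours), the entropy-maximizing distribution under the constraint that the mean error equals a specified value is the canonical form p_m ~ exp(-beta * E_m), with beta determined numerically from the constraint:

      import numpy as np
      from scipy.optimize import brentq

      def canonical_distribution(errors, expected_error):
          """Maximum-entropy distribution over a discrete set of candidate model
          solutions, constrained so that the mean error equals expected_error.
          Returns the sensitivity factor beta and p_m ~ exp(-beta * E_m)."""
          E = np.asarray(errors, float)

          def constraint(beta):
              w = np.exp(-beta * (E - E.min()))   # shifted for numerical stability
              return (w * E).sum() / w.sum() - expected_error

          # assumes expected_error lies between the minimum error and the
          # unconstrained (beta = 0) mean, so that the root has beta >= 0
          beta = brentq(constraint, 0.0, 200.0)
          p = np.exp(-beta * (E - E.min()))
          return beta, p / p.sum()

      # toy error surface over a 1-D parameter grid; marginals over the other
      # parameters would follow by summing p over the corresponding axes
      E = (np.linspace(0.0, 2.0, 101) - 1.3) ** 2
      beta, p = canonical_distribution(E, expected_error=0.1)
      print(beta, p.argmax())   # p peaks at the grid point nearest 1.3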

  12. Probability distributions of continuous measurement results for conditioned quantum evolution

    NASA Astrophysics Data System (ADS)

    Franquet, A.; Nazarov, Yuli V.

    2017-02-01

    We address the statistics of continuous weak linear measurement on a few-state quantum system that is subject to a conditioned quantum evolution. For a conditioned evolution, both the initial and final states of the system are fixed: the latter is achieved by postselection at the end of the evolution. The statistics may drastically differ from the nonconditioned case, and the interference between initial and final states can be observed in the probability distributions of measurement outcomes as well as in average values exceeding the conventional range of nonconditioned averages. We develop a proper formalism to compute the distributions of measurement outcomes, and evaluate and discuss the distributions in experimentally relevant setups. We demonstrate the manifestations of the interference between initial and final states in various regimes. We consider analytically simple examples of nontrivial probability distributions. We reveal peaks (or dips) at half-quantized values of the measurement outputs. We discuss in detail the case of zero overlap between initial and final states, demonstrating anomalously large average outputs and a sudden jump in the time-integrated output. We present and discuss the numerical evaluation of the probability distribution, aiming to extend the analytical results and to describe a realistic experimental situation of a qubit in the regime of resonant fluorescence.

  13. Shallow slip amplification and enhanced tsunami hazard unravelled by dynamic simulations of mega-thrust earthquakes

    PubMed Central

    Murphy, S.; Scala, A.; Herrero, A.; Lorito, S.; Festa, G.; Trasatti, E.; Tonini, R.; Romano, F.; Molinari, I.; Nielsen, S.

    2016-01-01

    The 2011 Tohoku earthquake produced an unexpectedly large amount of shallow slip, greatly contributing to the ensuing tsunami. How frequent are such events? How can they be efficiently modelled for tsunami hazard? Stochastic slip models, which can be computed rapidly, are used to explore the natural slip variability; however, they generally do not deal specifically with shallow slip features. We study the systematic depth-dependence of slip along a thrust fault with a number of 2D dynamic simulations using stochastic shear stress distributions and a geometry based on the cross section of the Tohoku fault. We obtain a probability density for the slip distribution, which varies with depth, with earthquake size, and with whether the rupture breaks the surface. We propose a method to modify stochastic slip distributions according to this dynamically-derived probability distribution. This method may be efficiently applied to produce large numbers of heterogeneous slip distributions for probabilistic tsunami hazard analysis. Using numerous M9 earthquake scenarios, we demonstrate that incorporating the dynamically-derived probability distribution does enhance the conditional probability of exceedance of maximum estimated tsunami wave heights along the Japanese coast. This technique for integrating dynamic features in stochastic models can be extended to any subduction zone and faulting style. PMID:27725733

  14. A program for the Bayesian Neural Network in the ROOT framework

    NASA Astrophysics Data System (ADS)

    Zhong, Jiahang; Huang, Run-Sheng; Lee, Shih-Chang

    2011-12-01

    We present a Bayesian Neural Network algorithm implemented in the TMVA package (Hoecker et al., 2007 [1]), within the ROOT framework (Brun and Rademakers, 1997 [2]). Compared to the conventional use of neural networks as discriminators, this implementation offers advantages as a non-parametric regression tool, particularly for fitting probabilities. It provides functionalities including cost function selection, complexity control, and uncertainty estimation. An example of such an application in High Energy Physics is shown. The algorithm is available with ROOT releases later than 5.29.
    Program summary
    Program title: TMVA-BNN
    Catalogue identifier: AEJX_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEJX_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: BSD license
    No. of lines in distributed program, including test data, etc.: 5094
    No. of bytes in distributed program, including test data, etc.: 1,320,987
    Distribution format: tar.gz
    Programming language: C++
    Computer: Any computer system or cluster with a C++ compiler and a UNIX-like operating system
    Operating system: Most UNIX/Linux systems; the application programs were thoroughly tested under Fedora and Scientific Linux CERN
    Classification: 11.9
    External routines: ROOT package version 5.29 or higher (http://root.cern.ch)
    Nature of problem: Non-parametric fitting of multivariate distributions
    Solution method: An implementation of a neural network following the Bayesian statistical interpretation; uses the Laplace approximation for the Bayesian marginalizations; provides automatic complexity control and uncertainty estimation
    Running time: Training time depends substantially on the size of the input sample, the NN topology, the number of training iterations, etc. For the example in this manuscript, about 7 min on a PC/Linux with 2.0 GHz processors.

  15. A discussion on the origin of quantum probabilities

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Holik, Federico, E-mail: olentiev2@gmail.com; Departamento de Matemática - Ciclo Básico Común, Universidad de Buenos Aires - Pabellón III, Ciudad Universitaria, Buenos Aires; Sáenz, Manuel

    We study the origin of quantum probabilities as arising from non-Boolean propositional-operational structures. We apply the method developed by Cox to non-distributive lattices and develop an alternative formulation of non-Kolmogorovian probability measures for quantum mechanics. By generalizing the method presented in previous works, we outline a general framework for the deduction of probabilities in general propositional structures represented by lattices (including the non-distributive case). -- Highlights: • Several recent works use a derivation similar to that of R.T. Cox to obtain quantum probabilities. • We apply Cox’s method to the lattice of subspaces of the Hilbert space. • We obtain a derivation of quantum probabilities which includes mixed states. • The method presented in this work is susceptible to generalization. • It includes quantum mechanics and classical mechanics as particular cases.

  16. Probability Weighting Functions Derived from Hyperbolic Time Discounting: Psychophysical Models and Their Individual Level Testing.

    PubMed

    Takemura, Kazuhisa; Murakami, Hajime

    2016-01-01

    A probability weighting function (w(p)) is considered to be a nonlinear function of probability (p) in behavioral decision theory. This study proposes a psychophysical model of probability weighting functions derived from a hyperbolic time discounting model and a geometric distribution. The aim of the study is to interpret probability weighting functions from the point of view of the decision maker's waiting time. Since the expected value of a geometrically distributed random variable X is 1/p, we formalized the probability weighting function of the expected-value model for hyperbolic time discounting as w(p) = (1 - k log p)^(-1). Moreover, a probability weighting function is derived from Loewenstein and Prelec's (1992) generalized hyperbolic time discounting model. The latter model is proved to be equivalent to the hyperbolic-logarithmic weighting function considered by Prelec (1998) and Luce (2001). In this study, we derive a model from the generalized hyperbolic time discounting model assuming Fechner's (1860) psychophysical law of time and a geometric distribution of trials. In addition, we develop median models of hyperbolic time discounting and generalized hyperbolic time discounting. To assess the fit of each model, a psychological experiment was conducted to measure the probability weighting and value functions at the level of the individual participant. The participants were 50 university students. The results of the individual analysis indicated that the expected-value model of generalized hyperbolic discounting fitted better than previous probability weighting decision-making models. The theoretical implications of this finding are discussed.
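
    The expected-value weighting function is simple enough to evaluate directly; a short sketch (k is the discounting sensitivity; the values below are illustrative):

      import numpy as np

      def w_hyperbolic(p, k=1.0):
          """Probability weighting function w(p) = (1 - k*log(p))^(-1), derived
          from hyperbolic discounting of the geometric waiting time 1/p."""
          p = np.asarray(p, float)
          return 1.0 / (1.0 - k * np.log(p))

      probs = np.array([0.01, 0.1, 0.5, 0.9, 0.99])
      for k in (0.5, 1.0, 2.0):
          print(k, np.round(w_hyperbolic(probs, k), 3))
      # w(1) = 1 for every k, and the relative overweighting w(p)/p grows
      # without bound as p decreases toward zero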

  17. Bootstrap imputation with a disease probability model minimized bias from misclassification due to administrative database codes.

    PubMed

    van Walraven, Carl

    2017-04-01

    Diagnostic codes used in administrative databases cause bias due to misclassification of patient disease status. It is unclear which methods minimize this bias. Serum creatinine measures were used to determine severe renal failure status in 50,074 hospitalized patients. The true prevalence of severe renal failure and its association with covariates were measured. These were compared with results for which renal failure status was determined using surrogate measures: (1) diagnostic codes; (2) categorization of probability estimates of renal failure determined from a previously validated model; or (3) bootstrap imputation of disease status using model-derived probability estimates. Bias in estimates of severe renal failure prevalence and of its association with covariates was minimal when bootstrap methods were used to impute renal failure status from model-based probability estimates. In contrast, biases were extensive when renal failure status was determined using codes or methods in which the model-based condition probability was categorized. Bias due to misclassification from inaccurate diagnostic codes can be minimized using bootstrap methods to impute condition status from multivariable model-derived probability estimates. Copyright © 2017 Elsevier Inc. All rights reserved.
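
    The winning approach can be sketched as follows (a hypothetical helper, not the author's code): instead of thresholding the model probability into a yes/no code, condition status is redrawn from that probability in each bootstrap replicate, and the estimates are summarized across replicates.

      import numpy as np

      def bootstrap_imputed_prevalence(p_model, n_boot=1000, seed=0):
          """Prevalence of a binary condition estimated by bootstrap imputation
          from model-derived probabilities, with a percentile 95% interval."""
          rng = np.random.default_rng(seed)
          p = np.asarray(p_model, float)
          draws = [(rng.random(p.size) < p).mean() for _ in range(n_boot)]
          return float(np.mean(draws)), np.percentile(draws, [2.5, 97.5])

      # model probabilities from a validated disease model (synthetic here)
      p_model = np.random.default_rng(1).beta(0.5, 8.0, size=50_074)
      print(bootstrap_imputed_prevalence(p_model))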

  18. Hybrid Approaches and Industrial Applications of Pattern Recognition,

    DTIC Science & Technology

    1980-10-01

    emphasized that the probability distribution in (9) is correct only under the assumption that P(w|x) is known exactly. In practice this assumption will ... sufficient precision. The alternative would be to take the probability distribution of estimates of P(w|x) into account in the analysis. However, from the

  19. Generalized Success-Breeds-Success Principle Leading to Time-Dependent Informetric Distributions.

    ERIC Educational Resources Information Center

    Egghe, Leo; Rousseau, Ronald

    1995-01-01

    Reformulates the success-breeds-success (SBS) principle in informetrics in order to generate a general theory of source-item relationships. Topics include a time-dependent probability, a new model for the expected probability that is compared with the SBS principle with exact combinatorial calculations, classical frequency distributions, and…

  20. The beta distribution: A statistical model for world cloud cover

    NASA Technical Reports Server (NTRS)

    Falls, L. W.

    1973-01-01

    Much work has been performed in developing empirical global cloud cover models. This investigation was made to determine an underlying theoretical statistical distribution to represent worldwide cloud cover. The beta distribution, with its probability density function, is proposed to represent the variability of this random variable. It is shown that the beta distribution possesses the versatile statistical characteristics necessary to assume the wide variety of shapes exhibited by cloud cover. A total of 160 representative empirical cloud cover distributions were investigated, and the conclusion was reached that this study provides sufficient statistical evidence to accept the beta probability distribution as the underlying model for world cloud cover.
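
    As a sketch of how such a fit is carried out with today's tools (SciPy's estimator stands in for the 1973 methodology; the cloud-cover sample below is synthetic):

      import numpy as np
      from scipy.stats import beta

      rng = np.random.default_rng(1)
      cover = rng.beta(0.7, 0.4, size=5000)   # synthetic cloud-cover fractions;
                                              # U-shaped, a common empirical shape

      a, b, loc, scale = beta.fit(cover, floc=0, fscale=1)  # support fixed to [0, 1]
      print(f"fitted shape parameters: a = {a:.2f}, b = {b:.2f}")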

  1. Gas Hydrate Formation Probability Distributions: The Effect of Shear and Comparisons with Nucleation Theory.

    PubMed

    May, Eric F; Lim, Vincent W; Metaxas, Peter J; Du, Jianwei; Stanwix, Paul L; Rowland, Darren; Johns, Michael L; Haandrikman, Gert; Crosby, Daniel; Aman, Zachary M

    2018-03-13

    Gas hydrate formation is a stochastic phenomenon of considerable significance for any risk-based approach to flow assurance in the oil and gas industry. In principle, well-established results from nucleation theory offer the prospect of predictive models for hydrate formation probability in industrial production systems. In practice, however, heuristics are relied on when estimating formation risk for a given flowline subcooling or when quantifying kinetic hydrate inhibitor (KHI) performance. Here, we present statistically significant measurements of formation probability distributions for natural gas hydrate systems under shear, which are quantitatively compared with theoretical predictions. Distributions with over 100 points were generated using low-mass, Peltier-cooled pressure cells, cycled in temperature between 40 and -5 °C at up to 2 K·min⁻¹ and analyzed with robust algorithms that automatically identify hydrate formation and initial growth rates from dynamic pressure data. The application of shear had a significant influence on the measured distributions: at 700 rpm, mass-transfer limitations were minimal, as demonstrated by the kinetic growth rates observed. The formation probability distributions measured at this shear rate had mean subcoolings consistent with theoretical predictions and steel-hydrate-water contact angles of 14-26°. However, the experimental distributions were substantially wider than predicted, suggesting that phenomena acting on macroscopic length scales are responsible for much of the observed stochastic formation. Performance tests of a KHI provided new insights into how such chemicals can reduce the risk of hydrate blockage in flowlines. Our data demonstrate that the KHI not only reduces the probability of formation (by both shifting and sharpening the distribution) but also reduces hydrate growth rates by a factor of 2.

  2. Quantum work in the Bohmian framework

    NASA Astrophysics Data System (ADS)

    Sampaio, R.; Suomela, S.; Ala-Nissila, T.; Anders, J.; Philbin, T. G.

    2018-01-01

    At nonzero temperature classical systems exhibit statistical fluctuations of thermodynamic quantities arising from the variation of the system's initial conditions and its interaction with the environment. The fluctuating work, for example, is characterized by the ensemble of system trajectories in phase space and, by including the probabilities for various trajectories to occur, a work distribution can be constructed. However, without phase-space trajectories, the task of constructing a work probability distribution in the quantum regime has proven elusive. Here we use quantum trajectories in phase space and define fluctuating work as power integrated along the trajectories, in complete analogy to classical statistical physics. The resulting work probability distribution is valid for any quantum evolution, including cases with coherences in the energy basis. We demonstrate the quantum work probability distribution and its properties with an exactly solvable example of a driven quantum harmonic oscillator. An important feature of the work distribution is its dependence on the initial statistical mixture of pure states, which is reflected in higher moments of the work. The proposed approach introduces a fundamentally different perspective on quantum thermodynamics, allowing full thermodynamic characterization of the dynamics of quantum systems, including the measurement process.

  3. Combining contamination indexes, sediment quality guidelines and multivariate data analysis for metal pollution assessment in marine sediments of Cienfuegos Bay, Cuba.

    PubMed

    Peña-Icart, Mirella; Pereira-Filho, Edenir Rodrigues; Lopes Fialho, Lucimar; Nóbrega, Joaquim A; Alonso-Hernández, Carlos; Bolaños-Alvarez, Yoelvis; Pomares-Alfonso, Mario S

    2017-02-01

    The purpose of the present work was to combine several tools for assessing metal pollution in marine sediments from Cienfuegos Bay. Fourteen surface sediments collected in 2013 were evaluated. Concentrations of As, Cu, Ni, Zn and V decreased with respect to those previously reported. The metal contamination was spatially distributed in the north and south parts of the bay. According to the contamination factor (CF), enrichment factor (EF) and index of geoaccumulation (Igeo), Cd and Cu were classified, in that order, as the elements with the highest contamination levels in most sediments. Comparison of the total metal concentrations with the threshold (TELs) and probable (PELs) effect levels in sediment quality guidelines suggested a more worrisome situation for Cu, whose concentrations were occasionally associated with adverse biological effects in thirteen sediments, followed by Ni in nine sediments, while adverse effects were rarely associated with Cd. Cu could probably be considered the most dangerous element in the whole bay because it was classified at high contamination levels by all indexes and, simultaneously, associated with occasional adverse effects in most samples. Although bioavailability was only partially evaluated with the HCl method, the low extraction of Ni (<3% in all samples) and Cu (<55%, except sample 3) and the relatively high extraction of Cd (50% or more, except sample 14) could be considered as attenuating (Ni and Cu) or aggravating (Cd) factors in the risk assessment of those elements. Copyright © 2016. Published by Elsevier Ltd.
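
    For reference, the three indexes used above have standard definitions, sketched here (the numbers in the example are placeholders, not the paper's measurements):

      import numpy as np

      def contamination_indexes(c_sample, c_background, ref_sample, ref_background):
          """Standard sediment pollution indexes for a single metal:
          CF   = C_sample / C_background
          Igeo = log2(C_sample / (1.5 * C_background))  # 1.5 buffers natural variation
          EF   = (C_sample / Ref_sample) / (C_background / Ref_background),
          where Ref is a conservative normalizing element such as Al or Fe."""
          cf = c_sample / c_background
          igeo = np.log2(c_sample / (1.5 * c_background))
          ef = (c_sample / ref_sample) / (c_background / ref_background)
          return cf, igeo, ef

      # placeholder values (mg/kg): Cu against a background level, normalized by Fe
      print(contamination_indexes(95.0, 45.0, 41_000.0, 47_000.0))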

  4. [Influence of individual characteristics and working conditions on the severity of occupational injuries registered in Andalusia, Spain, in 2003].

    PubMed

    Muñoz, Julia Bolívar; Codina, Antonio Daponte; Cruz, Laura López; Rodríguez, Inmaculada Mateo

    2009-01-01

    The study of the severity of occupational injuries is very important for the establishment of prevention plans. The aim of this paper is to analyze the distribution of occupational injuries by (a) individual factors, (b) workplace characteristics and (c) working conditions, and to analyze the severity of occupational injuries by these characteristics in men and women in Andalusia. Injury data came from the accident registry of the Ministry of Labour and Social Affairs for 2003. The dependent variable was the severity of the injury (slight, serious, very serious or fatal); the independent variables were the characteristics of the worker, company data, and the accident itself. Bivariate and multivariate analyses were performed to estimate the probability of serious, very serious and fatal injury in relation to the other variables, through odds ratios (OR) with 95% confidence intervals (95% CI). Men accounted for 82.4% of the records and women for 17.6%; 78.1% of the women were unskilled manual workers, compared with 44.9% of the men. Men belonging to class I had a higher probability of more severe lesions (OR = 1.67, 95% CI = 1.17-2.38). The severity of the injury is associated with sex, age and type of injury. In men it is also related to employment status, the place where the accident happened, performing an unusual job, the size and characteristics of the company, and social class; in women, with the sector of activity.

  5. Global assessment of surfing conditions: seasonal, interannual and long-term variability

    NASA Astrophysics Data System (ADS)

    Espejo, A.; Losada, I.; Mendez, F.

    2012-12-01

    International surfing destinations owe a great debt to specific combinations of wind-wave conditions, thermal conditions and local bathymetry. As surf quality depends on a vast number of geophysical variables, a multivariable standardized index based on expert judgment is proposed to analyze the surf resource over a worldwide domain. The data needed are obtained by combining several datasets (reanalyses): a 60-year satellite-calibrated spectral wave hindcast (GOW, WaveWatchIII), wind fields from NCEP/NCAR, global sea surface temperature from ERSST.v3b, and global tides from TPXO7.1. A summary of the global surf resource is presented, which highlights the high degree of variability in surfable events. Consistent with the general atmospheric circulation, results show that west-facing low- to middle-latitude coasts are more suitable for surfing, especially those in the Southern Hemisphere. Month-to-month analysis reveals strong seasonal changes in the occurrence of surfable events, especially in the North Atlantic and North Pacific. Interannual variability is investigated by comparing occurrence values with global and regional climate patterns, showing a great influence at both global and regional scales. Analysis of long-term trends shows an increase in the probability of surfable events over west-facing coasts of the planet (e.g., +30 hours/year in California). The resulting maps provide useful information for surfers and surf-related stakeholders, coastal planning, education, and basic research.
    Figure 1. Global distribution of the probability of medium-quality (a) and high-quality (b) surf conditions.

  6. Applying the log-normal distribution to target detection

    NASA Astrophysics Data System (ADS)

    Holst, Gerald C.

    1992-09-01

    Holst and Pickard experimentally determined that MRT responses tend to follow a log-normal distribution. The log-normal distribution appeared reasonable because nearly all visual psychological data are plotted on a logarithmic scale. It has the additional advantage of being bounded to positive values; an important consideration, since probability of detection is often plotted in linear coordinates. A review of published data suggests that the log-normal distribution may have universal applicability. Specifically, the log-normal distribution obtained from MRT tests appears to fit the target transfer function and the probability of detection of rectangular targets.

  7. Tsunami Size Distributions at Far-Field Locations from Aggregated Earthquake Sources

    NASA Astrophysics Data System (ADS)

    Geist, E. L.; Parsons, T.

    2015-12-01

    The distribution of tsunami amplitudes at far-field tide gauge stations is explained by aggregating the probabilities of tsunamis derived from individual subduction zones, scaled by their seismic moment. The observed tsunami amplitude distributions of both continental (e.g., San Francisco) and island (e.g., Hilo) stations distant from subduction zones are examined. Although the observed probability distributions nominally follow a Pareto (power-law) distribution, there are significant deviations. Some stations exhibit varying degrees of tapering of the distribution at high amplitudes and, in the case of the Hilo station, there is a prominent break in slope on log-log probability plots. There are also differences, sometimes significant, in the slopes of the observed distributions among stations. To explain these differences, we first estimate seismic moment distributions of observed earthquakes for major subduction zones. Second, regression models are developed that relate the tsunami amplitude at a station to the seismic moment at a subduction zone, correcting for epicentral distance. The seismic moment distribution is then transformed to a site-specific tsunami amplitude distribution using the regression model. Finally, a mixture distribution is developed, aggregating the transformed tsunami distributions from all relevant subduction zones. This mixture distribution is compared to the observed distribution to assess the performance of the method described above. This method allows us to estimate the largest tsunami that can be expected in a given time period at a station.

  8. Application of at-site peak-streamflow frequency analyses for very low annual exceedance probabilities

    USGS Publications Warehouse

    Asquith, William H.; Kiang, Julie E.; Cohn, Timothy A.

    2017-07-17

    The U.S. Geological Survey (USGS), in cooperation with the U.S. Nuclear Regulatory Commission, has investigated statistical methods for probabilistic flood hazard assessment to provide guidance on very low annual exceedance probability (AEP) estimation of peak-streamflow frequency and the quantification of corresponding uncertainties using streamgage-specific data. The term “very low AEP” implies exceptionally rare events defined as those having AEPs less than about 0.001 (or 1 × 10⁻³ in scientific notation, or for brevity 10⁻³). Such low AEPs are of great interest to those involved with peak-streamflow frequency analyses for critical infrastructure, such as nuclear power plants. Flood frequency analyses at streamgages are most commonly based on annual instantaneous peak streamflow data and a probability distribution fit to these data. The fitted distribution provides a means to extrapolate to very low AEPs. Within the United States, the Pearson type III probability distribution, when fit to the base-10 logarithms of streamflow, is widely used, but other distribution choices exist. The USGS-PeakFQ software, implementing the Pearson type III within the Federal agency guidelines of Bulletin 17B (method of moments) and updates to the expected moments algorithm (EMA), was specially adapted for an “Extended Output” user option to provide estimates at selected AEPs from 10⁻³ to 10⁻⁶. Parameter estimation methods, in addition to product moments and EMA, include L-moments, maximum likelihood, and maximum product of spacings (maximum spacing estimation). This study comprehensively investigates multiple distributions and parameter estimation methods for two USGS streamgages (01400500 Raritan River at Manville, New Jersey, and 01638500 Potomac River at Point of Rocks, Maryland). The results of this study specifically involve the four methods for parameter estimation and up to nine probability distributions, including the generalized extreme value, generalized log-normal, generalized Pareto, and Weibull. Uncertainties in streamflow estimates for corresponding AEP are depicted and quantified as two primary forms: quantile (aleatoric [random sampling] uncertainty) and distribution-choice (epistemic [model] uncertainty). Sampling uncertainties of a given distribution are relatively straightforward to compute from analytical or Monte Carlo-based approaches. Distribution-choice uncertainty stems from choices of potentially applicable probability distributions for which divergence among the choices increases as AEP decreases. Conventional goodness-of-fit statistics, such as Cramér-von Mises, and L-moment ratio diagrams are demonstrated in order to hone distribution choice. The results generally show that distribution choice uncertainty is larger than sampling uncertainty for very low AEP values.
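
    A minimal log-Pearson type III extrapolation in this spirit (SciPy's generic maximum-likelihood fit stands in for the Bulletin 17B method of moments / EMA implemented in USGS-PeakFQ; the peak series is synthetic):

      import numpy as np
      from scipy.stats import pearson3

      def lp3_quantile(annual_peaks, aep):
          """Flood quantile at a given annual exceedance probability (AEP) from a
          Pearson type III fit to the base-10 logarithms of the annual peaks."""
          y = np.log10(np.asarray(annual_peaks, float))
          skew, loc, scale = pearson3.fit(y)
          return 10.0 ** pearson3.ppf(1.0 - aep, skew, loc=loc, scale=scale)

      rng = np.random.default_rng(42)
      peaks = rng.lognormal(mean=8.0, sigma=0.6, size=80)   # synthetic annual peaks
      for aep in (1e-2, 1e-3, 1e-6):
          print(f"AEP {aep:g}: {lp3_quantile(peaks, aep):,.0f}")
      # the spread of such extrapolations across candidate distributions is the
      # distribution-choice uncertainty quantified in the study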

  9. Polynomial probability distribution estimation using the method of moments

    PubMed Central

    Mattsson, Lars; Rydén, Jesper

    2017-01-01

    We suggest a procedure for estimating Nth degree polynomial approximations to unknown (or known) probability density functions (PDFs) based on N statistical moments from each distribution. The procedure is based on the method of moments and is set up algorithmically to aid applicability and to ensure rigor in use. In order to show applicability, polynomial PDF approximations are obtained for the distribution families Normal, Log-Normal and Weibull, as well as for a bimodal Weibull distribution and a data set of anonymized household electricity use. The results are compared with results for traditional PDF series expansion methods of Gram–Charlier type. It is concluded that this is a comparatively simple procedure that could be used when traditional distribution families are not applicable or when polynomial expansions of probability distributions might be considered useful approximations. In particular, this approach is practical for calculating convolutions of distributions, since such operations become integrals of polynomial expressions. Finally, in order to show an advanced applicability of the method, it is shown to be useful for approximating solutions to the Smoluchowski equation. PMID:28394949

  10. Polynomial probability distribution estimation using the method of moments.

    PubMed

    Munkhammar, Joakim; Mattsson, Lars; Rydén, Jesper

    2017-01-01

    We suggest a procedure for estimating Nth degree polynomial approximations to unknown (or known) probability density functions (PDFs) based on N statistical moments from each distribution. The procedure is based on the method of moments and is set up algorithmically to aid applicability and to ensure rigor in use. In order to show applicability, polynomial PDF approximations are obtained for the distribution families Normal, Log-Normal and Weibull, as well as for a bimodal Weibull distribution and a data set of anonymized household electricity use. The results are compared with results for traditional PDF series expansion methods of Gram-Charlier type. It is concluded that this is a comparatively simple procedure that could be used when traditional distribution families are not applicable or when polynomial expansions of probability distributions might be considered useful approximations. In particular, this approach is practical for calculating convolutions of distributions, since such operations become integrals of polynomial expressions. Finally, in order to show an advanced applicability of the method, it is shown to be useful for approximating solutions to the Smoluchowski equation.
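
    The core linear-algebra step reduces to one small linear system (our own minimal formulation on a generic interval; the paper's algorithmic safeguards are omitted):

      import numpy as np
      from scipy.stats import beta as beta_dist

      def poly_pdf_coeffs(moments, a=0.0, b=1.0):
          """Coefficients c_k of f(x) = sum_k c_k x^k on [a, b] whose first
          len(moments) raw moments match the given ones (moments[0] = 1 is the
          normalization). Positivity of f is not enforced in this sketch."""
          n = len(moments)
          A = np.empty((n, n))
          for m in range(n):
              for k in range(n):
                  p = m + k + 1
                  A[m, k] = (b**p - a**p) / p   # integral of x^(m+k) over [a, b]
          return np.linalg.solve(A, np.asarray(moments, float))

      # recover the Beta(2, 2) density 6x(1 - x) from its first four raw moments
      mu = [1.0] + [beta_dist.moment(m, 2, 2) for m in range(1, 5)]
      print(np.round(poly_pdf_coeffs(mu), 6))   # ~ [0, 6, -6, 0, 0]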

  11. The Spiral of Life

    NASA Astrophysics Data System (ADS)

    Cajiao Vélez, F.; Kamiński, J. Z.; Krajewska, K.

    2018-04-01

    High-energy photoionization driven by short, circularly polarized laser pulses is studied in the framework of the relativistic strong-field approximation. The saddle-point analysis of the integrals defining the probability amplitude is used to determine the general properties of the probability distributions. Additionally, an approximate solution of the saddle-point equation is derived. This leads to the concept of a three-dimensional spiral of life in momentum space, around which the ionization probability distribution is maximal. We demonstrate that such a spiral is also obtained from a classical treatment.

  12. Count data, detection probabilities, and the demography, dynamics, distribution, and decline of amphibians.

    PubMed

    Schmidt, Benedikt R

    2003-08-01

    The evidence for amphibian population declines is based on count data that were not adjusted for detection probabilities. Such data are not reliable even when collected using standard methods. The formula C = Np (where C is a count, N the true parameter value, and p is a detection probability) relates count data to demography, population size, or distributions. With unadjusted count data, one assumes a linear relationship between C and N and that p is constant. These assumptions are unlikely to be met in studies of amphibian populations. Amphibian population data should be based on methods that account for detection probabilities.

  13. Bayesian data analysis tools for atomic physics

    NASA Astrophysics Data System (ADS)

    Trassinelli, Martino

    2017-10-01

    We present an introduction to some concepts of Bayesian data analysis in the context of atomic physics. Starting from basic rules of probability, we present Bayes' theorem and its applications. In particular, we discuss how to calculate simple and joint probability distributions and the Bayesian evidence, a model-dependent quantity that makes it possible to assign probabilities to different hypotheses from the analysis of the same data set. To give some practical examples, these methods are applied to two concrete cases. In the first example, the presence or absence of a satellite line in an atomic spectrum is investigated. In the second example, we determine the most probable model among a set of possible profiles from the analysis of a statistically poor spectrum. We also show how to calculate the probability distribution of the main spectral component without having to determine the spectrum model uniquely. For these two studies, we use the program Nested_fit to calculate the different probability distributions and other related quantities. Nested_fit is a Fortran90/Python code developed during recent years for the analysis of atomic spectra. As indicated by the name, it is based on the nested sampling algorithm, which is presented in detail together with the program itself.
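
    The central model-comparison quantity, the Bayesian evidence, can be illustrated with a brute-force grid integral, which is what nested sampling replaces when the parameter space is too large for a grid (the one-parameter model below is illustrative, not Nested_fit's):

      import numpy as np
      from scipy.stats import norm

      def log_evidence(data, mu_grid, prior_sd=5.0):
          """log Z = log of the integral of L(mu) * p(mu) d(mu) for a toy model:
          data ~ N(mu, 1) with a N(0, prior_sd^2) prior on the line position mu."""
          log_like = np.array([norm.logpdf(data, m, 1.0).sum() for m in mu_grid])
          log_prior = norm.logpdf(mu_grid, 0.0, prior_sd)
          shift = log_like.max()   # factor out the peak to avoid underflow
          return shift + np.log(np.trapz(np.exp(log_like - shift + log_prior), mu_grid))

      rng = np.random.default_rng(7)
      data = rng.normal(1.2, 1.0, size=40)
      grid = np.linspace(-10.0, 10.0, 4001)
      # log Bayes factor: 'line at free position mu' versus 'no line' (mu fixed at 0)
      print(log_evidence(data, grid) - norm.logpdf(data, 0.0, 1.0).sum())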

  14. Sandpile-based model for capturing magnitude distributions and spatiotemporal clustering and separation in regional earthquakes

    NASA Astrophysics Data System (ADS)

    Batac, Rene C.; Paguirigan, Antonino A., Jr.; Tarun, Anjali B.; Longjas, Anthony G.

    2017-04-01

    We propose a cellular automata model for earthquake occurrences patterned after the sandpile model of self-organized criticality (SOC). By incorporating a single parameter describing the probability of targeting the most susceptible site, the model successfully reproduces the statistical signatures of seismicity. The energy distributions closely follow power-law probability density functions (PDFs) with a scaling exponent of around -1.6, consistent with the expectations of the Gutenberg-Richter (GR) law, for a wide range of targeted triggering probability values. Additionally, for targeted triggering probabilities within the range 0.004-0.007, we observe spatiotemporal distributions that show bimodal behavior, which is not observed for the original sandpile. For this critical range of probability values, the model statistics are in remarkable agreement with long-period empirical data from earthquakes from different seismogenic regions. The proposed model has key advantages, the foremost of which is that it simultaneously captures the energy, space, and time statistics of earthquakes by introducing just a single parameter, while adding minimal complexity to the simple rules of the sandpile. We believe that the critical targeting probability parameterizes the memory that is inherently present in earthquake-generating regions.
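
    A minimal realization of the targeted-triggering rule (our own reading of the model; grid size, number of driving steps, and p_target are illustrative):

      import numpy as np

      def sandpile_targeted(L=48, p_target=0.005, steps=50_000, zc=4, seed=0):
          """Sandpile with probabilistic targeting: with probability p_target a
          grain is dropped on the fullest (most susceptible) site, otherwise on
          a random site; sites holding >= zc grains topple to their four
          neighbors. The avalanche size serves as a proxy for event energy."""
          rng = np.random.default_rng(seed)
          z = rng.integers(0, zc, size=(L, L))
          sizes = []
          for _ in range(steps):
              if rng.random() < p_target:
                  i, j = np.unravel_index(np.argmax(z), z.shape)
              else:
                  i, j = rng.integers(0, L, 2)
              z[i, j] += 1
              size, stack = 0, [(i, j)]
              while stack:
                  a, b = stack.pop()
                  if z[a, b] < zc:
                      continue
                  z[a, b] -= zc
                  size += 1
                  for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                      na, nb = a + da, b + db
                      if 0 <= na < L and 0 <= nb < L:   # open boundaries dissipate
                          z[na, nb] += 1
                          stack.append((na, nb))
              sizes.append(size)
          return np.array(sizes)   # histogram these to inspect the energy PDF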

  15. Predictive modeling of EEG time series for evaluating surgery targets in epilepsy patients.

    PubMed

    Steimer, Andreas; Müller, Michael; Schindler, Kaspar

    2017-05-01

    During the last 20 years, predictive modeling in epilepsy research has largely been concerned with the prediction of seizure events, whereas the inference of effective brain targets for resective surgery has received surprisingly little attention. In this exploratory pilot study, we describe a distributional clustering framework for the modeling of multivariate time series and use it to predict the effects of brain surgery in epilepsy patients. By analyzing the intracranial EEG, we demonstrate how patients who became seizure-free after surgery are clearly distinguished from those who did not. More specifically, for 5 out of 7 patients who obtained seizure freedom (= Engel class I), our method predicts that the specific collection of brain areas actually resected during surgery yields a markedly lower posterior probability for the seizure-related clusters, when compared to the resection of random or empty collections. Conversely, for 4 out of 5 Engel class III/IV patients who still suffer from postsurgical seizures, the performance of the actually resected collection is not significantly better than the performances displayed by random or empty collections. As the number of possible collections ranges into the billions and more, this is a substantial contribution to a problem that today is still solved by visual EEG inspection. Apart from epilepsy research, our clustering methodology is also of general interest for the analysis of multivariate time series and as a generative model for temporally evolving functional networks in the neurosciences and beyond. Hum Brain Mapp 38:2509-2531, 2017. © 2017 Wiley Periodicals, Inc.

  16. Characterising RNA secondary structure space using information entropy

    PubMed Central

    2013-01-01

    Comparative methods for RNA secondary structure prediction use evolutionary information from RNA alignments to increase prediction accuracy. The model is often described in terms of stochastic context-free grammars (SCFGs), which generate a probability distribution over secondary structures. It is, however, unclear how this probability distribution changes as a function of the input alignment. As prediction programs typically only return a single secondary structure, better characterisation of the underlying probability space of RNA secondary structures is of great interest. In this work, we show how to efficiently compute the information entropy of the probability distribution over RNA secondary structures produced for RNA alignments by a phylo-SCFG, and implement it for the PPfold model. We also discuss interpretations and applications of this quantity, including how it can clarify reasons for low prediction reliability scores. PPfold and its source code are available from http://birc.au.dk/software/ppfold/. PMID:23368905
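
    The quantity itself is the Shannon entropy of the structure distribution; the paper computes it efficiently for the exponentially large structure space produced by a phylo-SCFG, but a toy distribution shows what is being measured (a sketch, with made-up probabilities):

      import numpy as np

      def structure_entropy(probs):
          """Shannon entropy (in bits) of a probability distribution over
          candidate secondary structures; low entropy means the probability
          mass is concentrated on few structures, so the single returned
          prediction is more trustworthy."""
          p = np.asarray(probs, float)
          p = p[p > 0]
          return float(-(p * np.log2(p)).sum())

      print(structure_entropy([0.97, 0.02, 0.01]))   # ~0.22 bits: confident
      print(structure_entropy([0.25] * 4))           # 2.0 bits: ambiguous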

  17. The probability distribution of side-chain conformations in [Leu] and [Met]enkephalin determines the potency and selectivity to mu and delta opiate receptors.

    PubMed

    Nielsen, Bjørn G; Jensen, Morten Ø; Bohr, Henrik G

    2003-01-01

    The structure of enkephalin, a small neuropeptide with five amino acids, has been simulated on computers using molecular dynamics. Such simulations exhibit a few stable conformations, which also have been identified experimentally. The simulations provide the possibility to perform cluster analysis in the space defined by potentially pharmacophoric measures such as dihedral angles, side-chain orientation, etc. By analyzing the statistics of the resulting clusters, the probability distribution of the side-chain conformations may be determined. These probabilities allow us to predict the selectivity of [Leu]enkephalin and [Met]enkephalin to the known mu- and delta-type opiate receptors to which they bind as agonists. Other plausible consequences of these probability distributions are discussed in relation to the way in which they may influence the dynamics of the synapse. Copyright 2003 Wiley Periodicals, Inc. Biopolymers (Pept Sci) 71: 577-592, 2003

  18. SAMBA: Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahlfeld, R., E-mail: r.ahlfeld14@imperial.ac.uk; Belkouchi, B.; Montomoli, F.

    2016-09-01

    A new arbitrary Polynomial Chaos (aPC) method is presented for moderately high-dimensional problems characterised by limited input data availability. The proposed methodology improves the algorithm of aPC and extends the method, that was previously only introduced as tensor product expansion, to moderately high-dimensional stochastic problems. The fundamental idea of aPC is to use the statistical moments of the input random variables to develop the polynomial chaos expansion. This approach provides the possibility to propagate continuous or discrete probability density functions and also histograms (data sets) as long as their moments exist, are finite and the determinant of the moment matrix is strictly positive. For cases with limited data availability, this approach avoids bias and fitting errors caused by wrong assumptions. In this work, an alternative way to calculate the aPC is suggested, which provides the optimal polynomials, Gaussian quadrature collocation points and weights from the moments using only a handful of matrix operations on the Hankel matrix of moments. It can therefore be implemented without requiring prior knowledge about statistical data analysis or a detailed understanding of the mathematics of polynomial chaos expansions. The extension to more input variables suggested in this work, is an anisotropic and adaptive version of Smolyak's algorithm that is solely based on the moments of the input probability distributions. It is referred to as SAMBA (PC), which is short for Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos. It is illustrated that for moderately high-dimensional problems (up to 20 different input variables or histograms) SAMBA can significantly simplify the calculation of sparse Gaussian quadrature rules. SAMBA's efficiency for multivariate functions with regard to data availability is further demonstrated by analysing higher order convergence and accuracy for a set of nonlinear test functions with 2, 5 and 10 different input distributions or histograms.

  19. Modeling Compound Flood Hazards in Coastal Embayments

    NASA Astrophysics Data System (ADS)

    Moftakhari, H.; Schubert, J. E.; AghaKouchak, A.; Luke, A.; Matthew, R.; Sanders, B. F.

    2017-12-01

    Coastal cities around the world are built on lowland topography adjacent to coastal embayments and river estuaries, where multiple factors threaten to increase flood hazards (e.g., sea level rise and river flooding). Quantitative risk assessment is required for the administration of flood insurance programs and the design of cost-effective flood risk reduction measures. This demands a characterization of extreme water levels, such as 100- and 500-year return period events. Furthermore, hydrodynamic flood models are routinely used to characterize localized flood intensities (i.e., local depth and velocity) based on boundary forcing sampled from extreme value distributions. For example, extreme flood discharges in the U.S. are estimated from measured flood peaks using the Log-Pearson Type III distribution. However, configuring hydrodynamic models for coastal embayments is challenging because of compound extreme flood events: events caused by a combination of extreme sea levels, extreme river discharges, and possibly other factors such as extreme waves and precipitation causing pluvial flooding in urban developments. Here, we present an approach to flood risk assessment that coordinates multivariate extreme analysis with hydrodynamic modeling of coastal embayments. First, we evaluate the significance of the correlation structure between terrestrial freshwater inflow and oceanic variables; second, this correlation structure is described using copula functions in the unit joint probability domain; and third, we choose a series of compound design scenarios for hydrodynamic modeling based on their occurrence likelihood. The design scenarios include the most likely compound event (the one with the highest joint probability density), the preferred marginal scenario, and reproduced time series ensembles based on Monte Carlo sampling of the bivariate hazard domain. The comparison of the resulting extreme water dynamics under the compound hazard scenarios explained above provides insight into the strengths and weaknesses of each approach and helps modelers choose the scenario that best fits the needs of their project. The proposed risk assessment approach can help flood hazard modeling practitioners achieve a more reliable estimate of risk by cautiously reducing the dimensionality of the hazard analysis.
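
    Steps one to three can be caricatured with a Gaussian copula and off-the-shelf margins (every distribution and parameter below is an illustrative assumption, not a calibrated choice):

      import numpy as np
      from scipy.stats import norm, genextreme, gumbel_r

      def sample_compound_events(rho, n=100_000, seed=0):
          """Joint samples of (river discharge, coastal water level) built from a
          Gaussian copula with correlation rho and illustrative extreme-value margins."""
          rng = np.random.default_rng(seed)
          z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
          u = norm.cdf(z)                                               # correlated uniforms
          q = genextreme.ppf(u[:, 0], c=-0.1, loc=500.0, scale=150.0)   # discharge, m^3/s
          h = gumbel_r.ppf(u[:, 1], loc=1.0, scale=0.25)                # water level, m
          return q, h

      q, h = sample_compound_events(rho=0.4)
      # empirical joint exceedance probability of a compound design threshold
      print(((q > 900.0) & (h > 1.6)).mean())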

  20. SAMBA: Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos

    NASA Astrophysics Data System (ADS)

    Ahlfeld, R.; Belkouchi, B.; Montomoli, F.

    2016-09-01

    A new arbitrary Polynomial Chaos (aPC) method is presented for moderately high-dimensional problems characterised by limited input data availability. The proposed methodology improves the algorithm of aPC and extends the method, that was previously only introduced as tensor product expansion, to moderately high-dimensional stochastic problems. The fundamental idea of aPC is to use the statistical moments of the input random variables to develop the polynomial chaos expansion. This approach provides the possibility to propagate continuous or discrete probability density functions and also histograms (data sets) as long as their moments exist, are finite and the determinant of the moment matrix is strictly positive. For cases with limited data availability, this approach avoids bias and fitting errors caused by wrong assumptions. In this work, an alternative way to calculate the aPC is suggested, which provides the optimal polynomials, Gaussian quadrature collocation points and weights from the moments using only a handful of matrix operations on the Hankel matrix of moments. It can therefore be implemented without requiring prior knowledge about statistical data analysis or a detailed understanding of the mathematics of polynomial chaos expansions. The extension to more input variables suggested in this work, is an anisotropic and adaptive version of Smolyak's algorithm that is solely based on the moments of the input probability distributions. It is referred to as SAMBA (PC), which is short for Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos. It is illustrated that for moderately high-dimensional problems (up to 20 different input variables or histograms) SAMBA can significantly simplify the calculation of sparse Gaussian quadrature rules. SAMBA's efficiency for multivariate functions with regard to data availability is further demonstrated by analysing higher order convergence and accuracy for a set of nonlinear test functions with 2, 5 and 10 different input distributions or histograms.
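
    The "handful of matrix operations" can be sketched as follows (our own minimal version of the classical moments-to-quadrature construction; the anisotropic Smolyak extension is omitted): Cholesky-factor the Hankel matrix of moments, read off the three-term recurrence coefficients, and diagonalize the resulting Jacobi matrix.

      import numpy as np

      def quadrature_from_moments(mom):
          """Gaussian quadrature nodes and weights from the raw moments
          m_0..m_2n of an input distribution, data set, or histogram."""
          mom = np.asarray(mom, float)
          n = (len(mom) - 1) // 2
          H = np.array([[mom[i + j] for j in range(n + 1)] for i in range(n + 1)])
          R = np.linalg.cholesky(H).T           # requires det(H) > 0, as above
          d = np.diag(R)
          alpha = np.empty(n)
          alpha[0] = R[0, 1] / d[0]
          for k in range(1, n):
              alpha[k] = R[k, k + 1] / d[k] - R[k - 1, k] / d[k - 1]
          sqrt_beta = d[1:n] / d[:n - 1]
          J = np.diag(alpha) + np.diag(sqrt_beta, 1) + np.diag(sqrt_beta, -1)
          nodes, vecs = np.linalg.eigh(J)
          return nodes, mom[0] * vecs[0, :] ** 2

      # standard-normal moments m_0..m_6 reproduce 3-point Gauss-Hermite:
      # nodes ~ (-sqrt(3), 0, sqrt(3)), weights ~ (1/6, 2/3, 1/6)
      print(quadrature_from_moments([1, 0, 1, 0, 3, 0, 15]))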

  1. Exact probability distribution function for the volatility of cumulative production

    NASA Astrophysics Data System (ADS)

    Zadourian, Rubina; Klümper, Andreas

    2018-04-01

    In this paper we study the volatility, and its probability distribution function, of cumulative production based on the experience curve hypothesis. This work presents a generalization of the study of volatility in Lafond et al. (2017), which addressed the effects of normally distributed noise in the production process. Due to its wide applicability in industrial and technological activities, we present here the mathematical foundation for an arbitrary distribution function of the process, which we expect will pave the way for future research on forecasting of the production process.

  2. Distribution of the Determinant of the Sample Correlation Matrix: Monte Carlo Type One Error Rates.

    ERIC Educational Resources Information Center

    Reddon, John R.; And Others

    1985-01-01

    Computer sampling from a multivariate normal spherical population was used to evaluate the type one error rates for a test of sphericity based on the distribution of the determinant of the sample correlation matrix. (Author/LMO)

  3. Statistics of intensity in adaptive-optics images and their usefulness for detection and photometry of exoplanets.

    PubMed

    Gladysz, Szymon; Yaitskova, Natalia; Christou, Julian C

    2010-11-01

    This paper is an introduction to the problem of modeling the probability density function of adaptive-optics speckle. We show that the modified Rician distribution cannot describe the statistics of light on axis. A dual solution is proposed: the modified Rician distribution for off-axis speckle and a gamma-based distribution for the core of the point spread function. From these two distributions we derive optimal statistical discriminators between real sources and quasi-static speckles. In the second part of the paper, the morphological difference between the two probability density functions is used to constrain a one-dimensional, "blind," iterative deconvolution at the position of an exoplanet. Separation of the probability density functions of signal and speckle yields accurate differential photometry in our simulations of the SPHERE planet finder instrument.
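
    For concreteness, the off-axis speckle density referred to here has the standard modified-Rician form (a sketch in a common parameterization, where Is is the deterministic part of the point spread function and Ic the speckle halo intensity):

      import numpy as np
      from scipy.special import i0e

      def modified_rician_pdf(I, Is, Ic):
          """p(I) = (1/Ic) * exp(-(I + Is)/Ic) * I0(2*sqrt(I*Is)/Ic); the
          exponentially scaled Bessel function i0e keeps large arguments stable."""
          I = np.asarray(I, float)
          x = 2.0 * np.sqrt(I * Is) / Ic
          return np.exp(x - (I + Is) / Ic) * i0e(x) / Ic

      print(modified_rician_pdf(np.linspace(0.0, 10.0, 6), Is=2.0, Ic=1.0))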

  4. Small-Scale Spatio-Temporal Distribution of Bactrocera minax (Enderlein) (Diptera: Tephritidae) Using Probability Kriging.

    PubMed

    Wang, S Q; Zhang, H Y; Li, Z L

    2016-10-01

    Understanding the spatio-temporal distribution of pests in orchards can provide important information that could be used to design monitoring schemes and establish better means of pest control. In this study, the spatial and temporal distribution of Bactrocera minax (Enderlein) (Diptera: Tephritidae) was assessed, and activity trends were evaluated, by using probability kriging. Adults of B. minax were captured over two successive occurrence periods in a small-scale citrus orchard by using food-bait traps, which were placed both inside and outside the orchard. The weekly spatial distribution of B. minax within the orchard and the adjacent woods was examined using semivariogram parameters. Edge concentration was observed during most weeks of adult occurrence, and the adult population aggregated with high probability within a band less than 100 m wide on either side of the boundary between the orchard and the woods. The sequential probability kriged maps showed that the adults were found with higher probability in the marginal zone, especially in the early and peak stages. The feeding, ovipositing, and mating behaviors of B. minax are possible explanations for these spatio-temporal patterns. Therefore, the spatial arrangement of traps or spraying spots, and their distance to the forest edge, should be considered to enhance control of B. minax in small-scale orchards.

  5. Assessing socioeconomic inequalities of hypertension among women in Indonesia's major cities.

    PubMed

    Christiani, Y; Byles, J E; Tavener, M; Dugdale, P

    2015-11-01

    Although hypertension has been recognized as one of the major public health problems, few studies address economic inequality of hypertension among urban women in developing countries. To assess this issue, we analysed data for 1400 women from four of Indonesia's major cities: Jakarta, Surabaya, Medan and Bandung. Women were aged ⩾15 years (mean age 35.4 years), and were participants in the 2007/2008 Indonesia Family Life Survey. The prevalence of hypertension measured by digital sphygmomanometer among this population was 31%. Using a multivariable logistic regression model, socioeconomic disadvantage (based on household assets and characteristics) as well as age, body mass index and economic conditions were significantly associated with hypertension (P<0.05). Applying the Fairlie decomposition model, results showed that 14% of the inequality between less and more economically advantaged groups could be accounted for by the distribution of socioeconomic characteristics. Education was the strongest contributor to inequality, with lower education levels increasing the predicted probability of hypertension among less economically advantaged groups. This work highlights the importance of socioeconomic inequality in the development of hypertension, and particularly the effects of education level.

  6. HMO selection and Medicare costs: Bayesian MCMC estimation of a robust panel data tobit model with survival.

    PubMed

    Hamilton, B H

    1999-08-01

    The fraction of US Medicare recipients enrolled in health maintenance organizations (HMOs) has increased substantially over the past 10 years. However, the impact of HMOs on health care costs is still hotly debated. In particular, it is argued that HMOs achieve cost reduction through 'cream-skimming' and enrolling relatively healthy patients. This paper develops a Bayesian panel data tobit model of HMO selection and Medicare expenditures for recent US retirees that accounts for mortality over the course of the panel. The model is estimated using Markov Chain Monte Carlo (MCMC) simulation methods, and is novel in that a multivariate t-link is used in place of normality to allow for the heavy-tailed distributions often found in health care expenditure data. The findings indicate that HMOs select individuals who are less likely to have positive health care expenditures prior to enrollment. However, there is no evidence that HMOs disenrol high cost patients. The results also indicate the importance of accounting for survival over the panel, since high mortality probabilities are associated with higher health care expenditures in the last year of life.

  7. Salmonella enterica serovar Enteritidis phage type 4 outbreak associated with eggs in a large prison, London 2009: an investigation using cohort and case/non-case study methodology.

    PubMed

    Davies, A R; Ruggles, R; Young, Y; Clark, H; Reddell, P; Verlander, N Q; Arnold, A; Maguire, H

    2013-05-01

    In September 2009, an outbreak of Salmonella enterica serovar Enteritidis affected 327 of 1419 inmates at a London prison. We applied a cohort design using aggregated data from the kitchen about portions of food distributed, aligned this with individual food histories from 124 cases (18 confirmed, 106 probable) and deduced the exposures of those remaining well. Results showed that prisoners eating egg cress rolls were 26 times more likely to be ill [risk ratio 25.7, 95% confidence interval (CI) 15.5-42.8, P<0.001]. In a case/non-case multivariable analysis the adjusted odds ratio for egg cress rolls was 41.1 (95% CI 10.3-249.7, P<0.001). The epidemiological investigation was strengthened by environmental and microbiological investigations. This paper outlines an approach to investigations in large complex settings where aggregate data for exposures may be available, and led to the development of guidelines for the management of future gastrointestinal outbreaks in prison settings.

  8. Atomic-scale phase composition through multivariate statistical analysis of atom probe tomography data.

    PubMed

    Keenan, Michael R; Smentkowski, Vincent S; Ulfig, Robert M; Oltman, Edward; Larson, David J; Kelly, Thomas F

    2011-06-01

    We demonstrate for the first time that multivariate statistical analysis techniques can be applied to atom probe tomography data to estimate the chemical composition of a sample at the full spatial resolution of the atom probe in three dimensions. Whereas the raw atom probe data provide the specific identity of an atom at a precise location, the multivariate results can be interpreted in terms of the probabilities that an atom representing a particular chemical phase is situated there. When aggregated to the size scale of a single atom (∼0.2 nm), atom probe spectral-image datasets are huge and extremely sparse. In fact, the average spectrum will have somewhat less than one total count per spectrum due to imperfect detection efficiency. These conditions, under which the variance in the data is completely dominated by counting noise, test the limits of multivariate analysis, and an extensive discussion of how to extract the chemical information is presented. Efficient numerical approaches to performing principal component analysis (PCA) on these datasets, which may number hundreds of millions of individual spectra, are put forward, and it is shown that PCA can be computed in a few seconds on a typical laptop computer.
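
    The sparse, counting-noise-dominated setting can be mimicked with synthetic data to see why PCA can still recover phase information (plain PCA here is a simple stand-in for the noise-aware analysis the authors discuss; all sizes and rates are made up):

      import numpy as np
      from sklearn.decomposition import PCA

      # synthetic spectral image: n voxels x m mass bins, two phases that differ
      # in one characteristic peak, and well under one count per voxel on average
      rng = np.random.default_rng(0)
      n, m = 20_000, 100
      phase = rng.random(n) < 0.3                       # 30% of voxels in phase B
      rates_a = np.full(m, 0.002); rates_a[10] = 0.5    # phase A peak at bin 10
      rates_b = np.full(m, 0.002); rates_b[60] = 0.5    # phase B peak at bin 60
      counts = rng.poisson(np.where(phase[:, None], rates_b, rates_a))

      scores = PCA(n_components=2).fit_transform(counts.astype(float))
      # the leading component correlates with the true phase label despite the noise
      print(abs(np.corrcoef(scores[:, 0], phase.astype(float))[0, 1]))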

  9. Copula-based prediction of economic movements

    NASA Astrophysics Data System (ADS)

    García, J. E.; González-López, V. A.; Hirsh, I. D.

    2016-06-01

    In this paper we model the discretized returns of two paired time series, the BM&FBOVESPA Dividend Index and the BM&FBOVESPA Public Utilities Index, using multivariate Markov models. The discretization uses three categories: high losses, high profits, and the complementary periods of the series. In technical terms, the maximal memory that can be considered for a Markov model can be derived from the size of the alphabet and of the dataset. The number of parameters needed to specify a discrete multivariate Markov chain grows exponentially with the order and dimension of the chain, and in this case the database is not large enough for consistent estimation of the model. We apply a strategy to estimate a multivariate process with an order greater than that achievable using standard procedures. The new strategy consists of obtaining a partition of the state space, constructed by combining the partitions corresponding to the two marginal processes with the partition corresponding to the multivariate Markov chain. To estimate the transition probabilities, all the partitions are linked using a copula. In our application this strategy provides a significant improvement in movement predictions.
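
    The abstract does not name the copula family, so as an assumed illustration the sketch below links two three-category marginals with a Gaussian copula and recovers joint cell probabilities by inclusion-exclusion on latent-normal rectangles:

    ```python
    import numpy as np
    from scipy.stats import norm, multivariate_normal

    def gaussian_copula_joint(p_x, p_y, rho):
        """Joint cell probabilities for two discrete (ordered) marginals
        linked by a Gaussian copula with latent correlation rho."""
        # Cut points on the latent normal scale; clip to avoid +/- infinity.
        cum_x = np.clip(np.cumsum(np.r_[0.0, p_x]), 1e-10, 1 - 1e-10)
        cum_y = np.clip(np.cumsum(np.r_[0.0, p_y]), 1e-10, 1 - 1e-10)
        cx, cy = norm.ppf(cum_x), norm.ppf(cum_y)
        mvn = multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]])
        joint = np.empty((len(p_x), len(p_y)))
        for i in range(len(p_x)):
            for j in range(len(p_y)):
                # Rectangle probability via inclusion-exclusion on the CDF.
                joint[i, j] = (mvn.cdf([cx[i + 1], cy[j + 1]])
                               - mvn.cdf([cx[i], cy[j + 1]])
                               - mvn.cdf([cx[i + 1], cy[j]])
                               + mvn.cdf([cx[i], cy[j]]))
        return joint

    # Three categories per series (high losses / middle / high profits):
    print(gaussian_copula_joint([0.2, 0.6, 0.2], [0.25, 0.5, 0.25], rho=0.4))
    ```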

  10. Spatial Distribution of a Large Herbivore Community at Waterholes: An Assessment of Its Stability over Years in Hwange National Park, Zimbabwe.

    PubMed

    Chamaillé-Jammes, Simon; Charbonnel, Anaïs; Dray, Stéphane; Madzikanda, Hillary; Fritz, Hervé

    2016-01-01

    The spatial structuring of populations or communities is an important driver of their functioning and their influence on ecosystems. Identifying the (in)stability of the spatial structure of populations is a first step towards understanding the underlying causes of these structures. Here we studied the relative importance of spatial vs. interannual variability in explaining the patterns of abundance of a large herbivore community (8 species) at waterholes in Hwange National Park (Zimbabwe). We analyzed census data collected over 13 years using multivariate methods. Our results showed that variability in the census data was mostly explained by the spatial structure of the community, as some waterholes had consistently greater herbivore abundance than others. Some temporal variability, probably linked to Park-scale migration dependent on annual rainfall, was nonetheless noticeable. Once this was accounted for, little temporal variability remained to be explained, suggesting that other factors affecting herbivore abundance over time had a negligible effect at the scale of the study. The extent of spatial and temporal variability in census data was also measured for each species. This study could help in projecting the consequences of surface water management, and more generally presents a methodological framework to simultaneously address the relative importance of spatial vs. temporal effects in driving the distribution of organisms across landscapes.

  11. A classification of freshwater Louisiana lakes based on water quality and user perception data.

    PubMed

    Burden, D G; Malone, R F

    1987-09-01

    An index system developed for Louisiana lakes was based on correlations between measurable water quality parameters and perceived lake quality. Supporting data were provided by an extensive monitoring program of 30 lakes coordinated with opinion surveys undertaken during summer 1984. Lakes included in the survey ranged from 4 to 735 km(2) in surface area with mean depths ranging from 0.5 to 8.0 m. Water quality data indicated most of these lakes are eutrophic, although many have productive fisheries and are considered recreational assets. Perception ratings of fishing quality and its associated water quality were obtained by distributing approximately 1200 surveys to Louisiana Bass Club Association members. The ability of Secchi disc transparency, total organic carbon, total Kjeldahl nitrogen, total phosphorus, and chlorophyll a to discriminate between perception classes was examined using probability distributions and multivariate analyses. Secchi disc and total organic carbon best reflected perceived lake conditions; however, these parameters did not provide the discrimination necessary for developing a quantitative risk assessment of lake trophic state. Consequently, an interim lakes index system was developed based on total organic carbon and perceived lake conditions. The developed index system will aid State officials in interpreting and evaluating regularly collected lake quality data, recognizing potential problem areas, and identifying proper management policies for protecting fisheries usage within the State.

  12. Spatial Distribution of a Large Herbivore Community at Waterholes: An Assessment of Its Stability over Years in Hwange National Park, Zimbabwe

    PubMed Central

    Chamaillé-Jammes, Simon; Charbonnel, Anaïs; Dray, Stéphane; Madzikanda, Hillary; Fritz, Hervé

    2016-01-01

    The spatial structuring of populations or communities is an important driver of their functioning and their influence on ecosystems. Identifying the (in)stability of the spatial structure of populations is a first step towards understanding the underlying causes of these structures. Here we studied the relative importance of spatial vs. interannual variability in explaining the patterns of abundance of a large herbivore community (8 species) at waterholes in Hwange National Park (Zimbabwe). We analyzed census data collected over 13 years using multivariate methods. Our results showed that variability in the census data was mostly explained by the spatial structure of the community, as some waterholes had consistently greater herbivore abundance than others. Some temporal variability, probably linked to Park-scale migration dependent on annual rainfall, was nonetheless noticeable. Once this was accounted for, little temporal variability remained to be explained, suggesting that other factors affecting herbivore abundance over time had a negligible effect at the scale of the study. The extent of spatial and temporal variability in census data was also measured for each species. This study could help in projecting the consequences of surface water management, and more generally presents a methodological framework to simultaneously address the relative importance of spatial vs. temporal effects in driving the distribution of organisms across landscapes. PMID:27074044

  13. A surrogate-based sensitivity quantification and Bayesian inversion of a regional groundwater flow model

    NASA Astrophysics Data System (ADS)

    Chen, Mingjie; Izady, Azizallah; Abdalla, Osman A.; Amerjeed, Mansoor

    2018-02-01

    Bayesian inference using Markov Chain Monte Carlo (MCMC) provides an explicit framework for stochastic calibration of hydrogeologic models accounting for uncertainties; however, the MCMC sampling entails a large number of model calls, and could easily become computationally unwieldy if the high-fidelity hydrogeologic model simulation is time consuming. This study proposes a surrogate-based Bayesian framework to address this notorious issue, and illustrates the methodology by inverse modeling a regional MODFLOW model. The high-fidelity groundwater model is approximated by a fast statistical model using the Bagging Multivariate Adaptive Regression Spline (BMARS) algorithm, so that the MCMC sampling can be performed efficiently. In this study, the MODFLOW model is developed to simulate the groundwater flow in an arid region of Oman consisting of mountain-coast aquifers, and is used to run representative simulations to generate a training dataset for BMARS model construction. A BMARS-based Sobol' method is also employed to efficiently calculate input parameter sensitivities, which are used to evaluate and rank their importance for the groundwater flow system. According to the sensitivity analysis, insensitive parameters are screened out of the Bayesian inversion of the MODFLOW model, further saving computing effort. The posterior probability distribution of input parameters is efficiently inferred from the prescribed prior distribution using observed head data, demonstrating that the presented BMARS-based Bayesian framework is an efficient tool to reduce parameter uncertainties of a groundwater system.
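
    The workflow, training a cheap surrogate on representative runs of the expensive model and then running MCMC against the surrogate, can be sketched as follows. A bagged random forest stands in for BMARS here, and the "high-fidelity model", prior box, observation, and noise level are all hypothetical:

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor  # bagged stand-in for BMARS

    rng = np.random.default_rng(1)

    def expensive_model(theta):             # hypothetical high-fidelity response
        return np.sin(theta[0]) + 0.5 * theta[1] ** 2

    # 1. Train the surrogate on representative runs of the expensive model.
    thetas = rng.uniform(-2, 2, size=(200, 2))
    ys = np.array([expensive_model(t) for t in thetas])
    surrogate = RandomForestRegressor(n_estimators=200, random_state=0).fit(thetas, ys)

    # 2. Metropolis sampling of the posterior using only cheap surrogate calls.
    obs, sigma = 0.7, 0.1                   # hypothetical observation, noise sd

    def log_post(theta):
        if np.any(np.abs(theta) > 2):       # uniform prior over the training box
            return -np.inf
        resid = obs - surrogate.predict(theta.reshape(1, -1))[0]
        return -0.5 * (resid / sigma) ** 2

    theta, lp = np.zeros(2), log_post(np.zeros(2))
    samples = []
    for _ in range(5000):
        prop = theta + rng.normal(0.0, 0.2, size=2)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis accept/reject
            theta, lp = prop, lp_prop
        samples.append(theta.copy())
    ```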

  14. The Detection of Signals in Impulsive Noise.

    DTIC Science & Technology

    1983-06-01

    Approved for Public Release; Distribution Unlimited. … has a symmetric distribution, sgn(x_i) will be -1 with probability 1/2 and +1 with probability 1/2. Considering the sum of observations as a binomial …
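
    The surviving fragment describes the sign test: under symmetric zero-median noise each sign is +/-1 with probability 1/2, so the count of positive observations is Binomial(n, 1/2). A minimal sketch of that test follows; the heavy-tailed sample is an arbitrary stand-in for an impulsive-noise channel:

    ```python
    import numpy as np
    from scipy.stats import binomtest

    rng = np.random.default_rng(0)
    x = rng.standard_t(df=2, size=100) + 0.3   # impulsive noise plus a shift

    # Under H0 (symmetric noise, zero median) each sign is +1 with prob 1/2,
    # so the number of positive observations is Binomial(n, 1/2).
    n_pos = int(np.sum(np.sign(x) > 0))
    print(binomtest(n_pos, n=len(x), p=0.5))
    ```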

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Diwaker, E-mail: diwakerphysics@gmail.com; Chakraborty, Aniruddha

    The Smoluchowski equation with a time-dependent sink term is solved exactly. In this method, knowing the probability distribution P(0, s) at the origin, allows deriving the probability distribution P(x, s) at all positions. Exact solutions of the Smoluchowski equation are also provided in different cases where the sink term has linear, constant, inverse, and exponential variation in time.

  16. A least squares approach to estimating the probability distribution of unobserved data in multiphoton microscopy

    NASA Astrophysics Data System (ADS)

    Salama, Paul

    2008-02-01

    Multi-photon microscopy has provided biologists with unprecedented opportunities for high-resolution imaging deep into tissues. Unfortunately, deep-tissue multi-photon microscopy images are in general noisy, since they are acquired at low photon counts. To aid in the analysis and segmentation of such images it is sometimes necessary to first enhance the acquired images. One way to enhance an image is to find the maximum a posteriori (MAP) estimate of each pixel comprising the image, which is achieved by finding a constrained least-squares estimate of the unknown distribution. In arriving at the distribution it is assumed that the noise is Poisson distributed and that the true but unknown pixel values follow a probability mass function over a finite set of non-negative values; since the observed data also take finite values because of low photon counts, the sum of the probabilities of the observed pixel values (obtained from the histogram of the acquired pixel values) is less than one. Experimental results demonstrate that it is possible to closely estimate the unknown probability mass function under these assumptions.
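
    A constrained least-squares estimate of this kind, with probabilities nonnegative and summing to less than one, can be posed directly as a small optimization problem; the toy forward model and numbers below are assumptions for illustration:

    ```python
    import numpy as np
    from scipy.optimize import minimize

    # Hypothetical observed histogram of low-count pixel values (normalized);
    # it sums to less than one because some values go unobserved.
    obs = np.array([0.30, 0.25, 0.15, 0.10, 0.05])    # sums to 0.85
    A = np.eye(5)                                     # toy forward model

    def objective(p):                                 # least-squares misfit
        return np.sum((A @ p - obs) ** 2)

    res = minimize(
        objective,
        x0=np.full(5, 0.2),
        bounds=[(0, 1)] * 5,                          # probabilities >= 0
        constraints=[{"type": "ineq", "fun": lambda p: 1.0 - p.sum()}],  # sum <= 1
    )
    print(res.x, res.x.sum())
    ```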

  17. Probability distribution for the Gaussian curvature of the zero level surface of a random function

    NASA Astrophysics Data System (ADS)

    Hannay, J. H.

    2018-04-01

    A rather natural construction for a smooth random surface in space is the level surface of value zero, or ‘nodal’ surface f(x,y,z)  =  0, of a (real) random function f; the interface between positive and negative regions of the function. A physically significant local attribute at a point of a curved surface is its Gaussian curvature (the product of its principal curvatures) because, when integrated over the surface it gives the Euler characteristic. Here the probability distribution for the Gaussian curvature at a random point on the nodal surface f  =  0 is calculated for a statistically homogeneous (‘stationary’) and isotropic zero mean Gaussian random function f. Capitalizing on the isotropy, a ‘fixer’ device for axes supplies the probability distribution directly as a multiple integral. Its evaluation yields an explicit algebraic function with a simple average. Indeed, this average Gaussian curvature has long been known. For a non-zero level surface instead of the nodal one, the probability distribution is not fully tractable, but is supplied as an integral expression.
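
    The link the abstract relies on is the Gauss-Bonnet theorem:

    ```latex
    % Gauss-Bonnet: the Gaussian curvature K = \kappa_1 \kappa_2, integrated
    % over a closed surface S, gives the Euler characteristic \chi(S):
    \[
    \int_{S} K \, dA \;=\; 2\pi\,\chi(S).
    \]
    ```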

  18. A mechanism producing power law etc. distributions

    NASA Astrophysics Data System (ADS)

    Li, Heling; Shen, Hongjun; Yang, Bin

    2017-07-01

    Power-law distributions play an increasingly important role in the study of complex systems. Building on the idea of incomplete statistics, which reflects the intractability of complex systems, three different exponential factors are introduced into the equations for the normalization condition, the statistical average, and the Shannon entropy. Probability distribution functions of exponential form, of power-law form, and of the product form between a power function and an exponential function are then derived from the Shannon entropy and the maximum entropy principle. This shows that the maximum entropy principle can entirely replace the equal-probability hypothesis. Since power-law distributions and distributions of the product form between a power function and an exponential function cannot be derived from the equal-probability hypothesis but can be derived with the aid of the maximum entropy principle, it can be concluded that the maximum entropy principle is the more fundamental principle: it embodies concepts more extensively and reveals the basic laws governing the motion of objects more deeply. At the same time, this principle reveals the intrinsic links between Nature and the various objects of human society, and the principles they all obey.
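
    To make the claim concrete, here is the standard maximum-entropy route to a power law under an assumed constraint on the logarithmic mean (the paper's exponential factors are more general than this sketch):

    ```latex
    % Maximize Shannon entropy S = -\int p \ln p \, dx subject to
    % normalization and a constraint on the logarithmic mean
    % (an assumed constraint set, chosen to illustrate the claim):
    \[
    \mathcal{L} = -\int p \ln p \, dx
      - \lambda_0 \Big(\int p \, dx - 1\Big)
      - \lambda_1 \Big(\int p \ln x \, dx - \langle \ln x \rangle\Big).
    \]
    % Setting the functional derivative \delta\mathcal{L}/\delta p to zero gives
    \[
    p(x) \propto e^{-\lambda_1 \ln x} = x^{-\lambda_1},
    \]
    % a power law. A constraint on \langle x \rangle instead yields an
    % exponential; imposing both gives x^{-\lambda_1} e^{-\lambda_2 x}.
    ```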

  19. On Orbital Elements of Extrasolar Planetary Candidates and Spectroscopic Binaries

    NASA Technical Reports Server (NTRS)

    Stepinski, T. F.; Black, D. C.

    2001-01-01

    We estimate probability densities of orbital elements, periods and eccentricities, for the population of extrasolar planetary candidates (EPC) and, separately, for the population of spectroscopic binaries (SB) with solar-type primaries. We construct empirical cumulative distribution functions (CDFs) in order to infer probability distribution functions (PDFs) for orbital periods and eccentricities. We also derive a joint probability density for period-eccentricity pairs in each population. Comparison of the respective distributions reveals that, in the context of orbital elements, the EPC and SB populations are indistinguishable from each other to a high degree of statistical significance. Probability densities of orbital periods in both populations have a P^-1 functional form, whereas the PDFs of eccentricities can best be characterized as a Gaussian with a mean of about 0.35 and standard deviation of about 0.2, turning into a flat distribution at small values of eccentricity. These remarkable similarities between EPC and SB must be taken into account by theories aimed at explaining the origin of extrasolar planetary candidates, and constitute an important clue as to their ultimate nature.
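
    Constructing an empirical CDF, the first step the abstract describes, is straightforward; the sample below is synthetic, not the EPC or SB data:

    ```python
    import numpy as np

    def ecdf(values):
        """Empirical CDF: at each sorted value x_(i), F(x_(i)) = i / n."""
        x = np.sort(values)
        return x, np.arange(1, len(x) + 1) / len(x)

    # Hypothetical eccentricities; a smoothed derivative of the ECDF
    # (or a fitted parametric form) then serves as the inferred PDF.
    e = np.random.default_rng(0).beta(2, 4, size=300)
    x, F = ecdf(e)
    ```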

  20. Steady-state distributions of probability fluxes on complex networks

    NASA Astrophysics Data System (ADS)

    Chełminiak, Przemysław; Kurzyński, Michał

    2017-02-01

    We consider a simple model of Markovian stochastic dynamics on complex networks to examine the statistical properties of the probability fluxes. An additional transition, called hereafter a gate, powered by an external constant force, breaks detailed balance in the network. We argue, using a theoretical approach and numerical simulations, that the stationary distributions of the probability fluxes emerging under such conditions converge to a Gaussian distribution. By virtue of the stationary fluctuation theorem, its standard deviation depends directly on the square root of the mean flux. In turn, the nonlinear relation between the mean flux and the external force, which provides the key result of the present study, allows us to calculate the two parameters that entirely characterize the Gaussian distribution of the probability fluxes, both close to and far from the equilibrium state. Other effects that modify these parameters, such as the addition of shortcuts to the tree-like network, the extension and configuration of the gate, and a change in the network size, are also studied by means of computer simulations and discussed in terms of the rigorous theoretical predictions.
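
    A toy version of the setup, a random walk on a ring with one force-biased "gate" edge, shows the windowed net flux settling into a bell-shaped distribution; all sizes and probabilities below are arbitrary choices, not the paper's model:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    N, n_windows, window = 20, 1000, 200

    pos, fluxes = 0, []
    for _ in range(n_windows):
        net = 0
        for _ in range(window):
            # Forward bias 0.7 only on the gate edge 0 -> 1; 0.5 elsewhere,
            # so detailed balance is broken and a mean current flows.
            forward = rng.random() < (0.7 if pos == 0 else 0.5)
            if forward:
                if pos == 0:
                    net += 1            # crossing the gate forwards
                pos = (pos + 1) % N
            else:
                if pos == 1:
                    net -= 1            # crossing the gate backwards
                pos = (pos - 1) % N
        fluxes.append(net)

    fluxes = np.array(fluxes)
    # Mean windowed flux and its spread; the stationary fluctuation theorem
    # ties the standard deviation to the square root of the mean flux.
    print(fluxes.mean(), fluxes.std())
    ```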
