Sample records for log-normal probability distribution

  1. Applying the log-normal distribution to target detection

    NASA Astrophysics Data System (ADS)

    Holst, Gerald C.

    1992-09-01

    Holst and Pickard experimentally determined that MRT responses tend to follow a log-normal distribution. The log normal distribution appeared reasonable because nearly all visual psychological data is plotted on a logarithmic scale. It has the additional advantage that it is bounded to positive values; an important consideration since probability of detection is often plotted in linear coordinates. Review of published data suggests that the log-normal distribution may have universal applicability. Specifically, the log-normal distribution obtained from MRT tests appears to fit the target transfer function and the probability of detection of rectangular targets.

  2. Estimating sales and sales market share from sales rank data for consumer appliances

    NASA Astrophysics Data System (ADS)

    Touzani, Samir; Van Buskirk, Robert

    2016-06-01

    Our motivation in this work is to find an adequate probability distribution to fit sales volumes of different appliances. This distribution allows for the translation of sales rank into sales volume. This paper shows that the log-normal distribution and specifically the truncated version are well suited for this purpose. We demonstrate that using sales proxies derived from a calibrated truncated log-normal distribution function can be used to produce realistic estimates of market average product prices, and product attributes. We show that the market averages calculated with the sales proxies derived from the calibrated, truncated log-normal distribution provide better market average estimates than sales proxies estimated with simpler distribution functions.

  3. Probability density function of non-reactive solute concentration in heterogeneous porous formations.

    PubMed

    Bellin, Alberto; Tonina, Daniele

    2007-10-30

    Available models of solute transport in heterogeneous formations lack in providing complete characterization of the predicted concentration. This is a serious drawback especially in risk analysis where confidence intervals and probability of exceeding threshold values are required. Our contribution to fill this gap of knowledge is a probability distribution model for the local concentration of conservative tracers migrating in heterogeneous aquifers. Our model accounts for dilution, mechanical mixing within the sampling volume and spreading due to formation heterogeneity. It is developed by modeling local concentration dynamics with an Ito Stochastic Differential Equation (SDE) that under the hypothesis of statistical stationarity leads to the Beta probability distribution function (pdf) for the solute concentration. This model shows large flexibility in capturing the smoothing effect of the sampling volume and the associated reduction of the probability of exceeding large concentrations. Furthermore, it is fully characterized by the first two moments of the solute concentration, and these are the same pieces of information required for standard geostatistical techniques employing Normal or Log-Normal distributions. Additionally, we show that in the absence of pore-scale dispersion and for point concentrations the pdf model converges to the binary distribution of [Dagan, G., 1982. Stochastic modeling of groundwater flow by unconditional and conditional probabilities, 2, The solute transport. Water Resour. Res. 18 (4), 835-848.], while it approaches the Normal distribution for sampling volumes much larger than the characteristic scale of the aquifer heterogeneity. Furthermore, we demonstrate that the same model with the spatial moments replacing the statistical moments can be applied to estimate the proportion of the plume volume where solute concentrations are above or below critical thresholds. Application of this model to point and vertically averaged bromide concentrations from the first Cape Cod tracer test and to a set of numerical simulations confirms the above findings and for the first time it shows the superiority of the Beta model to both Normal and Log-Normal models in interpreting field data. Furthermore, we show that assuming a-priori that local concentrations are normally or log-normally distributed may result in a severe underestimate of the probability of exceeding large concentrations.

  4. Standardized likelihood ratio test for comparing several log-normal means and confidence interval for the common mean.

    PubMed

    Krishnamoorthy, K; Oral, Evrim

    2017-12-01

    Standardized likelihood ratio test (SLRT) for testing the equality of means of several log-normal distributions is proposed. The properties of the SLRT and an available modified likelihood ratio test (MLRT) and a generalized variable (GV) test are evaluated by Monte Carlo simulation and compared. Evaluation studies indicate that the SLRT is accurate even for small samples, whereas the MLRT could be quite liberal for some parameter values, and the GV test is in general conservative and less powerful than the SLRT. Furthermore, a closed-form approximate confidence interval for the common mean of several log-normal distributions is developed using the method of variance estimate recovery, and compared with the generalized confidence interval with respect to coverage probabilities and precision. Simulation studies indicate that the proposed confidence interval is accurate and better than the generalized confidence interval in terms of coverage probabilities. The methods are illustrated using two examples.

  5. Size distribution of submarine landslides along the U.S. Atlantic margin

    USGS Publications Warehouse

    Chaytor, J.D.; ten Brink, Uri S.; Solow, A.R.; Andrews, B.D.

    2009-01-01

    Assessment of the probability for destructive landslide-generated tsunamis depends on the knowledge of the number, size, and frequency of large submarine landslides. This paper investigates the size distribution of submarine landslides along the U.S. Atlantic continental slope and rise using the size of the landslide source regions (landslide failure scars). Landslide scars along the margin identified in a detailed bathymetric Digital Elevation Model (DEM) have areas that range between 0.89??km2 and 2410??km2 and volumes between 0.002??km3 and 179??km3. The area to volume relationship of these failure scars is almost linear (inverse power-law exponent close to 1), suggesting a fairly uniform failure thickness of a few 10s of meters in each event, with only rare, deep excavating landslides. The cumulative volume distribution of the failure scars is very well described by a log-normal distribution rather than by an inverse power-law, the most commonly used distribution for both subaerial and submarine landslides. A log-normal distribution centered on a volume of 0.86??km3 may indicate that landslides preferentially mobilize a moderate amount of material (on the order of 1??km3), rather than large landslides or very small ones. Alternatively, the log-normal distribution may reflect an inverse power law distribution modified by a size-dependent probability of observing landslide scars in the bathymetry data. If the latter is the case, an inverse power-law distribution with an exponent of 1.3 ?? 0.3, modified by a size-dependent conditional probability of identifying more failure scars with increasing landslide size, fits the observed size distribution. This exponent value is similar to the predicted exponent of 1.2 ?? 0.3 for subaerial landslides in unconsolidated material. Both the log-normal and modified inverse power-law distributions of the observed failure scar volumes suggest that large landslides, which have the greatest potential to generate damaging tsunamis, occur infrequently along the margin. ?? 2008 Elsevier B.V.

  6. Statistical Characterization of the Mechanical Parameters of Intact Rock Under Triaxial Compression: An Experimental Proof of the Jinping Marble

    NASA Astrophysics Data System (ADS)

    Jiang, Quan; Zhong, Shan; Cui, Jie; Feng, Xia-Ting; Song, Leibo

    2016-12-01

    We investigated the statistical characteristics and probability distribution of the mechanical parameters of natural rock using triaxial compression tests. Twenty cores of Jinping marble were tested under each different levels of confining stress (i.e., 5, 10, 20, 30, and 40 MPa). From these full stress-strain data, we summarized the numerical characteristics and determined the probability distribution form of several important mechanical parameters, including deformational parameters, characteristic strength, characteristic strains, and failure angle. The statistical proofs relating to the mechanical parameters of rock presented new information about the marble's probabilistic distribution characteristics. The normal and log-normal distributions were appropriate for describing random strengths of rock; the coefficients of variation of the peak strengths had no relationship to the confining stress; the only acceptable random distribution for both Young's elastic modulus and Poisson's ratio was the log-normal function; and the cohesive strength had a different probability distribution pattern than the frictional angle. The triaxial tests and statistical analysis also provided experimental evidence for deciding the minimum reliable number of experimental sample and for picking appropriate parameter distributions to use in reliability calculations for rock engineering.

  7. Energetics and Birth Rates of Supernova Remnants in the Large Magellanic Cloud

    NASA Astrophysics Data System (ADS)

    Leahy, D. A.

    2017-03-01

    Published X-ray emission properties for a sample of 50 supernova remnants (SNRs) in the Large Magellanic Cloud (LMC) are used as input for SNR evolution modeling calculations. The forward shock emission is modeled to obtain the initial explosion energy, age, and circumstellar medium density for each SNR in the sample. The resulting age distribution yields a SNR birthrate of 1/(500 yr) for the LMC. The explosion energy distribution is well fit by a log-normal distribution, with a most-probable explosion energy of 0.5× {10}51 erg, with a 1σ dispersion by a factor of 3 in energy. The circumstellar medium density distribution is broader than the explosion energy distribution, with a most-probable density of ˜0.1 cm-3. The shape of the density distribution can be fit with a log-normal distribution, with incompleteness at high density caused by the shorter evolution times of SNRs.

  8. Shape of growth-rate distribution determines the type of Non-Gibrat’s Property

    NASA Astrophysics Data System (ADS)

    Ishikawa, Atushi; Fujimoto, Shouji; Mizuno, Takayuki

    2011-11-01

    In this study, the authors examine exhaustive business data on Japanese firms, which cover nearly all companies in the mid- and large-scale ranges in terms of firm size, to reach several key findings on profits/sales distribution and business growth trends. Here, profits denote net profits. First, detailed balance is observed not only in profits data but also in sales data. Furthermore, the growth-rate distribution of sales has wider tails than the linear growth-rate distribution of profits in log-log scale. On the one hand, in the mid-scale range of profits, the probability of positive growth decreases and the probability of negative growth increases symmetrically as the initial value increases. This is called Non-Gibrat’s First Property. On the other hand, in the mid-scale range of sales, the probability of positive growth decreases as the initial value increases, while the probability of negative growth hardly changes. This is called Non-Gibrat’s Second Property. Under detailed balance, Non-Gibrat’s First and Second Properties are analytically derived from the linear and quadratic growth-rate distributions in log-log scale, respectively. In both cases, the log-normal distribution is inferred from Non-Gibrat’s Properties and detailed balance. These analytic results are verified by empirical data. Consequently, this clarifies the notion that the difference in shapes between growth-rate distributions of sales and profits is closely related to the difference between the two Non-Gibrat’s Properties in the mid-scale range.

  9. Ordinal probability effect measures for group comparisons in multinomial cumulative link models.

    PubMed

    Agresti, Alan; Kateri, Maria

    2017-03-01

    We consider simple ordinal model-based probability effect measures for comparing distributions of two groups, adjusted for explanatory variables. An "ordinal superiority" measure summarizes the probability that an observation from one distribution falls above an independent observation from the other distribution, adjusted for explanatory variables in a model. The measure applies directly to normal linear models and to a normal latent variable model for ordinal response variables. It equals Φ(β/2) for the corresponding ordinal model that applies a probit link function to cumulative multinomial probabilities, for standard normal cdf Φ and effect β that is the coefficient of the group indicator variable. For the more general latent variable model for ordinal responses that corresponds to a linear model with other possible error distributions and corresponding link functions for cumulative multinomial probabilities, the ordinal superiority measure equals exp(β)/[1+exp(β)] with the log-log link and equals approximately exp(β/2)/[1+exp(β/2)] with the logit link, where β is the group effect. Another ordinal superiority measure generalizes the difference of proportions from binary to ordinal responses. We also present related measures directly for ordinal models for the observed response that need not assume corresponding latent response models. We present confidence intervals for the measures and illustrate with an example. © 2016, The International Biometric Society.

  10. Probability distribution functions for unit hydrographs with optimization using genetic algorithm

    NASA Astrophysics Data System (ADS)

    Ghorbani, Mohammad Ali; Singh, Vijay P.; Sivakumar, Bellie; H. Kashani, Mahsa; Atre, Atul Arvind; Asadi, Hakimeh

    2017-05-01

    A unit hydrograph (UH) of a watershed may be viewed as the unit pulse response function of a linear system. In recent years, the use of probability distribution functions (pdfs) for determining a UH has received much attention. In this study, a nonlinear optimization model is developed to transmute a UH into a pdf. The potential of six popular pdfs, namely two-parameter gamma, two-parameter Gumbel, two-parameter log-normal, two-parameter normal, three-parameter Pearson distribution, and two-parameter Weibull is tested on data from the Lighvan catchment in Iran. The probability distribution parameters are determined using the nonlinear least squares optimization method in two ways: (1) optimization by programming in Mathematica; and (2) optimization by applying genetic algorithm. The results are compared with those obtained by the traditional linear least squares method. The results show comparable capability and performance of two nonlinear methods. The gamma and Pearson distributions are the most successful models in preserving the rising and recession limbs of the unit hydographs. The log-normal distribution has a high ability in predicting both the peak flow and time to peak of the unit hydrograph. The nonlinear optimization method does not outperform the linear least squares method in determining the UH (especially for excess rainfall of one pulse), but is comparable.

  11. Transient Properties of Probability Distribution for a Markov Process with Size-dependent Additive Noise

    NASA Astrophysics Data System (ADS)

    Yamada, Yuhei; Yamazaki, Yoshihiro

    2018-04-01

    This study considered a stochastic model for cluster growth in a Markov process with a cluster size dependent additive noise. According to this model, the probability distribution of the cluster size transiently becomes an exponential or a log-normal distribution depending on the initial condition of the growth. In this letter, a master equation is obtained for this model, and derivation of the distributions is discussed.

  12. Stick-slip behavior in a continuum-granular experiment.

    PubMed

    Geller, Drew A; Ecke, Robert E; Dahmen, Karin A; Backhaus, Scott

    2015-12-01

    We report moment distribution results from a laboratory experiment, similar in character to an isolated strike-slip earthquake fault, consisting of sheared elastic plates separated by a narrow gap filled with a two-dimensional granular medium. Local measurement of strain displacements of the plates at 203 spatial points located adjacent to the gap allows direct determination of the event moments and their spatial and temporal distributions. We show that events consist of spatially coherent, larger motions and spatially extended (noncoherent), smaller events. The noncoherent events have a probability distribution of event moment consistent with an M(-3/2) power law scaling with Poisson-distributed recurrence times. Coherent events have a log-normal moment distribution and mean temporal recurrence. As the applied normal pressure increases, there are more coherent events and their log-normal distribution broadens and shifts to larger average moment.

  13. Stochastic Growth Theory of Spatially-Averaged Distributions of Langmuir Fields in Earth's Foreshock

    NASA Technical Reports Server (NTRS)

    Boshuizen, Christopher R.; Cairns, Iver H.; Robinson, P. A.

    2001-01-01

    Langmuir-like waves in the foreshock of Earth are characteristically bursty and irregular, and are the subject of a number of recent studies. Averaged over the foreshock, it is observed that the probability distribution is power-law P(bar)(log E) in the wave field E with the bar denoting this averaging over position, In this paper it is shown that stochastic growth theory (SGT) can explain a power-law spatially-averaged distributions P(bar)(log E), when the observed power-law variations of the mean and standard deviation of log E with position are combined with the log normal statistics predicted by SGT at each location.

  14. [Distribution of individuals by spontaneous frequencies of lymphocytes with micronuclei. Particularity and consequences].

    PubMed

    Serebrianyĭ, A M; Akleev, A V; Aleshchenko, A V; Antoshchina, M M; Kudriashova, O V; Riabchenko, N I; Semenova, L P; Pelevina, I I

    2011-01-01

    By micronucleus (MN) assay with cytokinetic cytochalasin B block, the mean frequency of blood lymphocytes with MN has been determined in 76 Moscow inhabitants, 35 people from Obninsk and 122 from Chelyabinsk region. In contrast to the distribution of individuals on spontaneous frequency of cells with aberrations, which was shown to be binomial (Kusnetzov et al., 1980), the distribution of individuals on the spontaneous frequency of cells with MN in all three massif can be acknowledged as log-normal (chi2 test). Distribution of individuals in the joined massifs (Moscow and Obninsk inhabitants) and in the unique massif of all inspected with great reliability must be acknowledged as log-normal (0.70 and 0.86 correspondingly), but it cannot be regarded as Poisson, binomial or normal. Taking into account that log-normal distribution of children by spontaneous frequency of lymphocytes with MN has been observed by the inspection of 473 children from different kindergartens in Moscow we can make the conclusion that log-normal is regularity inherent in this type of damage of lymphocytes genome. On the contrary the distribution of individuals on induced by irradiation in vitro lymphocytes with MN frequency in most cases must be acknowledged as normal. This distribution character points out that damage appearance in the individual (genomic instability) in a single lymphocytes increases the probability of the damage appearance in another lymphocytes. We can propose that damaged stem cells lymphocyte progenitor's exchange by information with undamaged cells--the type of the bystander effect process. It can also be supposed that transmission of damage to daughter cells occurs in the time of stem cells division.

  15. Computer routines for probability distributions, random numbers, and related functions

    USGS Publications Warehouse

    Kirby, W.

    1983-01-01

    Use of previously coded and tested subroutines simplifies and speeds up program development and testing. This report presents routines that can be used to calculate various probability distributions and other functions of importance in statistical hydrology. The routines are designed as general-purpose Fortran subroutines and functions to be called from user-written main progress. The probability distributions provided include the beta, chi-square, gamma, Gaussian (normal), Pearson Type III (tables and approximation), and Weibull. Also provided are the distributions of the Grubbs-Beck outlier test, Kolmogorov 's and Smirnov 's D, Student 's t, noncentral t (approximate), and Snedecor F. Other mathematical functions include the Bessel function, I sub o, gamma and log-gamma functions, error functions, and exponential integral. Auxiliary services include sorting and printer-plotting. Random number generators for uniform and normal numbers are provided and may be used with some of the above routines to generate numbers from other distributions. (USGS)

  16. Computer routines for probability distributions, random numbers, and related functions

    USGS Publications Warehouse

    Kirby, W.H.

    1980-01-01

    Use of previously codes and tested subroutines simplifies and speeds up program development and testing. This report presents routines that can be used to calculate various probability distributions and other functions of importance in statistical hydrology. The routines are designed as general-purpose Fortran subroutines and functions to be called from user-written main programs. The probability distributions provided include the beta, chisquare, gamma, Gaussian (normal), Pearson Type III (tables and approximation), and Weibull. Also provided are the distributions of the Grubbs-Beck outlier test, Kolmogorov 's and Smirnov 's D, Student 's t, noncentral t (approximate), and Snedecor F tests. Other mathematical functions include the Bessel function I (subzero), gamma and log-gamma functions, error functions and exponential integral. Auxiliary services include sorting and printer plotting. Random number generators for uniform and normal numbers are provided and may be used with some of the above routines to generate numbers from other distributions. (USGS)

  17. The stochastic distribution of available coefficient of friction for human locomotion of five different floor surfaces.

    PubMed

    Chang, Wen-Ruey; Matz, Simon; Chang, Chien-Chi

    2014-05-01

    The maximum coefficient of friction that can be supported at the shoe and floor interface without a slip is usually called the available coefficient of friction (ACOF) for human locomotion. The probability of a slip could be estimated using a statistical model by comparing the ACOF with the required coefficient of friction (RCOF), assuming that both coefficients have stochastic distributions. An investigation of the stochastic distributions of the ACOF of five different floor surfaces under dry, water and glycerol conditions is presented in this paper. One hundred friction measurements were performed on each floor surface under each surface condition. The Kolmogorov-Smirnov goodness-of-fit test was used to determine if the distribution of the ACOF was a good fit with the normal, log-normal and Weibull distributions. The results indicated that the ACOF distributions had a slightly better match with the normal and log-normal distributions than with the Weibull in only three out of 15 cases with a statistical significance. The results are far more complex than what had heretofore been published and different scenarios could emerge. Since the ACOF is compared with the RCOF for the estimate of slip probability, the distribution of the ACOF in seven cases could be considered a constant for this purpose when the ACOF is much lower or higher than the RCOF. A few cases could be represented by a normal distribution for practical reasons based on their skewness and kurtosis values without a statistical significance. No representation could be found in three cases out of 15. Copyright © 2013 Elsevier Ltd and The Ergonomics Society. All rights reserved.

  18. Probabilistic properties of wavelets in kinetic surface roughening

    NASA Astrophysics Data System (ADS)

    Bershadskii, A.

    2001-08-01

    Using the data of a recent numerical simulation [M. Ahr and M. Biehl, Phys. Rev. E 62, 1773 (2000)] of homoepitaxial growth it is shown that the observed probability distribution of a wavelet based measure of the growing surface roughness is consistent with a stretched log-normal distribution and the corresponding branching dimension depends on the level of particle desorption.

  19. Log-Normality and Multifractal Analysis of Flame Surface Statistics

    NASA Astrophysics Data System (ADS)

    Saha, Abhishek; Chaudhuri, Swetaprovo; Law, Chung K.

    2013-11-01

    The turbulent flame surface is typically highly wrinkled and folded at a multitude of scales controlled by various flame properties. It is useful if the information contained in this complex geometry can be projected onto a simpler regular geometry for the use of spectral, wavelet or multifractal analyses. Here we investigate local flame surface statistics of turbulent flame expanding under constant pressure. First the statistics of local length ratio is experimentally obtained from high-speed Mie scattering images. For spherically expanding flame, length ratio on the measurement plane, at predefined equiangular sectors is defined as the ratio of the actual flame length to the length of a circular-arc of radius equal to the average radius of the flame. Assuming isotropic distribution of such flame segments we convolute suitable forms of the length-ratio probability distribution functions (pdfs) to arrive at corresponding area-ratio pdfs. Both the pdfs are found to be near log-normally distributed and shows self-similar behavior with increasing radius. Near log-normality and rather intermittent behavior of the flame-length ratio suggests similarity with dissipation rate quantities which stimulates multifractal analysis. Currently at Indian Institute of Science, India.

  20. Polynomial probability distribution estimation using the method of moments

    PubMed Central

    Mattsson, Lars; Rydén, Jesper

    2017-01-01

    We suggest a procedure for estimating Nth degree polynomial approximations to unknown (or known) probability density functions (PDFs) based on N statistical moments from each distribution. The procedure is based on the method of moments and is setup algorithmically to aid applicability and to ensure rigor in use. In order to show applicability, polynomial PDF approximations are obtained for the distribution families Normal, Log-Normal, Weibull as well as for a bimodal Weibull distribution and a data set of anonymized household electricity use. The results are compared with results for traditional PDF series expansion methods of Gram–Charlier type. It is concluded that this procedure is a comparatively simple procedure that could be used when traditional distribution families are not applicable or when polynomial expansions of probability distributions might be considered useful approximations. In particular this approach is practical for calculating convolutions of distributions, since such operations become integrals of polynomial expressions. Finally, in order to show an advanced applicability of the method, it is shown to be useful for approximating solutions to the Smoluchowski equation. PMID:28394949

  1. Polynomial probability distribution estimation using the method of moments.

    PubMed

    Munkhammar, Joakim; Mattsson, Lars; Rydén, Jesper

    2017-01-01

    We suggest a procedure for estimating Nth degree polynomial approximations to unknown (or known) probability density functions (PDFs) based on N statistical moments from each distribution. The procedure is based on the method of moments and is setup algorithmically to aid applicability and to ensure rigor in use. In order to show applicability, polynomial PDF approximations are obtained for the distribution families Normal, Log-Normal, Weibull as well as for a bimodal Weibull distribution and a data set of anonymized household electricity use. The results are compared with results for traditional PDF series expansion methods of Gram-Charlier type. It is concluded that this procedure is a comparatively simple procedure that could be used when traditional distribution families are not applicable or when polynomial expansions of probability distributions might be considered useful approximations. In particular this approach is practical for calculating convolutions of distributions, since such operations become integrals of polynomial expressions. Finally, in order to show an advanced applicability of the method, it is shown to be useful for approximating solutions to the Smoluchowski equation.

  2. M-dwarf exoplanet surface density distribution. A log-normal fit from 0.07 to 400 AU

    NASA Astrophysics Data System (ADS)

    Meyer, Michael R.; Amara, Adam; Reggiani, Maddalena; Quanz, Sascha P.

    2018-04-01

    Aims: We fit a log-normal function to the M-dwarf orbital surface density distribution of gas giant planets, over the mass range 1-10 times that of Jupiter, from 0.07 to 400 AU. Methods: We used a Markov chain Monte Carlo approach to explore the likelihoods of various parameter values consistent with point estimates of the data given our assumed functional form. Results: This fit is consistent with radial velocity, microlensing, and direct-imaging observations, is well-motivated from theoretical and phenomenological points of view, and predicts results of future surveys. We present probability distributions for each parameter and a maximum likelihood estimate solution. Conclusions: We suggest that this function makes more physical sense than other widely used functions, and we explore the implications of our results on the design of future exoplanet surveys.

  3. Flame surface statistics of constant-pressure turbulent expanding premixed flames

    NASA Astrophysics Data System (ADS)

    Saha, Abhishek; Chaudhuri, Swetaprovo; Law, Chung K.

    2014-04-01

    In this paper we investigate the local flame surface statistics of constant-pressure turbulent expanding flames. First the statistics of local length ratio is experimentally determined from high-speed planar Mie scattering images of spherically expanding flames, with the length ratio on the measurement plane, at predefined equiangular sectors, defined as the ratio of the actual flame length to the length of a circular-arc of radius equal to the average radius of the flame. Assuming isotropic distribution of such flame segments we then convolute suitable forms of the length-ratio probability distribution functions (pdfs) to arrive at the corresponding area-ratio pdfs. It is found that both the length ratio and area ratio pdfs are near log-normally distributed and shows self-similar behavior with increasing radius. Near log-normality and rather intermittent behavior of the flame-length ratio suggests similarity with dissipation rate quantities which stimulates multifractal analysis.

  4. Observed, unknown distributions of clinical chemical quantities should be considered to be log-normal: a proposal.

    PubMed

    Haeckel, Rainer; Wosniok, Werner

    2010-10-01

    The distribution of many quantities in laboratory medicine are considered to be Gaussian if they are symmetric, although, theoretically, a Gaussian distribution is not plausible for quantities that can attain only non-negative values. If a distribution is skewed, further specification of the type is required, which may be difficult to provide. Skewed (non-Gaussian) distributions found in clinical chemistry usually show only moderately large positive skewness (e.g., log-normal- and χ(2) distribution). The degree of skewness depends on the magnitude of the empirical biological variation (CV(e)), as demonstrated using the log-normal distribution. A Gaussian distribution with a small CV(e) (e.g., for plasma sodium) is very similar to a log-normal distribution with the same CV(e). In contrast, a relatively large CV(e) (e.g., plasma aspartate aminotransferase) leads to distinct differences between a Gaussian and a log-normal distribution. If the type of an empirical distribution is unknown, it is proposed that a log-normal distribution be assumed in such cases. This avoids distributional assumptions that are not plausible and does not contradict the observation that distributions with small biological variation look very similar to a Gaussian distribution.

  5. On the generation of log-Lévy distributions and extreme randomness

    NASA Astrophysics Data System (ADS)

    Eliazar, Iddo; Klafter, Joseph

    2011-10-01

    The log-normal distribution is prevalent across the sciences, as it emerges from the combination of multiplicative processes and the central limit theorem (CLT). The CLT, beyond yielding the normal distribution, also yields the class of Lévy distributions. The log-Lévy distributions are the Lévy counterparts of the log-normal distribution, they appear in the context of ultraslow diffusion processes, and they are categorized by Mandelbrot as belonging to the class of extreme randomness. In this paper, we present a natural stochastic growth model from which both the log-normal distribution and the log-Lévy distributions emerge universally—the former in the case of deterministic underlying setting, and the latter in the case of stochastic underlying setting. In particular, we establish a stochastic growth model which universally generates Mandelbrot’s extreme randomness.

  6. Simulation of flight maneuver-load distributions by utilizing stationary, non-Gaussian random load histories

    NASA Technical Reports Server (NTRS)

    Leybold, H. A.

    1971-01-01

    Random numbers were generated with the aid of a digital computer and transformed such that the probability density function of a discrete random load history composed of these random numbers had one of the following non-Gaussian distributions: Poisson, binomial, log-normal, Weibull, and exponential. The resulting random load histories were analyzed to determine their peak statistics and were compared with cumulative peak maneuver-load distributions for fighter and transport aircraft in flight.

  7. Gradually truncated log-normal in USA publicly traded firm size distribution

    NASA Astrophysics Data System (ADS)

    Gupta, Hari M.; Campanha, José R.; de Aguiar, Daniela R.; Queiroz, Gabriel A.; Raheja, Charu G.

    2007-03-01

    We study the statistical distribution of firm size for USA and Brazilian publicly traded firms through the Zipf plot technique. Sale size is used to measure firm size. The Brazilian firm size distribution is given by a log-normal distribution without any adjustable parameter. However, we also need to consider different parameters of log-normal distribution for the largest firms in the distribution, which are mostly foreign firms. The log-normal distribution has to be gradually truncated after a certain critical value for USA firms. Therefore, the original hypothesis of proportional effect proposed by Gibrat is valid with some modification for very large firms. We also consider the possible mechanisms behind this distribution.

  8. Universal noise and Efimov physics

    NASA Astrophysics Data System (ADS)

    Nicholson, Amy N.

    2016-03-01

    Probability distributions for correlation functions of particles interacting via random-valued fields are discussed as a novel tool for determining the spectrum of a theory. In particular, this method is used to determine the energies of universal N-body clusters tied to Efimov trimers, for even N, by investigating the distribution of a correlation function of two particles at unitarity. Using numerical evidence that this distribution is log-normal, an analytical prediction for the N-dependence of the N-body binding energies is made.

  9. Log-normal distribution from a process that is not multiplicative but is additive.

    PubMed

    Mouri, Hideaki

    2013-10-01

    The central limit theorem ensures that a sum of random variables tends to a Gaussian distribution as their total number tends to infinity. However, for a class of positive random variables, we find that the sum tends faster to a log-normal distribution. Although the sum tends eventually to a Gaussian distribution, the distribution of the sum is always close to a log-normal distribution rather than to any Gaussian distribution if the summands are numerous enough. This is in contrast to the current consensus that any log-normal distribution is due to a product of random variables, i.e., a multiplicative process, or equivalently to nonlinearity of the system. In fact, the log-normal distribution is also observable for a sum, i.e., an additive process that is typical of linear systems. We show conditions for such a sum, an analytical example, and an application to random scalar fields such as those of turbulence.

  10. Estimation of Microbial Contamination of Food from Prevalence and Concentration Data: Application to Listeria monocytogenes in Fresh Vegetables▿

    PubMed Central

    Crépet, Amélie; Albert, Isabelle; Dervin, Catherine; Carlin, Frédéric

    2007-01-01

    A normal distribution and a mixture model of two normal distributions in a Bayesian approach using prevalence and concentration data were used to establish the distribution of contamination of the food-borne pathogenic bacteria Listeria monocytogenes in unprocessed and minimally processed fresh vegetables. A total of 165 prevalence studies, including 15 studies with concentration data, were taken from the scientific literature and from technical reports and used for statistical analysis. The predicted mean of the normal distribution of the logarithms of viable L. monocytogenes per gram of fresh vegetables was −2.63 log viable L. monocytogenes organisms/g, and its standard deviation was 1.48 log viable L. monocytogenes organisms/g. These values were determined by considering one contaminated sample in prevalence studies in which samples are in fact negative. This deliberate overestimation is necessary to complete calculations. With the mixture model, the predicted mean of the distribution of the logarithm of viable L. monocytogenes per gram of fresh vegetables was −3.38 log viable L. monocytogenes organisms/g and its standard deviation was 1.46 log viable L. monocytogenes organisms/g. The probabilities of fresh unprocessed and minimally processed vegetables being contaminated with concentrations higher than 1, 2, and 3 log viable L. monocytogenes organisms/g were 1.44, 0.63, and 0.17%, respectively. Introducing a sensitivity rate of 80 or 95% in the mixture model had a small effect on the estimation of the contamination. In contrast, introducing a low sensitivity rate (40%) resulted in marked differences, especially for high percentiles. There was a significantly lower estimation of contamination in the papers and reports of 2000 to 2005 than in those of 1988 to 1999 and a lower estimation of contamination of leafy salads than that of sprouts and other vegetables. The interest of the mixture model for the estimation of microbial contamination is discussed. PMID:17098926

  11. Superstatistical generalised Langevin equation: non-Gaussian viscoelastic anomalous diffusion

    NASA Astrophysics Data System (ADS)

    Ślęzak, Jakub; Metzler, Ralf; Magdziarz, Marcin

    2018-02-01

    Recent advances in single particle tracking and supercomputing techniques demonstrate the emergence of normal or anomalous, viscoelastic diffusion in conjunction with non-Gaussian distributions in soft, biological, and active matter systems. We here formulate a stochastic model based on a generalised Langevin equation in which non-Gaussian shapes of the probability density function and normal or anomalous diffusion have a common origin, namely a random parametrisation of the stochastic force. We perform a detailed analysis demonstrating how various types of parameter distributions for the memory kernel result in exponential, power law, or power-log law tails of the memory functions. The studied system is also shown to exhibit a further unusual property: the velocity has a Gaussian one point probability density but non-Gaussian joint distributions. This behaviour is reflected in the relaxation from a Gaussian to a non-Gaussian distribution observed for the position variable. We show that our theoretical results are in excellent agreement with stochastic simulations.

  12. Log Normal Distribution of Cellular Uptake of Radioactivity: Statistical Analysis of Alpha Particle Track Autoradiography

    PubMed Central

    Neti, Prasad V.S.V.; Howell, Roger W.

    2008-01-01

    Recently, the distribution of radioactivity among a population of cells labeled with 210Po was shown to be well described by a log normal distribution function (J Nucl Med 47, 6 (2006) 1049-1058) with the aid of an autoradiographic approach. To ascertain the influence of Poisson statistics on the interpretation of the autoradiographic data, the present work reports on a detailed statistical analyses of these data. Methods The measured distributions of alpha particle tracks per cell were subjected to statistical tests with Poisson (P), log normal (LN), and Poisson – log normal (P – LN) models. Results The LN distribution function best describes the distribution of radioactivity among cell populations exposed to 0.52 and 3.8 kBq/mL 210Po-citrate. When cells were exposed to 67 kBq/mL, the P – LN distribution function gave a better fit, however, the underlying activity distribution remained log normal. Conclusions The present analysis generally provides further support for the use of LN distributions to describe the cellular uptake of radioactivity. Care should be exercised when analyzing autoradiographic data on activity distributions to ensure that Poisson processes do not distort the underlying LN distribution. PMID:16741316

  13. The probability distribution model of air pollution index and its dominants in Kuala Lumpur

    NASA Astrophysics Data System (ADS)

    AL-Dhurafi, Nasr Ahmed; Razali, Ahmad Mahir; Masseran, Nurulkamal; Zamzuri, Zamira Hasanah

    2016-11-01

    This paper focuses on the statistical modeling for the distributions of air pollution index (API) and its sub-indexes data observed at Kuala Lumpur in Malaysia. Five pollutants or sub-indexes are measured including, carbon monoxide (CO); sulphur dioxide (SO2); nitrogen dioxide (NO2), and; particulate matter (PM10). Four probability distributions are considered, namely log-normal, exponential, Gamma and Weibull in search for the best fit distribution to the Malaysian air pollutants data. In order to determine the best distribution for describing the air pollutants data, five goodness-of-fit criteria's are applied. This will help in minimizing the uncertainty in pollution resource estimates and improving the assessment phase of planning. The conflict in criterion results for selecting the best distribution was overcome by using the weight of ranks method. We found that the Gamma distribution is the best distribution for the majority of air pollutants data in Kuala Lumpur.

  14. Generalised Extreme Value Distributions Provide a Natural Hypothesis for the Shape of Seed Mass Distributions

    PubMed Central

    2015-01-01

    Among co-occurring species, values for functionally important plant traits span orders of magnitude, are uni-modal, and generally positively skewed. Such data are usually log-transformed “for normality” but no convincing mechanistic explanation for a log-normal expectation exists. Here we propose a hypothesis for the distribution of seed masses based on generalised extreme value distributions (GEVs), a class of probability distributions used in climatology to characterise the impact of event magnitudes and frequencies; events that impose strong directional selection on biological traits. In tests involving datasets from 34 locations across the globe, GEVs described log10 seed mass distributions as well or better than conventional normalising statistics in 79% of cases, and revealed a systematic tendency for an overabundance of small seed sizes associated with low latitudes. GEVs characterise disturbance events experienced in a location to which individual species’ life histories could respond, providing a natural, biological explanation for trait expression that is lacking from all previous hypotheses attempting to describe trait distributions in multispecies assemblages. We suggest that GEVs could provide a mechanistic explanation for plant trait distributions and potentially link biology and climatology under a single paradigm. PMID:25830773

  15. Proton Straggling in Thick Silicon Detectors

    NASA Technical Reports Server (NTRS)

    Selesnick, R. S.; Baker, D. N.; Kanekal, S. G.

    2017-01-01

    Straggling functions for protons in thick silicon radiation detectors are computed by Monte Carlo simulation. Mean energy loss is constrained by the silicon stopping power, providing higher straggling at low energy and probabilities for stopping within the detector volume. By matching the first four moments of simulated energy-loss distributions, straggling functions are approximated by a log-normal distribution that is accurate for Vavilov k is greater than or equal to 0:3. They are verified by comparison to experimental proton data from a charged particle telescope.

  16. A Bayesian Surrogate for Regional Skew in Flood Frequency Analysis

    NASA Astrophysics Data System (ADS)

    Kuczera, George

    1983-06-01

    The problem of how to best utilize site and regional flood data to infer the shape parameter of a flood distribution is considered. One approach to this problem is given in Bulletin 17B of the U.S. Water Resources Council (1981) for the log-Pearson distribution. Here a lesser known distribution is considered, namely, the power normal which fits flood data as well as the log-Pearson and has a shape parameter denoted by λ derived from a Box-Cox power transformation. The problem of regionalizing λ is considered from an empirical Bayes perspective where site and regional flood data are used to infer λ. The distortive effects of spatial correlation and heterogeneity of site sampling variance of λ are explicitly studied with spatial correlation being found to be of secondary importance. The end product of this analysis is the posterior distribution of the power normal parameters expressing, in probabilistic terms, what is known about the parameters given site flood data and regional information on λ. This distribution can be used to provide the designer with several types of information. The posterior distribution of the T-year flood is derived. The effect of nonlinearity in λ on inference is illustrated. Because uncertainty in λ is explicitly allowed for, the understatement in confidence limits due to fixing λ (analogous to fixing log skew) is avoided. Finally, it is shown how to obtain the marginal flood distribution which can be used to select a design flood with specified exceedance probability.

  17. Stylized facts in internal rates of return on stock index and its derivative transactions

    NASA Astrophysics Data System (ADS)

    Pichl, Lukáš; Kaizoji, Taisei; Yamano, Takuya

    2007-08-01

    Universal features in stock markets and their derivative markets are studied by means of probability distributions in internal rates of return on buy and sell transaction pairs. Unlike the stylized facts in normalized log returns, the probability distributions for such single asset encounters incorporate the time factor by means of the internal rate of return, defined as the continuous compound interest. Resulting stylized facts are shown in the probability distributions derived from the daily series of TOPIX, S & P 500 and FTSE 100 index close values. The application of the above analysis to minute-tick data of NIKKEI 225 and its futures market, respectively, reveals an interesting difference in the behavior of the two probability distributions, in case a threshold on the minimal duration of the long position is imposed. It is therefore suggested that the probability distributions of the internal rates of return could be used for causality mining between the underlying and derivative stock markets. The highly specific discrete spectrum, which results from noise trader strategies as opposed to the smooth distributions observed for fundamentalist strategies in single encounter transactions may be useful in deducing the type of investment strategy from trading revenues of small portfolio investors.

  18. Detection of anomalous events

    DOEpatents

    Ferragut, Erik M.; Laska, Jason A.; Bridges, Robert A.

    2016-06-07

    A system is described for receiving a stream of events and scoring the events based on anomalousness and maliciousness (or other classification). The system can include a plurality of anomaly detectors that together implement an algorithm to identify low-probability events and detect atypical traffic patterns. The anomaly detector provides for comparability of disparate sources of data (e.g., network flow data and firewall logs.) Additionally, the anomaly detector allows for regulatability, meaning that the algorithm can be user configurable to adjust a number of false alerts. The anomaly detector can be used for a variety of probability density functions, including normal Gaussian distributions, irregular distributions, as well as functions associated with continuous or discrete variables.

  19. Diameter distribution in a Brazilian tropical dry forest domain: predictions for the stand and species.

    PubMed

    Lima, Robson B DE; Bufalino, Lina; Alves, Francisco T; Silva, José A A DA; Ferreira, Rinaldo L C

    2017-01-01

    Currently, there is a lack of studies on the correct utilization of continuous distributions for dry tropical forests. Therefore, this work aims to investigate the diameter structure of a brazilian tropical dry forest and to select suitable continuous distributions by means of statistic tools for the stand and the main species. Two subsets were randomly selected from 40 plots. Diameter at base height was obtained. The following functions were tested: log-normal; gamma; Weibull 2P and Burr. The best fits were selected by Akaike's information validation criterion. Overall, the diameter distribution of the dry tropical forest was better described by negative exponential curves and positive skewness. The forest studied showed diameter distributions with decreasing probability for larger trees. This behavior was observed for both the main species and the stand. The generalization of the function fitted for the main species show that the development of individual models is needed. The Burr function showed good flexibility to describe the diameter structure of the stand and the behavior of Mimosa ophthalmocentra and Bauhinia cheilantha species. For Poincianella bracteosa, Aspidosperma pyrifolium and Myracrodum urundeuva better fitting was obtained with the log-normal function.

  20. Apparent Transition in the Human Height Distribution Caused by Age-Dependent Variation during Puberty Period

    NASA Astrophysics Data System (ADS)

    Iwata, Takaki; Yamazaki, Yoshihiro; Kuninaka, Hiroto

    2013-08-01

    In this study, we examine the validity of the transition of the human height distribution from the log-normal distribution to the normal distribution during puberty, as suggested in an earlier study [Kuninaka et al.: J. Phys. Soc. Jpn. 78 (2009) 125001]. Our data analysis reveals that, in late puberty, the variation in height decreases as children grow. Thus, the classification of a height dataset by age at this stage leads us to analyze a mixture of distributions with larger means and smaller variations. This mixture distribution has a negative skewness and is consequently closer to the normal distribution than to the log-normal distribution. The opposite case occurs in early puberty and the mixture distribution is positively skewed, which resembles the log-normal distribution rather than the normal distribution. Thus, this scenario mimics the transition during puberty. Additionally, our scenario is realized through a numerical simulation based on a statistical model. The present study does not support the transition suggested by the earlier study.

  1. Checking distributional assumptions for pharmacokinetic summary statistics based on simulations with compartmental models.

    PubMed

    Shen, Meiyu; Russek-Cohen, Estelle; Slud, Eric V

    2016-08-12

    Bioequivalence (BE) studies are an essential part of the evaluation of generic drugs. The most common in vivo BE study design is the two-period two-treatment crossover design. AUC (area under the concentration-time curve) and Cmax (maximum concentration) are obtained from the observed concentration-time profiles for each subject from each treatment under each sequence. In the BE evaluation of pharmacokinetic crossover studies, the normality of the univariate response variable, e.g. log(AUC) 1 or log(Cmax), is often assumed in the literature without much evidence. Therefore, we investigate the distributional assumption of the normality of response variables, log(AUC) and log(Cmax), by simulating concentration-time profiles from two-stage pharmacokinetic models (commonly used in pharmacokinetic research) for a wide range of pharmacokinetic parameters and measurement error structures. Our simulations show that, under reasonable distributional assumptions on the pharmacokinetic parameters, log(AUC) has heavy tails and log(Cmax) is skewed. Sensitivity analyses are conducted to investigate how the distribution of the standardized log(AUC) (or the standardized log(Cmax)) for a large number of simulated subjects deviates from normality if distributions of errors in the pharmacokinetic model for plasma concentrations deviate from normality and if the plasma concentration can be described by different compartmental models.

  2. Distribution Functions of Sizes and Fluxes Determined from Supra-Arcade Downflows

    NASA Technical Reports Server (NTRS)

    McKenzie, D.; Savage, S.

    2011-01-01

    The frequency distributions of sizes and fluxes of supra-arcade downflows (SADs) provide information about the process of their creation. For example, a fractal creation process may be expected to yield a power-law distribution of sizes and/or fluxes. We examine 120 cross-sectional areas and magnetic flux estimates found by Savage & McKenzie for SADs, and find that (1) the areas are consistent with a log-normal distribution and (2) the fluxes are consistent with both a log-normal and an exponential distribution. Neither set of measurements is compatible with a power-law distribution nor a normal distribution. As a demonstration of the applicability of these findings to improved understanding of reconnection, we consider a simple SAD growth scenario with minimal assumptions, capable of producing a log-normal distribution.

  3. Alternate methods for FAAT S-curve generation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kaufman, A.M.

    The FAAT (Foreign Asset Assessment Team) assessment methodology attempts to derive a probability of effect as a function of incident field strength. The probability of effect is the likelihood that the stress put on a system exceeds its strength. In the FAAT methodology, both the stress and strength are random variables whose statistical properties are estimated by experts. Each random variable has two components of uncertainty: systematic and random. The systematic uncertainty drives the confidence bounds in the FAAT assessment. Its variance can be reduced by improved information. The variance of the random uncertainty is not reducible. The FAAT methodologymore » uses an assessment code called ARES to generate probability of effect curves (S-curves) at various confidence levels. ARES assumes log normal distributions for all random variables. The S-curves themselves are log normal cumulants associated with the random portion of the uncertainty. The placement of the S-curves depends on confidence bounds. The systematic uncertainty in both stress and strength is usually described by a mode and an upper and lower variance. Such a description is not consistent with the log normal assumption of ARES and an unsatisfactory work around solution is used to obtain the required placement of the S-curves at each confidence level. We have looked into this situation and have found that significant errors are introduced by this work around. These errors are at least several dB-W/cm{sup 2} at all confidence levels, but they are especially bad in the estimate of the median. In this paper, we suggest two alternate solutions for the placement of S-curves. To compare these calculational methods, we have tabulated the common combinations of upper and lower variances and generated the relevant S-curves offsets from the mode difference of stress and strength.« less

  4. WE-H-207A-03: The Universality of the Lognormal Behavior of [F-18]FLT PET SUV Measurements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Scarpelli, M; Eickhoff, J; Perlman, S

    Purpose: Log transforming [F-18]FDG PET standardized uptake values (SUVs) has been shown to lead to normal SUV distributions, which allows utilization of powerful parametric statistical models. This study identified the optimal transformation leading to normally distributed [F-18]FLT PET SUVs from solid tumors and offers an example of how normal distributions permits analysis of non-independent/correlated measurements. Methods: Forty patients with various metastatic diseases underwent up to six FLT PET/CT scans during treatment. Tumors were identified by nuclear medicine physician and manually segmented. Average uptake was extracted for each patient giving a global SUVmean (gSUVmean) for each scan. The Shapiro-Wilk test wasmore » used to test distribution normality. One parameter Box-Cox transformations were applied to each of the six gSUVmean distributions and the optimal transformation was found by selecting the parameter that maximized the Shapiro-Wilk test statistic. The relationship between gSUVmean and a serum biomarker (VEGF) collected at imaging timepoints was determined using a linear mixed effects model (LMEM), which accounted for correlated/non-independent measurements from the same individual. Results: Untransformed gSUVmean distributions were found to be significantly non-normal (p<0.05). The optimal transformation parameter had a value of 0.3 (95%CI: −0.4 to 1.6). Given the optimal parameter was close to zero (which corresponds to log transformation), the data were subsequently log transformed. All log transformed gSUVmean distributions were normally distributed (p>0.10 for all timepoints). Log transformed data were incorporated into the LMEM. VEGF serum levels significantly correlated with gSUVmean (p<0.001), revealing log-linear relationship between SUVs and underlying biology. Conclusion: Failure to account for correlated/non-independent measurements can lead to invalid conclusions and motivated transformation to normally distributed SUVs. The log transformation was found to be close to optimal and sufficient for obtaining normally distributed FLT PET SUVs. These transformations allow utilization of powerful LMEMs when analyzing quantitative imaging metrics.« less

  5. Empirical analysis on the runners' velocity distribution in city marathons

    NASA Astrophysics Data System (ADS)

    Lin, Zhenquan; Meng, Fan

    2018-01-01

    In recent decades, much researches have been performed on human temporal activity and mobility patterns, while few investigations have been made to examine the features of the velocity distributions of human mobility patterns. In this paper, we investigated empirically the velocity distributions of finishers in New York City marathon, American Chicago marathon, Berlin marathon and London marathon. By statistical analyses on the datasets of the finish time records, we captured some statistical features of human behaviors in marathons: (1) The velocity distributions of all finishers and of partial finishers in the fastest age group both follow log-normal distribution; (2) In the New York City marathon, the velocity distribution of all male runners in eight 5-kilometer internal timing courses undergoes two transitions: from log-normal distribution at the initial stage (several initial courses) to the Gaussian distribution at the middle stage (several middle courses), and to log-normal distribution at the last stage (several last courses); (3) The intensity of the competition, which is described by the root-mean-square value of the rank changes of all runners, goes weaker from initial stage to the middle stage corresponding to the transition of the velocity distribution from log-normal distribution to Gaussian distribution, and when the competition gets stronger in the last course of the middle stage, there will come a transition from Gaussian distribution to log-normal one at last stage. This study may enrich the researches on human mobility patterns and attract attentions on the velocity features of human mobility.

  6. powerbox: Arbitrarily structured, arbitrary-dimension boxes and log-normal mocks

    NASA Astrophysics Data System (ADS)

    Murray, Steven G.

    2018-05-01

    powerbox creates density grids (or boxes) with an arbitrary two-point distribution (i.e. power spectrum). The software works in any number of dimensions, creates Gaussian or Log-Normal fields, and measures power spectra of output fields to ensure consistency. The primary motivation for creating the code was the simple creation of log-normal mock galaxy distributions, but the methodology can be used for other applications.

  7. Log-Normal Distribution of Cosmic Voids in Simulations and Mocks

    NASA Astrophysics Data System (ADS)

    Russell, E.; Pycke, J.-R.

    2017-01-01

    Following up on previous studies, we complete here a full analysis of the void size distributions of the Cosmic Void Catalog based on three different simulation and mock catalogs: dark matter (DM), haloes, and galaxies. Based on this analysis, we attempt to answer two questions: Is a three-parameter log-normal distribution a good candidate to satisfy the void size distributions obtained from different types of environments? Is there a direct relation between the shape parameters of the void size distribution and the environmental effects? In an attempt to answer these questions, we find here that all void size distributions of these data samples satisfy the three-parameter log-normal distribution whether the environment is dominated by DM, haloes, or galaxies. In addition, the shape parameters of the three-parameter log-normal void size distribution seem highly affected by environment, particularly existing substructures. Therefore, we show two quantitative relations given by linear equations between the skewness and the maximum tree depth, and between the variance of the void size distribution and the maximum tree depth, directly from the simulated data. In addition to this, we find that the percentage of voids with nonzero central density in the data sets has a critical importance. If the number of voids with nonzero central density reaches ≥3.84% in a simulation/mock sample, then a second population is observed in the void size distributions. This second population emerges as a second peak in the log-normal void size distribution at larger radius.

  8. Log-Linear Models for Gene Association

    PubMed Central

    Hu, Jianhua; Joshi, Adarsh; Johnson, Valen E.

    2009-01-01

    We describe a class of log-linear models for the detection of interactions in high-dimensional genomic data. This class of models leads to a Bayesian model selection algorithm that can be applied to data that have been reduced to contingency tables using ranks of observations within subjects, and discretization of these ranks within gene/network components. Many normalization issues associated with the analysis of genomic data are thereby avoided. A prior density based on Ewens’ sampling distribution is used to restrict the number of interacting components assigned high posterior probability, and the calculation of posterior model probabilities is expedited by approximations based on the likelihood ratio statistic. Simulation studies are used to evaluate the efficiency of the resulting algorithm for known interaction structures. Finally, the algorithm is validated in a microarray study for which it was possible to obtain biological confirmation of detected interactions. PMID:19655032

  9. Jimsphere wind and turbulence exceedance statistic

    NASA Technical Reports Server (NTRS)

    Adelfang, S. I.; Court, A.

    1972-01-01

    Exceedance statistics of winds and gusts observed over Cape Kennedy with Jimsphere balloon sensors are described. Gust profiles containing positive and negative departures, from smoothed profiles, in the wavelength ranges 100-2500, 100-1900, 100-860, and 100-460 meters were computed from 1578 profiles with four 41 weight digital high pass filters. Extreme values of the square root of gust speed are normally distributed. Monthly and annual exceedance probability distributions of normalized rms gust speeds in three altitude bands (2-7, 6-11, and 9-14 km) are log-normal. The rms gust speeds are largest in the 100-2500 wavelength band between 9 and 14 km in late winter and early spring. A study of monthly and annual exceedance probabilities and the number of occurrences per kilometer of level crossings with positive slope indicates significant variability with season, altitude, and filter configuration. A decile sampling scheme is tested and an optimum approach is suggested for drawing a relatively small random sample that represents the characteristic extreme wind speeds and shears of a large parent population of Jimsphere wind profiles.

  10. Boundary curves of individual items in the distribution of total depressive symptom scores approximate an exponential pattern in a general population.

    PubMed

    Tomitaka, Shinichiro; Kawasaki, Yohei; Ide, Kazuki; Akutagawa, Maiko; Yamada, Hiroshi; Furukawa, Toshiaki A; Ono, Yutaka

    2016-01-01

    Previously, we proposed a model for ordinal scale scoring in which individual thresholds for each item constitute a distribution by each item. This lead us to hypothesize that the boundary curves of each depressive symptom score in the distribution of total depressive symptom scores follow a common mathematical model, which is expressed as the product of the frequency of the total depressive symptom scores and the probability of the cumulative distribution function of each item threshold. To verify this hypothesis, we investigated the boundary curves of the distribution of total depressive symptom scores in a general population. Data collected from 21,040 subjects who had completed the Center for Epidemiologic Studies Depression Scale (CES-D) questionnaire as part of a national Japanese survey were analyzed. The CES-D consists of 20 items (16 negative items and four positive items). The boundary curves of adjacent item scores in the distribution of total depressive symptom scores for the 16 negative items were analyzed using log-normal scales and curve fitting. The boundary curves of adjacent item scores for a given symptom approximated a common linear pattern on a log normal scale. Curve fitting showed that an exponential fit had a markedly higher coefficient of determination than either linear or quadratic fits. With negative affect items, the gap between the total score curve and boundary curve continuously increased with increasing total depressive symptom scores on a log-normal scale, whereas the boundary curves of positive affect items, which are not considered manifest variables of the latent trait, did not exhibit such increases in this gap. The results of the present study support the hypothesis that the boundary curves of each depressive symptom score in the distribution of total depressive symptom scores commonly follow the predicted mathematical model, which was verified to approximate an exponential mathematical pattern.

  11. Boundary curves of individual items in the distribution of total depressive symptom scores approximate an exponential pattern in a general population

    PubMed Central

    Kawasaki, Yohei; Akutagawa, Maiko; Yamada, Hiroshi; Furukawa, Toshiaki A.; Ono, Yutaka

    2016-01-01

    Background Previously, we proposed a model for ordinal scale scoring in which individual thresholds for each item constitute a distribution by each item. This lead us to hypothesize that the boundary curves of each depressive symptom score in the distribution of total depressive symptom scores follow a common mathematical model, which is expressed as the product of the frequency of the total depressive symptom scores and the probability of the cumulative distribution function of each item threshold. To verify this hypothesis, we investigated the boundary curves of the distribution of total depressive symptom scores in a general population. Methods Data collected from 21,040 subjects who had completed the Center for Epidemiologic Studies Depression Scale (CES-D) questionnaire as part of a national Japanese survey were analyzed. The CES-D consists of 20 items (16 negative items and four positive items). The boundary curves of adjacent item scores in the distribution of total depressive symptom scores for the 16 negative items were analyzed using log-normal scales and curve fitting. Results The boundary curves of adjacent item scores for a given symptom approximated a common linear pattern on a log normal scale. Curve fitting showed that an exponential fit had a markedly higher coefficient of determination than either linear or quadratic fits. With negative affect items, the gap between the total score curve and boundary curve continuously increased with increasing total depressive symptom scores on a log-normal scale, whereas the boundary curves of positive affect items, which are not considered manifest variables of the latent trait, did not exhibit such increases in this gap. Discussion The results of the present study support the hypothesis that the boundary curves of each depressive symptom score in the distribution of total depressive symptom scores commonly follow the predicted mathematical model, which was verified to approximate an exponential mathematical pattern. PMID:27761346

  12. An inexact log-normal distribution-based stochastic chance-constrained model for agricultural water quality management

    NASA Astrophysics Data System (ADS)

    Wang, Yu; Fan, Jie; Xu, Ye; Sun, Wei; Chen, Dong

    2018-05-01

    In this study, an inexact log-normal-based stochastic chance-constrained programming model was developed for solving the non-point source pollution issues caused by agricultural activities. Compared to the general stochastic chance-constrained programming model, the main advantage of the proposed model is that it allows random variables to be expressed as a log-normal distribution, rather than a general normal distribution. Possible deviations in solutions caused by irrational parameter assumptions were avoided. The agricultural system management in the Erhai Lake watershed was used as a case study, where critical system factors, including rainfall and runoff amounts, show characteristics of a log-normal distribution. Several interval solutions were obtained under different constraint-satisfaction levels, which were useful in evaluating the trade-off between system economy and reliability. The applied results show that the proposed model could help decision makers to design optimal production patterns under complex uncertainties. The successful application of this model is expected to provide a good example for agricultural management in many other watersheds.

  13. LOG-NORMAL DISTRIBUTION OF COSMIC VOIDS IN SIMULATIONS AND MOCKS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Russell, E.; Pycke, J.-R., E-mail: er111@nyu.edu, E-mail: jrp15@nyu.edu

    2017-01-20

    Following up on previous studies, we complete here a full analysis of the void size distributions of the Cosmic Void Catalog based on three different simulation and mock catalogs: dark matter (DM), haloes, and galaxies. Based on this analysis, we attempt to answer two questions: Is a three-parameter log-normal distribution a good candidate to satisfy the void size distributions obtained from different types of environments? Is there a direct relation between the shape parameters of the void size distribution and the environmental effects? In an attempt to answer these questions, we find here that all void size distributions of thesemore » data samples satisfy the three-parameter log-normal distribution whether the environment is dominated by DM, haloes, or galaxies. In addition, the shape parameters of the three-parameter log-normal void size distribution seem highly affected by environment, particularly existing substructures. Therefore, we show two quantitative relations given by linear equations between the skewness and the maximum tree depth, and between the variance of the void size distribution and the maximum tree depth, directly from the simulated data. In addition to this, we find that the percentage of voids with nonzero central density in the data sets has a critical importance. If the number of voids with nonzero central density reaches ≥3.84% in a simulation/mock sample, then a second population is observed in the void size distributions. This second population emerges as a second peak in the log-normal void size distribution at larger radius.« less

  14. Vertical changes in the probability distribution of downward irradiance within the near-surface ocean under sunny conditions

    NASA Astrophysics Data System (ADS)

    Gernez, Pierre; Stramski, Dariusz; Darecki, Miroslaw

    2011-07-01

    Time series measurements of fluctuations in underwater downward irradiance, Ed, within the green spectral band (532 nm) show that the probability distribution of instantaneous irradiance varies greatly as a function of depth within the near-surface ocean under sunny conditions. Because of intense light flashes caused by surface wave focusing, the near-surface probability distributions are highly skewed to the right and are heavy tailed. The coefficients of skewness and excess kurtosis at depths smaller than 1 m can exceed 3 and 20, respectively. We tested several probability models, such as lognormal, Gumbel, Fréchet, log-logistic, and Pareto, which are potentially suited to describe the highly skewed heavy-tailed distributions. We found that the models cannot approximate with consistently good accuracy the high irradiance values within the right tail of the experimental distribution where the probability of these values is less than 10%. This portion of the distribution corresponds approximately to light flashes with Ed > 1.5?, where ? is the time-averaged downward irradiance. However, the remaining part of the probability distribution covering all irradiance values smaller than the 90th percentile can be described with a reasonable accuracy (i.e., within 20%) with a lognormal model for all 86 measurements from the top 10 m of the ocean included in this analysis. As the intensity of irradiance fluctuations decreases with depth, the probability distribution tends toward a function symmetrical around the mean like the normal distribution. For the examined data set, the skewness and excess kurtosis assumed values very close to zero at a depth of about 10 m.

  15. Pan-European comparison of candidate distributions for climatological drought indices, SPI and SPEI

    NASA Astrophysics Data System (ADS)

    Stagge, James; Tallaksen, Lena; Gudmundsson, Lukas; Van Loon, Anne; Stahl, Kerstin

    2013-04-01

    Drought indices are vital to objectively quantify and compare drought severity, duration, and extent across regions with varied climatic and hydrologic regimes. The Standardized Precipitation Index (SPI), a well-reviewed meterological drought index recommended by the WMO, and its more recent water balance variant, the Standardized Precipitation-Evapotranspiration Index (SPEI) both rely on selection of univariate probability distributions to normalize the index, allowing for comparisons across climates. The SPI, considered a universal meteorological drought index, measures anomalies in precipitation, whereas the SPEI measures anomalies in climatic water balance (precipitation minus potential evapotranspiration), a more comprehensive measure of water availability that incorporates temperature. Many reviewers recommend use of the gamma (Pearson Type III) distribution for SPI normalization, while developers of the SPEI recommend use of the three parameter log-logistic distribution, based on point observation validation. Before the SPEI can be implemented at the pan-European scale, it is necessary to further validate the index using a range of candidate distributions to determine sensitivity to distribution selection, identify recommended distributions, and highlight those instances where a given distribution may not be valid. This study rigorously compares a suite of candidate probability distributions using WATCH Forcing Data, a global, historical (1958-2001) climate dataset based on ERA40 reanalysis with 0.5 x 0.5 degree resolution and bias-correction based on CRU-TS2.1 observations. Using maximum likelihood estimation, alternative candidate distributions are fit for the SPI and SPEI across the range of European climate zones. When evaluated at this scale, the gamma distribution for the SPI results in negatively skewed values, exaggerating the index severity of extreme dry conditions, while decreasing the index severity of extreme high precipitation. This bias is particularly notable for shorter aggregation periods (1-6 months) during the summer months in southern Europe (below 45° latitude), and can partially be attributed to distribution fitting difficulties in semi-arid regions where monthly precipitation totals cluster near zero. By contrast, the SPEI has potential for avoiding this fitting difficulty because it is not bounded by zero. However, the recommended log-logistic distribution produces index values with less variation than the standard normal distribution. Among the alternative candidate distributions, the best fit distribution and the distribution parameters vary in space and time, suggesting regional commonalities within hydroclimatic regimes, as discussed further in the presentation.

  16. Geostatistics and Bayesian updating for transmissivity estimation in a multiaquifer system in Manitoba, Canada.

    PubMed

    Kennedy, Paula L; Woodbury, Allan D

    2002-01-01

    In ground water flow and transport modeling, the heterogeneous nature of porous media has a considerable effect on the resulting flow and solute transport. Some method of generating the heterogeneous field from a limited dataset of uncertain measurements is required. Bayesian updating is one method that interpolates from an uncertain dataset using the statistics of the underlying probability distribution function. In this paper, Bayesian updating was used to determine the heterogeneous natural log transmissivity field for a carbonate and a sandstone aquifer in southern Manitoba. It was determined that the transmissivity in m2/sec followed a natural log normal distribution for both aquifers with a mean of -7.2 and - 8.0 for the carbonate and sandstone aquifers, respectively. The variograms were calculated using an estimator developed by Li and Lake (1994). Fractal nature was not evident in the variogram from either aquifer. The Bayesian updating heterogeneous field provided good results even in cases where little data was available. A large transmissivity zone in the sandstone aquifer was created by the Bayesian procedure, which is not a reflection of any deterministic consideration, but is a natural outcome of updating a prior probability distribution function with observations. The statistical model returns a result that is very reasonable; that is homogeneous in regions where little or no information is available to alter an initial state. No long range correlation trends or fractal behavior of the log-transmissivity field was observed in either aquifer over a distance of about 300 km.

  17. Rockfall travel distances theoretical distributions

    NASA Astrophysics Data System (ADS)

    Jaboyedoff, Michel; Derron, Marc-Henri; Pedrazzini, Andrea

    2017-04-01

    The probability of propagation of rockfalls is a key part of hazard assessment, because it permits to extrapolate the probability of propagation of rockfall either based on partial data or simply theoretically. The propagation can be assumed frictional which permits to describe on average the propagation by a line of kinetic energy which corresponds to the loss of energy along the path. But loss of energy can also be assumed as a multiplicative process or a purely random process. The distributions of the rockfall block stop points can be deduced from such simple models, they lead to Gaussian, Inverse-Gaussian, Log-normal or exponential negative distributions. The theoretical background is presented, and the comparisons of some of these models with existing data indicate that these assumptions are relevant. The results are either based on theoretical considerations or by fitting results. They are potentially very useful for rockfall hazard zoning and risk assessment. This approach will need further investigations.

  18. How log-normal is your country? An analysis of the statistical distribution of the exported volumes of products

    NASA Astrophysics Data System (ADS)

    Annunziata, Mario Alberto; Petri, Alberto; Pontuale, Giorgio; Zaccaria, Andrea

    2016-10-01

    We have considered the statistical distributions of the volumes of 1131 products exported by 148 countries. We have found that the form of these distributions is not unique but heavily depends on the level of development of the nation, as expressed by macroeconomic indicators like GDP, GDP per capita, total export and a recently introduced measure for countries' economic complexity called fitness. We have identified three major classes: a) an incomplete log-normal shape, truncated on the left side, for the less developed countries, b) a complete log-normal, with a wider range of volumes, for nations characterized by intermediate economy, and c) a strongly asymmetric shape for countries with a high degree of development. Finally, the log-normality hypothesis has been checked for the distributions of all the 148 countries through different tests, Kolmogorov-Smirnov and Cramér-Von Mises, confirming that it cannot be rejected only for the countries of intermediate economy.

  19. Log-Normal Turbulence Dissipation in Global Ocean Models

    NASA Astrophysics Data System (ADS)

    Pearson, Brodie; Fox-Kemper, Baylor

    2018-03-01

    Data from turbulent numerical simulations of the global ocean demonstrate that the dissipation of kinetic energy obeys a nearly log-normal distribution even at large horizontal scales O (10 km ) . As the horizontal scales of resolved turbulence are larger than the ocean is deep, the Kolmogorov-Yaglom theory for intermittency in 3D homogeneous, isotropic turbulence cannot apply; instead, the down-scale potential enstrophy cascade of quasigeostrophic turbulence should. Yet, energy dissipation obeys approximate log-normality—robustly across depths, seasons, regions, and subgrid schemes. The distribution parameters, skewness and kurtosis, show small systematic departures from log-normality with depth and subgrid friction schemes. Log-normality suggests that a few high-dissipation locations dominate the integrated energy and enstrophy budgets, which should be taken into account when making inferences from simplified models and inferring global energy budgets from sparse observations.

  20. Inverse Gaussian gamma distribution model for turbulence-induced fading in free-space optical communication.

    PubMed

    Cheng, Mingjian; Guo, Ya; Li, Jiangting; Zheng, Xiaotong; Guo, Lixin

    2018-04-20

    We introduce an alternative distribution to the gamma-gamma (GG) distribution, called inverse Gaussian gamma (IGG) distribution, which can efficiently describe moderate-to-strong irradiance fluctuations. The proposed stochastic model is based on a modulation process between small- and large-scale irradiance fluctuations, which are modeled by gamma and inverse Gaussian distributions, respectively. The model parameters of the IGG distribution are directly related to atmospheric parameters. The accuracy of the fit among the IGG, log-normal, and GG distributions with the experimental probability density functions in moderate-to-strong turbulence are compared, and results indicate that the newly proposed IGG model provides an excellent fit to the experimental data. As the receiving diameter is comparable with the atmospheric coherence radius, the proposed IGG model can reproduce the shape of the experimental data, whereas the GG and LN models fail to match the experimental data. The fundamental channel statistics of a free-space optical communication system are also investigated in an IGG-distributed turbulent atmosphere, and a closed-form expression for the outage probability of the system is derived with Meijer's G-function.

  1. Modelling of PM10 concentration for industrialized area in Malaysia: A case study in Shah Alam

    NASA Astrophysics Data System (ADS)

    N, Norazian Mohamed; Abdullah, M. M. A.; Tan, Cheng-yau; Ramli, N. A.; Yahaya, A. S.; Fitri, N. F. M. Y.

    In Malaysia, the predominant air pollutants are suspended particulate matter (SPM) and nitrogen dioxide (NO2). This research is on PM10 as they may trigger harm to human health as well as environment. Six distributions, namely Weibull, log-normal, gamma, Rayleigh, Gumbel and Frechet were chosen to model the PM10 observations at the chosen industrial area i.e. Shah Alam. One-year period hourly average data for 2006 and 2007 were used for this research. For parameters estimation, method of maximum likelihood estimation (MLE) was selected. Four performance indicators that are mean absolute error (MAE), root mean squared error (RMSE), coefficient of determination (R2) and prediction accuracy (PA), were applied to determine the goodness-of-fit criteria of the distributions. The best distribution that fits with the PM10 observations in Shah Alamwas found to be log-normal distribution. The probabilities of the exceedences concentration were calculated and the return period for the coming year was predicted from the cumulative density function (cdf) obtained from the best-fit distributions. For the 2006 data, Shah Alam was predicted to exceed 150 μg/m3 for 5.9 days in 2007 with a return period of one occurrence per 62 days. For 2007, the studied area does not exceed the MAAQG of 150 μg/m3

  2. Frequency distribution of lithium in leaves of Lycium andersonii

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Romney, E.M.; Wallace, A.; Kinnear, J.

    1977-01-01

    Lycium andersonii A. Gray is an accumulator of Li. Assays were made of 200 samples of it collected from six different locations within the Northern Mojave Desert. Mean concentrations of Li varied from location to location and tended not to follow log/sub e/ normal distribution, and to follow a normal distribution only poorly. There was some negative skewness to the log/sub e/ distribution which did exist. The results imply that the variation in accumulation of Li depends upon native supply of Li. Possibly the Li supply and the ability of L. andersonii plants to accumulate it are both log/sub e/more » normally distributed. The mean leaf concentration of Li in all locations was 29 ..mu..g/g, but the maximum was 166 ..mu..g/g.« less

  3. Statistical characterization of a large geochemical database and effect of sample size

    USGS Publications Warehouse

    Zhang, C.; Manheim, F.T.; Hinde, J.; Grossman, J.N.

    2005-01-01

    The authors investigated statistical distributions for concentrations of chemical elements from the National Geochemical Survey (NGS) database of the U.S. Geological Survey. At the time of this study, the NGS data set encompasses 48,544 stream sediment and soil samples from the conterminous United States analyzed by ICP-AES following a 4-acid near-total digestion. This report includes 27 elements: Al, Ca, Fe, K, Mg, Na, P, Ti, Ba, Ce, Co, Cr, Cu, Ga, La, Li, Mn, Nb, Nd, Ni, Pb, Sc, Sr, Th, V, Y and Zn. The goal and challenge for the statistical overview was to delineate chemical distributions in a complex, heterogeneous data set spanning a large geographic range (the conterminous United States), and many different geological provinces and rock types. After declustering to create a uniform spatial sample distribution with 16,511 samples, histograms and quantile-quantile (Q-Q) plots were employed to delineate subpopulations that have coherent chemical and mineral affinities. Probability groupings are discerned by changes in slope (kinks) on the plots. Major rock-forming elements, e.g., Al, Ca, K and Na, tend to display linear segments on normal Q-Q plots. These segments can commonly be linked to petrologic or mineralogical associations. For example, linear segments on K and Na plots reflect dilution of clay minerals by quartz sand (low in K and Na). Minor and trace element relationships are best displayed on lognormal Q-Q plots. These sensitively reflect discrete relationships in subpopulations within the wide range of the data. For example, small but distinctly log-linear subpopulations for Pb, Cu, Zn and Ag are interpreted to represent ore-grade enrichment of naturally occurring minerals such as sulfides. None of the 27 chemical elements could pass the test for either normal or lognormal distribution on the declustered data set. Part of the reasons relate to the presence of mixtures of subpopulations and outliers. Random samples of the data set with successively smaller numbers of data points showed that few elements passed standard statistical tests for normality or log-normality until sample size decreased to a few hundred data points. Large sample size enhances the power of statistical tests, and leads to rejection of most statistical hypotheses for real data sets. For large sample sizes (e.g., n > 1000), graphical methods such as histogram, stem-and-leaf, and probability plots are recommended for rough judgement of probability distribution if needed. ?? 2005 Elsevier Ltd. All rights reserved.

  4. THE DEPENDENCE OF PRESTELLAR CORE MASS DISTRIBUTIONS ON THE STRUCTURE OF THE PARENTAL CLOUD

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parravano, Antonio; Sanchez, Nestor; Alfaro, Emilio J.

    2012-08-01

    The mass distribution of prestellar cores is obtained for clouds with arbitrary internal mass distributions using a selection criterion based on the thermal and turbulent Jeans mass and applied hierarchically from small to large scales. We have checked this methodology by comparing our results for a log-normal density probability distribution function with the theoretical core mass function (CMF) derived by Hennebelle and Chabrier, namely a power law at large scales and a log-normal cutoff at low scales, but our method can be applied to any mass distributions representing a star-forming cloud. This methodology enables us to connect the parental cloudmore » structure with the mass distribution of the cores and their spatial distribution, providing an efficient tool for investigating the physical properties of the molecular clouds that give rise to the prestellar core distributions observed. Simulated fractional Brownian motion (fBm) clouds with the Hurst exponent close to the value H = 1/3 give the best agreement with the theoretical CMF derived by Hennebelle and Chabrier and Chabrier's system initial mass function. Likewise, the spatial distribution of the cores derived from our methodology shows a surface density of companions compatible with those observed in Trapezium and Ophiucus star-forming regions. This method also allows us to analyze the properties of the mass distribution of cores for different realizations. We found that the variations in the number of cores formed in different realizations of fBm clouds (with the same Hurst exponent) are much larger than the expected root N statistical fluctuations, increasing with H.« less

  5. The Dependence of Prestellar Core Mass Distributions on the Structure of the Parental Cloud

    NASA Astrophysics Data System (ADS)

    Parravano, Antonio; Sánchez, Néstor; Alfaro, Emilio J.

    2012-08-01

    The mass distribution of prestellar cores is obtained for clouds with arbitrary internal mass distributions using a selection criterion based on the thermal and turbulent Jeans mass and applied hierarchically from small to large scales. We have checked this methodology by comparing our results for a log-normal density probability distribution function with the theoretical core mass function (CMF) derived by Hennebelle & Chabrier, namely a power law at large scales and a log-normal cutoff at low scales, but our method can be applied to any mass distributions representing a star-forming cloud. This methodology enables us to connect the parental cloud structure with the mass distribution of the cores and their spatial distribution, providing an efficient tool for investigating the physical properties of the molecular clouds that give rise to the prestellar core distributions observed. Simulated fractional Brownian motion (fBm) clouds with the Hurst exponent close to the value H = 1/3 give the best agreement with the theoretical CMF derived by Hennebelle & Chabrier and Chabrier's system initial mass function. Likewise, the spatial distribution of the cores derived from our methodology shows a surface density of companions compatible with those observed in Trapezium and Ophiucus star-forming regions. This method also allows us to analyze the properties of the mass distribution of cores for different realizations. We found that the variations in the number of cores formed in different realizations of fBm clouds (with the same Hurst exponent) are much larger than the expected root {\\cal N} statistical fluctuations, increasing with H.

  6. Quantifying nonstationary radioactivity concentration fluctuations near Chernobyl: A complete statistical description

    NASA Astrophysics Data System (ADS)

    Viswanathan, G. M.; Buldyrev, S. V.; Garger, E. K.; Kashpur, V. A.; Lucena, L. S.; Shlyakhter, A.; Stanley, H. E.; Tschiersch, J.

    2000-09-01

    We analyze nonstationary 137Cs atmospheric activity concentration fluctuations measured near Chernobyl after the 1986 disaster and find three new results: (i) the histogram of fluctuations is well described by a log-normal distribution; (ii) there is a pronounced spectral component with period T=1yr, and (iii) the fluctuations are long-range correlated. These findings allow us to quantify two fundamental statistical properties of the data: the probability distribution and the correlation properties of the time series. We interpret our findings as evidence that the atmospheric radionuclide resuspension processes are tightly coupled to the surrounding ecosystems and to large time scale weather patterns.

  7. Football goal distributions and extremal statistics

    NASA Astrophysics Data System (ADS)

    Greenhough, J.; Birch, P. C.; Chapman, S. C.; Rowlands, G.

    2002-12-01

    We analyse the distributions of the number of goals scored by home teams, away teams, and the total scored in the match, in domestic football games from 169 countries between 1999 and 2001. The probability density functions (PDFs) of goals scored are too heavy-tailed to be fitted over their entire ranges by Poisson or negative binomial distributions which would be expected for uncorrelated processes. Log-normal distributions cannot include zero scores and here we find that the PDFs are consistent with those arising from extremal statistics. In addition, we show that it is sufficient to model English top division and FA Cup matches in the seasons of 1970/71-2000/01 on Poisson or negative binomial distributions, as reported in analyses of earlier seasons, and that these are not consistent with extremal statistics.

  8. Optimal transformations leading to normal distributions of positron emission tomography standardized uptake values.

    PubMed

    Scarpelli, Matthew; Eickhoff, Jens; Cuna, Enrique; Perlman, Scott; Jeraj, Robert

    2018-01-30

    The statistical analysis of positron emission tomography (PET) standardized uptake value (SUV) measurements is challenging due to the skewed nature of SUV distributions. This limits utilization of powerful parametric statistical models for analyzing SUV measurements. An ad-hoc approach, which is frequently used in practice, is to blindly use a log transformation, which may or may not result in normal SUV distributions. This study sought to identify optimal transformations leading to normally distributed PET SUVs extracted from tumors and assess the effects of therapy on the optimal transformations. The optimal transformation for producing normal distributions of tumor SUVs was identified by iterating the Box-Cox transformation parameter (λ) and selecting the parameter that maximized the Shapiro-Wilk P-value. Optimal transformations were identified for tumor SUV max distributions at both pre and post treatment. This study included 57 patients that underwent 18 F-fluorodeoxyglucose ( 18 F-FDG) PET scans (publically available dataset). In addition, to test the generality of our transformation methodology, we included analysis of 27 patients that underwent 18 F-Fluorothymidine ( 18 F-FLT) PET scans at our institution. After applying the optimal Box-Cox transformations, neither the pre nor the post treatment 18 F-FDG SUV distributions deviated significantly from normality (P  >  0.10). Similar results were found for 18 F-FLT PET SUV distributions (P  >  0.10). For both 18 F-FDG and 18 F-FLT SUV distributions, the skewness and kurtosis increased from pre to post treatment, leading to a decrease in the optimal Box-Cox transformation parameter from pre to post treatment. There were types of distributions encountered for both 18 F-FDG and 18 F-FLT where a log transformation was not optimal for providing normal SUV distributions. Optimization of the Box-Cox transformation, offers a solution for identifying normal SUV transformations for when the log transformation is insufficient. The log transformation is not always the appropriate transformation for producing normally distributed PET SUVs.

  9. Optimal transformations leading to normal distributions of positron emission tomography standardized uptake values

    NASA Astrophysics Data System (ADS)

    Scarpelli, Matthew; Eickhoff, Jens; Cuna, Enrique; Perlman, Scott; Jeraj, Robert

    2018-02-01

    The statistical analysis of positron emission tomography (PET) standardized uptake value (SUV) measurements is challenging due to the skewed nature of SUV distributions. This limits utilization of powerful parametric statistical models for analyzing SUV measurements. An ad-hoc approach, which is frequently used in practice, is to blindly use a log transformation, which may or may not result in normal SUV distributions. This study sought to identify optimal transformations leading to normally distributed PET SUVs extracted from tumors and assess the effects of therapy on the optimal transformations. Methods. The optimal transformation for producing normal distributions of tumor SUVs was identified by iterating the Box-Cox transformation parameter (λ) and selecting the parameter that maximized the Shapiro-Wilk P-value. Optimal transformations were identified for tumor SUVmax distributions at both pre and post treatment. This study included 57 patients that underwent 18F-fluorodeoxyglucose (18F-FDG) PET scans (publically available dataset). In addition, to test the generality of our transformation methodology, we included analysis of 27 patients that underwent 18F-Fluorothymidine (18F-FLT) PET scans at our institution. Results. After applying the optimal Box-Cox transformations, neither the pre nor the post treatment 18F-FDG SUV distributions deviated significantly from normality (P  >  0.10). Similar results were found for 18F-FLT PET SUV distributions (P  >  0.10). For both 18F-FDG and 18F-FLT SUV distributions, the skewness and kurtosis increased from pre to post treatment, leading to a decrease in the optimal Box-Cox transformation parameter from pre to post treatment. There were types of distributions encountered for both 18F-FDG and 18F-FLT where a log transformation was not optimal for providing normal SUV distributions. Conclusion. Optimization of the Box-Cox transformation, offers a solution for identifying normal SUV transformations for when the log transformation is insufficient. The log transformation is not always the appropriate transformation for producing normally distributed PET SUVs.

  10. Failure probability under parameter uncertainty.

    PubMed

    Gerrard, R; Tsanakas, A

    2011-05-01

    In many problems of risk analysis, failure is equivalent to the event of a random risk factor exceeding a given threshold. Failure probabilities can be controlled if a decisionmaker is able to set the threshold at an appropriate level. This abstract situation applies, for example, to environmental risks with infrastructure controls; to supply chain risks with inventory controls; and to insurance solvency risks with capital controls. However, uncertainty around the distribution of the risk factor implies that parameter error will be present and the measures taken to control failure probabilities may not be effective. We show that parameter uncertainty increases the probability (understood as expected frequency) of failures. For a large class of loss distributions, arising from increasing transformations of location-scale families (including the log-normal, Weibull, and Pareto distributions), the article shows that failure probabilities can be exactly calculated, as they are independent of the true (but unknown) parameters. Hence it is possible to obtain an explicit measure of the effect of parameter uncertainty on failure probability. Failure probability can be controlled in two different ways: (1) by reducing the nominal required failure probability, depending on the size of the available data set, and (2) by modifying of the distribution itself that is used to calculate the risk control. Approach (1) corresponds to a frequentist/regulatory view of probability, while approach (2) is consistent with a Bayesian/personalistic view. We furthermore show that the two approaches are consistent in achieving the required failure probability. Finally, we briefly discuss the effects of data pooling and its systemic risk implications. © 2010 Society for Risk Analysis.

  11. Multiwavelength Studies of Rotating Radio Transients

    NASA Astrophysics Data System (ADS)

    Miller, Joshua J.

    Seven years ago, a new class of pulsars called the Rotating Radio Transients (RRATs) was discovered with the Parkes radio telescope in Australia (McLaughlin et al., 2006). These neutron stars are characterized by strong radio bursts at repeatable dispersion measures, but not detectable using standard periodicity-search algorithms. We now know of roughly 100 of these objects, discovered in new surveys and re-analysis of archival survey data. They generally have longer periods than those of the normal pulsar population, and several have high magnetic fields, similar to those other neutron star populations like the X-ray bright magnetars. However, some of the RRATs have spin-down properties very similar to those of normal pulsars, making it difficult to determine the cause of their unusual emission and possible evolutionary relationships between them and other classes of neutron stars. We have calculated single-pulse flux densities for eight RRAT sources observed using the Parkes radio telescope. Like normal pulsars, the pulse amplitude distributions are well described by log-normal probability distribution functions, though two show evidence for an additional power-law tail. Spectral indices are calculated for the seven RRATs which were detected at multiple frequencies. These RRATs have a mean spectral index of = -3.2(7), or = -3.1(1) when using mean flux densities derived from fitting log-normal probability distribution functions to the pulse amplitude distributions, suggesting that the RRATs have steeper spectra than normal pulsars. When only considering the three RRATs for which we have a wide range of observing frequencies, however, and become --1.7(1) and --2.0(1), respectively, and are roughly consistent with those measured for normal pulsars. In all cases, these spectral indices exclude magnetar-like flat spectra. For PSR J1819--1458, the RRAT with the highest bursting rate, pulses were detected at 685 and 3029 MHz in simultaneous observations and have a spectral index consistent with our other analysis. We also present the results of simultaneous radio and X-ray observations of PSR J1819--1458. Our 94-ks XMM-Newton observation of the high magnetic field (~5x109 T) pulsar reveals a blackbody spectrum ( kT~130 eV) with a broad absorption feature, possibly composed of two lines at ~1.0 and ~1.3 keV. We performed a correlation analysis of the X-ray photons with radio pulses detected in 16.2 hours of simultaneous observations at 1--2 GHz with the Green Bank, Effelsberg, and Parkes telescopes, respectively. Both the detected X-ray photons and radio pulses appear to be randomly distributed in time. We find tentative evidence for a correlation between the detected radio pulses and X-ray photons on timescales of less than 10 pulsar spin periods, with the probability of this occurring by chance being 0.46%. This suggests that the physical process producing the radio pulses may also heat the polar cap.

  12. The distribution of the intervals between neural impulses in the maintained discharges of retinal ganglion cells.

    PubMed

    Levine, M W

    1991-01-01

    Simulated neural impulse trains were generated by a digital realization of the integrate-and-fire model. The variability in these impulse trains had as its origin a random noise of specified distribution. Three different distributions were used: the normal (Gaussian) distribution (no skew, normokurtic), a first-order gamma distribution (positive skew, leptokurtic), and a uniform distribution (no skew, platykurtic). Despite these differences in the distribution of the variability, the distributions of the intervals between impulses were nearly indistinguishable. These inter-impulse distributions were better fit with a hyperbolic gamma distribution than a hyperbolic normal distribution, although one might expect a better approximation for normally distributed inverse intervals. Consideration of why the inter-impulse distribution is independent of the distribution of the causative noise suggests two putative interval distributions that do not depend on the assumed noise distribution: the log normal distribution, which is predicated on the assumption that long intervals occur with the joint probability of small input values, and the random walk equation, which is the diffusion equation applied to a random walk model of the impulse generating process. Either of these equations provides a more satisfactory fit to the simulated impulse trains than the hyperbolic normal or hyperbolic gamma distributions. These equations also provide better fits to impulse trains derived from the maintained discharges of ganglion cells in the retinae of cats or goldfish. It is noted that both equations are free from the constraint that the coefficient of variation (CV) have a maximum of unity.(ABSTRACT TRUNCATED AT 250 WORDS)

  13. Computation of distribution of minimum resolution for log-normal distribution of chromatographic peak heights.

    PubMed

    Davis, Joe M

    2011-10-28

    General equations are derived for the distribution of minimum resolution between two chromatographic peaks, when peak heights in a multi-component chromatogram follow a continuous statistical distribution. The derivation draws on published theory by relating the area under the distribution of minimum resolution to the area under the distribution of the ratio of peak heights, which in turn is derived from the peak-height distribution. Two procedures are proposed for the equations' numerical solution. The procedures are applied to the log-normal distribution, which recently was reported to describe the distribution of component concentrations in three complex natural mixtures. For published statistical parameters of these mixtures, the distribution of minimum resolution is similar to that for the commonly assumed exponential distribution of peak heights used in statistical-overlap theory. However, these two distributions of minimum resolution can differ markedly, depending on the scale parameter of the log-normal distribution. Theory for the computation of the distribution of minimum resolution is extended to other cases of interest. With the log-normal distribution of peak heights as an example, the distribution of minimum resolution is computed when small peaks are lost due to noise or detection limits, and when the height of at least one peak is less than an upper limit. The distribution of minimum resolution shifts slightly to lower resolution values in the first case and to markedly larger resolution values in the second one. The theory and numerical procedure are confirmed by Monte Carlo simulation. Copyright © 2011 Elsevier B.V. All rights reserved.

  14. Selecting the right statistical model for analysis of insect count data by using information theoretic measures.

    PubMed

    Sileshi, G

    2006-10-01

    Researchers and regulatory agencies often make statistical inferences from insect count data using modelling approaches that assume homogeneous variance. Such models do not allow for formal appraisal of variability which in its different forms is the subject of interest in ecology. Therefore, the objectives of this paper were to (i) compare models suitable for handling variance heterogeneity and (ii) select optimal models to ensure valid statistical inferences from insect count data. The log-normal, standard Poisson, Poisson corrected for overdispersion, zero-inflated Poisson, the negative binomial distribution and zero-inflated negative binomial models were compared using six count datasets on foliage-dwelling insects and five families of soil-dwelling insects. Akaike's and Schwarz Bayesian information criteria were used for comparing the various models. Over 50% of the counts were zeros even in locally abundant species such as Ootheca bennigseni Weise, Mesoplatys ochroptera Stål and Diaecoderus spp. The Poisson model after correction for overdispersion and the standard negative binomial distribution model provided better description of the probability distribution of seven out of the 11 insects than the log-normal, standard Poisson, zero-inflated Poisson or zero-inflated negative binomial models. It is concluded that excess zeros and variance heterogeneity are common data phenomena in insect counts. If not properly modelled, these properties can invalidate the normal distribution assumptions resulting in biased estimation of ecological effects and jeopardizing the integrity of the scientific inferences. Therefore, it is recommended that statistical models appropriate for handling these data properties be selected using objective criteria to ensure efficient statistical inference.

  15. On generalisations of the log-Normal distribution by means of a new product definition in the Kapteyn process

    NASA Astrophysics Data System (ADS)

    Duarte Queirós, Sílvio M.

    2012-07-01

    We discuss the modification of the Kapteyn multiplicative process using the q-product of Borges [E.P. Borges, A possible deformed algebra and calculus inspired in nonextensive thermostatistics, Physica A 340 (2004) 95]. Depending on the value of the index q a generalisation of the log-Normal distribution is yielded. Namely, the distribution increases the tail for small (when q<1) or large (when q>1) values of the variable upon analysis. The usual log-Normal distribution is retrieved when q=1, which corresponds to the traditional Kapteyn multiplicative process. The main statistical features of this distribution as well as related random number generators and tables of quantiles of the Kolmogorov-Smirnov distance are presented. Finally, we illustrate the validity of this scenario by describing a set of variables of biological and financial origin.

  16. Binary data corruption due to a Brownian agent

    NASA Astrophysics Data System (ADS)

    Newman, T. J.; Triampo, Wannapong

    1999-05-01

    We introduce a model of binary data corruption induced by a Brownian agent (active random walker) on a d-dimensional lattice. A continuum formulation allows the exact calculation of several quantities related to the density of corrupted bits ρ, for example, the mean of ρ and the density-density correlation function. Excellent agreement is found with the results from numerical simulations. We also calculate the probability distribution of ρ in d=1, which is found to be log normal, indicating that the system is governed by extreme fluctuations.

  17. Statistical methods of fracture characterization using acoustic borehole televiewer log interpretation

    NASA Astrophysics Data System (ADS)

    Massiot, Cécile; Townend, John; Nicol, Andrew; McNamara, David D.

    2017-08-01

    Acoustic borehole televiewer (BHTV) logs provide measurements of fracture attributes (orientations, thickness, and spacing) at depth. Orientation, censoring, and truncation sampling biases similar to those described for one-dimensional outcrop scanlines, and other logging or drilling artifacts specific to BHTV logs, can affect the interpretation of fracture attributes from BHTV logs. K-means, fuzzy K-means, and agglomerative clustering methods provide transparent means of separating fracture groups on the basis of their orientation. Fracture spacing is calculated for each of these fracture sets. Maximum likelihood estimation using truncated distributions permits the fitting of several probability distributions to the fracture attribute data sets within truncation limits, which can then be extrapolated over the entire range where they naturally occur. Akaike Information Criterion (AIC) and Schwartz Bayesian Criterion (SBC) statistical information criteria rank the distributions by how well they fit the data. We demonstrate these attribute analysis methods with a data set derived from three BHTV logs acquired from the high-temperature Rotokawa geothermal field, New Zealand. Varying BHTV log quality reduces the number of input data points, but careful selection of the quality levels where fractures are deemed fully sampled increases the reliability of the analysis. Spacing data analysis comprising up to 300 data points and spanning three orders of magnitude can be approximated similarly well (similar AIC rankings) with several distributions. Several clustering configurations and probability distributions can often characterize the data at similar levels of statistical criteria. Thus, several scenarios should be considered when using BHTV log data to constrain numerical fracture models.

  18. Log-normal distribution of the trace element data results from a mixture of stocahstic input and deterministic internal dynamics.

    PubMed

    Usuda, Kan; Kono, Koichi; Dote, Tomotaro; Shimizu, Hiroyasu; Tominaga, Mika; Koizumi, Chisato; Nakase, Emiko; Toshina, Yumi; Iwai, Junko; Kawasaki, Takashi; Akashi, Mitsuya

    2002-04-01

    In previous article, we showed a log-normal distribution of boron and lithium in human urine. This type of distribution is common in both biological and nonbiological applications. It can be observed when the effects of many independent variables are combined, each of which having any underlying distribution. Although elemental excretion depends on many variables, the one-compartment open model following a first-order process can be used to explain the elimination of elements. The rate of excretion is proportional to the amount present of any given element; that is, the same percentage of an existing element is eliminated per unit time, and the element concentration is represented by a deterministic negative power function of time in the elimination time-course. Sampling is of a stochastic nature, so the dataset of time variables in the elimination phase when the sample was obtained is expected to show Normal distribution. The time variable appears as an exponent of the power function, so a concentration histogram is that of an exponential transformation of Normally distributed time. This is the reason why the element concentration shows a log-normal distribution. The distribution is determined not by the element concentration itself, but by the time variable that defines the pharmacokinetic equation.

  19. Wealth of the world's richest publicly traded companies per industry and per employee: Gamma, Log-normal and Pareto power-law as universal distributions?

    NASA Astrophysics Data System (ADS)

    Soriano-Hernández, P.; del Castillo-Mussot, M.; Campirán-Chávez, I.; Montemayor-Aldrete, J. A.

    2017-04-01

    Forbes Magazine published its list of leading or strongest publicly-traded two thousand companies in the world (G-2000) based on four independent metrics: sales or revenues, profits, assets and market value. Every one of these wealth metrics yields particular information on the corporate size or wealth size of each firm. The G-2000 cumulative probability wealth distribution per employee (per capita) for all four metrics exhibits a two-class structure: quasi-exponential in the lower part, and a Pareto power-law in the higher part. These two-class structure per capita distributions are qualitatively similar to income and wealth distributions in many countries of the world, but the fraction of firms per employee within the high-class Pareto is about 49% in sales per employee, and 33% after averaging on the four metrics, whereas in countries the fraction of rich agents in the Pareto zone is less than 10%. The quasi-exponential zone can be adjusted by Gamma or Log-normal distributions. On the other hand, Forbes classifies the G-2000 firms in 82 different industries or economic activities. Within each industry, the wealth distribution per employee also follows a two-class structure, but when the aggregate wealth of firms in each industry for the four metrics is divided by the total number of employees in that industry, then the 82 points of the aggregate wealth distribution by industry per employee can be well adjusted by quasi-exponential curves for the four metrics.

  20. Estimation of the incubation period of invasive aspergillosis by survival models in acute myeloid leukemia patients.

    PubMed

    Bénet, Thomas; Voirin, Nicolas; Nicolle, Marie-Christine; Picot, Stephane; Michallet, Mauricette; Vanhems, Philippe

    2013-02-01

    The duration of the incubation of invasive aspergillosis (IA) remains unknown. The objective of this investigation was to estimate the time interval between aplasia onset and that of IA symptoms in acute myeloid leukemia (AML) patients. A single-centre prospective survey (2004-2009) included all patients with AML and probable/proven IA. Parametric survival models were fitted to the distribution of the time intervals between aplasia onset and IA. Overall, 53 patients had IA after aplasia, with the median observed time interval between the two being 15 days. Based on log-normal distribution, the median estimated IA incubation period was 14.6 days (95% CI; 12.8-16.5 days).

  1. Scoring in genetically modified organism proficiency tests based on log-transformed results.

    PubMed

    Thompson, Michael; Ellison, Stephen L R; Owen, Linda; Mathieson, Kenneth; Powell, Joanne; Key, Pauline; Wood, Roger; Damant, Andrew P

    2006-01-01

    The study considers data from 2 UK-based proficiency schemes and includes data from a total of 29 rounds and 43 test materials over a period of 3 years. The results from the 2 schemes are similar and reinforce each other. The amplification process used in quantitative polymerase chain reaction determinations predicts a mixture of normal, binomial, and lognormal distributions dominated by the latter 2. As predicted, the study results consistently follow a positively skewed distribution. Log-transformation prior to calculating z-scores is effective in establishing near-symmetric distributions that are sufficiently close to normal to justify interpretation on the basis of the normal distribution.

  2. Parameter estimation and forecasting for multiplicative log-normal cascades.

    PubMed

    Leövey, Andrés E; Lux, Thomas

    2012-04-01

    We study the well-known multiplicative log-normal cascade process in which the multiplication of Gaussian and log normally distributed random variables yields time series with intermittent bursts of activity. Due to the nonstationarity of this process and the combinatorial nature of such a formalism, its parameters have been estimated mostly by fitting the numerical approximation of the associated non-Gaussian probability density function to empirical data, cf. Castaing et al. [Physica D 46, 177 (1990)]. More recently, alternative estimators based upon various moments have been proposed by Beck [Physica D 193, 195 (2004)] and Kiyono et al. [Phys. Rev. E 76, 041113 (2007)]. In this paper, we pursue this moment-based approach further and develop a more rigorous generalized method of moments (GMM) estimation procedure to cope with the documented difficulties of previous methodologies. We show that even under uncertainty about the actual number of cascade steps, our methodology yields very reliable results for the estimated intermittency parameter. Employing the Levinson-Durbin algorithm for best linear forecasts, we also show that estimated parameters can be used for forecasting the evolution of the turbulent flow. We compare forecasting results from the GMM and Kiyono et al.'s procedure via Monte Carlo simulations. We finally test the applicability of our approach by estimating the intermittency parameter and forecasting of volatility for a sample of financial data from stock and foreign exchange markets.

  3. A comparison of Probability Of Detection (POD) data determined using different statistical methods

    NASA Astrophysics Data System (ADS)

    Fahr, A.; Forsyth, D.; Bullock, M.

    1993-12-01

    Different statistical methods have been suggested for determining probability of detection (POD) data for nondestructive inspection (NDI) techniques. A comparative assessment of various methods of determining POD was conducted using results of three NDI methods obtained by inspecting actual aircraft engine compressor disks which contained service induced cracks. The study found that the POD and 95 percent confidence curves as a function of crack size as well as the 90/95 percent crack length vary depending on the statistical method used and the type of data. The distribution function as well as the parameter estimation procedure used for determining POD and the confidence bound must be included when referencing information such as the 90/95 percent crack length. The POD curves and confidence bounds determined using the range interval method are very dependent on information that is not from the inspection data. The maximum likelihood estimators (MLE) method does not require such information and the POD results are more reasonable. The log-logistic function appears to model POD of hit/miss data relatively well and is easy to implement. The log-normal distribution using MLE provides more realistic POD results and is the preferred method. Although it is more complicated and slower to calculate, it can be implemented on a common spreadsheet program.

  4. The Tail Exponent for Stock Returns in Bursa Malaysia for 2003-2008

    NASA Astrophysics Data System (ADS)

    Rusli, N. H.; Gopir, G.; Usang, M. D.

    2010-07-01

    A developed discipline of econophysics that has been introduced is exhibiting the application of mathematical tools that are usually applied to the physical models for the study of financial models. In this study, an analysis of the time series behavior of several blue chip and penny stock companies in Main Market of Bursa Malaysia has been performed. Generally, the basic quantity being used is the relative price changes or is called the stock price returns, contains daily-sampled data from the beginning of 2003 until the end of 2008, containing 1555 trading days recorded. The aim of this paper is to investigate the tail exponent in tails of the distribution for blue chip stocks and penny stocks financial returns in six years period. By using a standard regression method, it is found that the distribution performed double scaling on the log-log plot of the cumulative probability of the normalized returns. Thus we calculate α for a small scale return as well as large scale return. Based on the result obtained, it is found that the power-law behavior for the probability density functions of the stock price absolute returns P(z)˜z-α with values lying inside and outside the Lévy stable regime with values α>2. All the results were discussed in detail.

  5. A New Bond Albedo for Performing Orbital Debris Brightness to Size Transformations

    NASA Technical Reports Server (NTRS)

    Mulrooney, Mark K.; Matney, Mark J.

    2008-01-01

    We have developed a technique for estimating the intrinsic size distribution of orbital debris objects via optical measurements alone. The process is predicated on the empirically observed power-law size distribution of debris (as indicated by radar RCS measurements) and the log-normal probability distribution of optical albedos as ascertained from phase (Lambertian) and range-corrected telescopic brightness measurements. Since the observed distribution of optical brightness is the product integral of the size distribution of the parent [debris] population with the albedo probability distribution, it is a straightforward matter to transform a given distribution of optical brightness back to a size distribution by the appropriate choice of a single albedo value. This is true because the integration of a powerlaw with a log-normal distribution (Fredholm Integral of the First Kind) yields a Gaussian-blurred power-law distribution with identical power-law exponent. Application of a single albedo to this distribution recovers a simple power-law [in size] which is linearly offset from the original distribution by a constant whose value depends on the choice of the albedo. Significantly, there exists a unique Bond albedo which, when applied to an observed brightness distribution, yields zero offset and therefore recovers the original size distribution. For physically realistic powerlaws of negative slope, the proper choice of albedo recovers the parent size distribution by compensating for the observational bias caused by the large number of small objects that appear anomalously large (bright) - and thereby skew the small population upward by rising above the detection threshold - and the lower number of large objects that appear anomalously small (dim). Based on this comprehensive analysis, a value of 0.13 should be applied to all orbital debris albedo-based brightness-to-size transformations regardless of data source. Its prima fascia genesis, derived and constructed from the current RCS to size conversion methodology (SiBAM Size-Based Estimation Model) and optical data reduction standards, assures consistency in application with the prior canonical value of 0.1. Herein we present the empirical and mathematical arguments for this approach and by example apply it to a comprehensive set of photometric data acquired via NASA's Liquid Mirror Telescopes during the 2000-2001 observing season.

  6. Performance analysis of MIMO wireless optical communication system with Q-ary PPM over correlated log-normal fading channel

    NASA Astrophysics Data System (ADS)

    Wang, Huiqin; Wang, Xue; Lynette, Kibe; Cao, Minghua

    2018-06-01

    The performance of multiple-input multiple-output wireless optical communication systems that adopt Q-ary pulse position modulation over spatial correlated log-normal fading channel is analyzed in terms of its un-coded bit error rate and ergodic channel capacity. The analysis is based on the Wilkinson's method which approximates the distribution of a sum of correlated log-normal random variables to a log-normal random variable. The analytical and simulation results corroborate the increment of correlation coefficients among sub-channels lead to system performance degradation. Moreover, the receiver diversity has better performance in resistance of spatial correlation caused channel fading.

  7. Ventilation-perfusion distribution in normal subjects.

    PubMed

    Beck, Kenneth C; Johnson, Bruce D; Olson, Thomas P; Wilson, Theodore A

    2012-09-01

    Functional values of LogSD of the ventilation distribution (σ(V)) have been reported previously, but functional values of LogSD of the perfusion distribution (σ(q)) and the coefficient of correlation between ventilation and perfusion (ρ) have not been measured in humans. Here, we report values for σ(V), σ(q), and ρ obtained from wash-in data for three gases, helium and two soluble gases, acetylene and dimethyl ether. Normal subjects inspired gas containing the test gases, and the concentrations of the gases at end-expiration during the first 10 breaths were measured with the subjects at rest and at increasing levels of exercise. The regional distribution of ventilation and perfusion was described by a bivariate log-normal distribution with parameters σ(V), σ(q), and ρ, and these parameters were evaluated by matching the values of expired gas concentrations calculated for this distribution to the measured values. Values of cardiac output and LogSD ventilation/perfusion (Va/Q) were obtained. At rest, σ(q) is high (1.08 ± 0.12). With the onset of ventilation, σ(q) decreases to 0.85 ± 0.09 but remains higher than σ(V) (0.43 ± 0.09) at all exercise levels. Rho increases to 0.87 ± 0.07, and the value of LogSD Va/Q for light and moderate exercise is primarily the result of the difference between the magnitudes of σ(q) and σ(V). With known values for the parameters, the bivariate distribution describes the comprehensive distribution of ventilation and perfusion that underlies the distribution of the Va/Q ratio.

  8. Likelihood-based confidence intervals for estimating floods with given return periods

    NASA Astrophysics Data System (ADS)

    Martins, Eduardo Sávio P. R.; Clarke, Robin T.

    1993-06-01

    This paper discusses aspects of the calculation of likelihood-based confidence intervals for T-year floods, with particular reference to (1) the two-parameter gamma distribution; (2) the Gumbel distribution; (3) the two-parameter log-normal distribution, and other distributions related to the normal by Box-Cox transformations. Calculation of the confidence limits is straightforward using the Nelder-Mead algorithm with a constraint incorporated, although care is necessary to ensure convergence either of the Nelder-Mead algorithm, or of the Newton-Raphson calculation of maximum-likelihood estimates. Methods are illustrated using records from 18 gauging stations in the basin of the River Itajai-Acu, State of Santa Catarina, southern Brazil. A small and restricted simulation compared likelihood-based confidence limits with those given by use of the central limit theorem; for the same confidence probability, the confidence limits of the simulation were wider than those of the central limit theorem, which failed more frequently to contain the true quantile being estimated. The paper discusses possible applications of likelihood-based confidence intervals in other areas of hydrological analysis.

  9. Load-Based Lower Neck Injury Criteria for Females from Rear Impact from Cadaver Experiments.

    PubMed

    Yoganandan, Narayan; Pintar, Frank A; Banerjee, Anjishnu

    2017-05-01

    The objectives of this study were to derive lower neck injury metrics/criteria and injury risk curves for the force, moment, and interaction criterion in rear impacts for females. Biomechanical data were obtained from previous intact and isolated post mortem human subjects and head-neck complexes subjected to posteroanterior accelerative loading. Censored data were used in the survival analysis model. The primary shear force, sagittal bending moment, and interaction (lower neck injury criterion, LN ic ) metrics were significant predictors of injury. The most optimal distribution was selected (Weibulll, log normal, or log logistic) using the Akaike information criterion according to the latest ISO recommendations for deriving risk curves. The Kolmogorov-Smirnov test was used to quantify robustness of the assumed parametric model. The intercepts for the interaction index were extracted from the primary risk curves. Normalized confidence interval sizes (NCIS) were reported at discrete probability levels, along with the risk curves and 95% confidence intervals. The mean force of 214 N, moment of 54 Nm, and 0.89 LN ic were associated with a five percent probability of injury. The NCIS for these metrics were 0.90, 0.95, and 0.85. These preliminary results can be used as a first step in the definition of lower neck injury criteria for women under posteroanterior accelerative loading in crashworthiness evaluations.

  10. Optimized lower leg injury probability curves from postmortem human subject tests under axial impacts.

    PubMed

    Yoganandan, Narayan; Arun, Mike W J; Pintar, Frank A; Szabo, Aniko

    2014-01-01

    Derive optimum injury probability curves to describe human tolerance of the lower leg using parametric survival analysis. The study reexamined lower leg postmortem human subjects (PMHS) data from a large group of specimens. Briefly, axial loading experiments were conducted by impacting the plantar surface of the foot. Both injury and noninjury tests were included in the testing process. They were identified by pre- and posttest radiographic images and detailed dissection following the impact test. Fractures included injuries to the calcaneus and distal tibia-fibula complex (including pylon), representing severities at the Abbreviated Injury Score (AIS) level 2+. For the statistical analysis, peak force was chosen as the main explanatory variable and the age was chosen as the covariable. Censoring statuses depended on experimental outcomes. Parameters from the parametric survival analysis were estimated using the maximum likelihood approach and the dfbetas statistic was used to identify overly influential samples. The best fit from the Weibull, log-normal, and log-logistic distributions was based on the Akaike information criterion. Plus and minus 95% confidence intervals were obtained for the optimum injury probability distribution. The relative sizes of the interval were determined at predetermined risk levels. Quality indices were described at each of the selected probability levels. The mean age, stature, and weight were 58.2±15.1 years, 1.74±0.08 m, and 74.9±13.8 kg, respectively. Excluding all overly influential tests resulted in the tightest confidence intervals. The Weibull distribution was the most optimum function compared to the other 2 distributions. A majority of quality indices were in the good category for this optimum distribution when results were extracted for 25-, 45- and 65-year-olds at 5, 25, and 50% risk levels age groups for lower leg fracture. For 25, 45, and 65 years, peak forces were 8.1, 6.5, and 5.1 kN at 5% risk; 9.6, 7.7, and 6.1 kN at 25% risk; and 10.4, 8.3, and 6.6 kN at 50% risk, respectively. This study derived axial loading-induced injury risk curves based on survival analysis using peak force and specimen age; adopting different censoring schemes; considering overly influential samples in the analysis; and assessing the quality of the distribution at discrete probability levels. Because procedures used in the present survival analysis are accepted by international automotive communities, current optimum human injury probability distributions can be used at all risk levels with more confidence in future crashworthiness applications for automotive and other disciplines.

  11. [Quantitative study of diesel/CNG buses exhaust particulate size distribution in a road tunnel].

    PubMed

    Zhu, Chun; Zhang, Xu

    2010-10-01

    Vehicle emission is one of main sources of fine/ultra-fine particles in many cities. This study firstly presents daily mean particle size distributions of mixed diesel/CNG buses traffic flow by 4 days consecutive real world measurement in an Australia road tunnel. Emission factors (EFs) of particle size distribution of diesel buses and CNG buses are obtained by MLR methods, particle distributions of diesel buses and CNG buses are observed as single accumulation mode and nuclei-mode separately. Particle size distributions of mixed traffic flow are decomposed by two log-normal fitting curves for each 30 min interval mean scans, the degrees of fitting between combined fitting curves and corresponding in-situ scans for totally 90 fitting scans are from 0.972 to 0.998. Finally particle size distributions of diesel buses and CNG buses are quantified by statistical whisker-box charts. For log-normal particle size distribution of diesel buses, accumulation mode diameters are 74.5-86.5 nm, geometric standard deviations are 1.88-2.05. As to log-normal particle size distribution of CNG buses, nuclei-mode diameters are 19.9-22.9 nm, geometric standard deviations are 1.27-1.3.

  12. Threshold detection in an on-off binary communications channel with atmospheric scintillation

    NASA Technical Reports Server (NTRS)

    Webb, W. E.; Marino, J. T., Jr.

    1974-01-01

    The optimum detection threshold in an on-off binary optical communications system operating in the presence of atmospheric turbulence was investigated assuming a poisson detection process and log normal scintillation. The dependence of the probability of bit error on log amplitude variance and received signal strength was analyzed and semi-emperical relationships to predict the optimum detection threshold derived. On the basis of this analysis a piecewise linear model for an adaptive threshold detection system is presented. Bit error probabilities for non-optimum threshold detection system were also investigated.

  13. Threshold detection in an on-off binary communications channel with atmospheric scintillation

    NASA Technical Reports Server (NTRS)

    Webb, W. E.

    1975-01-01

    The optimum detection threshold in an on-off binary optical communications system operating in the presence of atmospheric turbulence was investigated assuming a poisson detection process and log normal scintillation. The dependence of the probability of bit error on log amplitude variance and received signal strength was analyzed and semi-empirical relationships to predict the optimum detection threshold derived. On the basis of this analysis a piecewise linear model for an adaptive threshold detection system is presented. The bit error probabilities for nonoptimum threshold detection systems were also investigated.

  14. Bivariate normal, conditional and rectangular probabilities: A computer program with applications

    NASA Technical Reports Server (NTRS)

    Swaroop, R.; Brownlow, J. D.; Ashwworth, G. R.; Winter, W. R.

    1980-01-01

    Some results for the bivariate normal distribution analysis are presented. Computer programs for conditional normal probabilities, marginal probabilities, as well as joint probabilities for rectangular regions are given: routines for computing fractile points and distribution functions are also presented. Some examples from a closed circuit television experiment are included.

  15. Distribution of transvascular pathway sizes through the pulmonary microvascular barrier.

    PubMed

    McNamee, J E

    1987-01-01

    Mathematical models of solute and water exchange in the lung have been helpful in understanding factors governing the volume flow rate and composition of pulmonary lymph. As experimental data and models become more encompassing, parameter identification becomes more difficult. Pore sizes in these models should approach and eventually become equivalent to actual physiological pathway sizes as more complex and accurate models are tried. However, pore sizes and numbers vary from model to model as new pathway sizes are added. This apparent inconsistency of pore sizes can be explained if it is assumed that the pulmonary blood-lymph barrier is widely heteroporous, for example, being composed of a continuous distribution of pathway sizes. The sieving characteristics of the pulmonary barrier are reproduced by a log normal distribution of pathway sizes (log mean = -0.20, log s.d. = 1.05). A log normal distribution of pathways in the microvascular barrier is shown to follow from a rather general assumption about the nature of the pulmonary endothelial junction.

  16. Gluten-containing grains skew gluten assessment in oats due to sample grind non-homogeneity.

    PubMed

    Fritz, Ronald D; Chen, Yumin; Contreras, Veronica

    2017-02-01

    Oats are easily contaminated with gluten-rich kernels of wheat, rye and barley. These contaminants are like gluten 'pills', shown here to skew gluten analysis results. Using R-Biopharm R5 ELISA, we quantified gluten in gluten-free oatmeal servings from an in-market survey. For samples with a 5-20ppm reading on a first test, replicate analyses provided results ranging <5ppm to >160ppm. This suggests sample grinding may inadequately disperse gluten to allow a single accurate gluten assessment. To ascertain this, and characterize the distribution of 0.25-g gluten test results for kernel contaminated oats, twelve 50g samples of pure oats, each spiked with a wheat kernel, showed that 0.25g test results followed log-normal-like distributions. With this, we estimate probabilities of mis-assessment for a 'single measure/sample' relative to the <20ppm regulatory threshold, and derive an equation relating the probability of mis-assessment to sample average gluten content. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. Statistical analysis of PM₁₀ concentrations at different locations in Malaysia.

    PubMed

    Sansuddin, Nurulilyana; Ramli, Nor Azam; Yahaya, Ahmad Shukri; Yusof, Noor Faizah Fitri Md; Ghazali, Nurul Adyani; Madhoun, Wesam Ahmed Al

    2011-09-01

    Malaysia has experienced several haze events since the 1980s as a consequence of the transboundary movement of air pollutants emitted from forest fires and open burning activities. Hazy episodes can result from local activities and be categorized as "localized haze". General probability distributions (i.e., gamma and log-normal) were chosen to analyze the PM(10) concentrations data at two different types of locations in Malaysia: industrial (Johor Bahru and Nilai) and residential (Kota Kinabalu and Kuantan). These areas were chosen based on their frequently high PM(10) concentration readings. The best models representing the areas were chosen based on their performance indicator values. The best distributions provided the probability of exceedances and the return period between the actual and predicted concentrations based on the threshold limit given by the Malaysian Ambient Air Quality Guidelines (24-h average of 150 μg/m(3)) for PM(10) concentrations. The short-term prediction for PM(10) exceedances in 14 days was obtained using the autoregressive model.

  18. Parameter estimation and forecasting for multiplicative log-normal cascades

    NASA Astrophysics Data System (ADS)

    Leövey, Andrés E.; Lux, Thomas

    2012-04-01

    We study the well-known multiplicative log-normal cascade process in which the multiplication of Gaussian and log normally distributed random variables yields time series with intermittent bursts of activity. Due to the nonstationarity of this process and the combinatorial nature of such a formalism, its parameters have been estimated mostly by fitting the numerical approximation of the associated non-Gaussian probability density function to empirical data, cf. Castaing [Physica DPDNPDT0167-278910.1016/0167-2789(90)90035-N 46, 177 (1990)]. More recently, alternative estimators based upon various moments have been proposed by Beck [Physica DPDNPDT0167-278910.1016/j.physd.2004.01.020 193, 195 (2004)] and Kiyono [Phys. Rev. EPLEEE81539-375510.1103/PhysRevE.76.041113 76, 041113 (2007)]. In this paper, we pursue this moment-based approach further and develop a more rigorous generalized method of moments (GMM) estimation procedure to cope with the documented difficulties of previous methodologies. We show that even under uncertainty about the actual number of cascade steps, our methodology yields very reliable results for the estimated intermittency parameter. Employing the Levinson-Durbin algorithm for best linear forecasts, we also show that estimated parameters can be used for forecasting the evolution of the turbulent flow. We compare forecasting results from the GMM and Kiyono 's procedure via Monte Carlo simulations. We finally test the applicability of our approach by estimating the intermittency parameter and forecasting of volatility for a sample of financial data from stock and foreign exchange markets.

  19. Characterization of the spatial variability of channel morphology

    USGS Publications Warehouse

    Moody, J.A.; Troutman, B.M.

    2002-01-01

    The spatial variability of two fundamental morphological variables is investigated for rivers having a wide range of discharge (five orders of magnitude). The variables, water-surface width and average depth, were measured at 58 to 888 equally spaced cross-sections in channel links (river reaches between major tributaries). These measurements provide data to characterize the two-dimensional structure of a channel link which is the fundamental unit of a channel network. The morphological variables have nearly log-normal probability distributions. A general relation was determined which relates the means of the log-transformed variables to the logarithm of discharge similar to previously published downstream hydraulic geometry relations. The spatial variability of the variables is described by two properties: (1) the coefficient of variation which was nearly constant (0.13-0.42) over a wide range of discharge; and (2) the integral length scale in the downstream direction which was approximately equal to one to two mean channel widths. The joint probability distribution of the morphological variables in the downstream direction was modelled as a first-order, bivariate autoregressive process. This model accounted for up to 76 per cent of the total variance. The two-dimensional morphological variables can be scaled such that the channel width-depth process is independent of discharge. The scaling properties will be valuable to modellers of both basin and channel dynamics. Published in 2002 John Wiley and Sons, Ltd.

  20. Comparison of parametric and bootstrap method in bioequivalence test.

    PubMed

    Ahn, Byung-Jin; Yim, Dong-Seok

    2009-10-01

    The estimation of 90% parametric confidence intervals (CIs) of mean AUC and Cmax ratios in bioequivalence (BE) tests are based upon the assumption that formulation effects in log-transformed data are normally distributed. To compare the parametric CIs with those obtained from nonparametric methods we performed repeated estimation of bootstrap-resampled datasets. The AUC and Cmax values from 3 archived datasets were used. BE tests on 1,000 resampled datasets from each archived dataset were performed using SAS (Enterprise Guide Ver.3). Bootstrap nonparametric 90% CIs of formulation effects were then compared with the parametric 90% CIs of the original datasets. The 90% CIs of formulation effects estimated from the 3 archived datasets were slightly different from nonparametric 90% CIs obtained from BE tests on resampled datasets. Histograms and density curves of formulation effects obtained from resampled datasets were similar to those of normal distribution. However, in 2 of 3 resampled log (AUC) datasets, the estimates of formulation effects did not follow the Gaussian distribution. Bias-corrected and accelerated (BCa) CIs, one of the nonparametric CIs of formulation effects, shifted outside the parametric 90% CIs of the archived datasets in these 2 non-normally distributed resampled log (AUC) datasets. Currently, the 80~125% rule based upon the parametric 90% CIs is widely accepted under the assumption of normally distributed formulation effects in log-transformed data. However, nonparametric CIs may be a better choice when data do not follow this assumption.

  1. Comparison of Parametric and Bootstrap Method in Bioequivalence Test

    PubMed Central

    Ahn, Byung-Jin

    2009-01-01

    The estimation of 90% parametric confidence intervals (CIs) of mean AUC and Cmax ratios in bioequivalence (BE) tests are based upon the assumption that formulation effects in log-transformed data are normally distributed. To compare the parametric CIs with those obtained from nonparametric methods we performed repeated estimation of bootstrap-resampled datasets. The AUC and Cmax values from 3 archived datasets were used. BE tests on 1,000 resampled datasets from each archived dataset were performed using SAS (Enterprise Guide Ver.3). Bootstrap nonparametric 90% CIs of formulation effects were then compared with the parametric 90% CIs of the original datasets. The 90% CIs of formulation effects estimated from the 3 archived datasets were slightly different from nonparametric 90% CIs obtained from BE tests on resampled datasets. Histograms and density curves of formulation effects obtained from resampled datasets were similar to those of normal distribution. However, in 2 of 3 resampled log (AUC) datasets, the estimates of formulation effects did not follow the Gaussian distribution. Bias-corrected and accelerated (BCa) CIs, one of the nonparametric CIs of formulation effects, shifted outside the parametric 90% CIs of the archived datasets in these 2 non-normally distributed resampled log (AUC) datasets. Currently, the 80~125% rule based upon the parametric 90% CIs is widely accepted under the assumption of normally distributed formulation effects in log-transformed data. However, nonparametric CIs may be a better choice when data do not follow this assumption. PMID:19915699

  2. ROBUST: an interactive FORTRAN-77 package for exploratory data analysis using parametric, ROBUST and nonparametric location and scale estimates, data transformations, normality tests, and outlier assessment

    NASA Astrophysics Data System (ADS)

    Rock, N. M. S.

    ROBUST calculates 53 statistics, plus significance levels for 6 hypothesis tests, on each of up to 52 variables. These together allow the following properties of the data distribution for each variable to be examined in detail: (1) Location. Three means (arithmetic, geometric, harmonic) are calculated, together with the midrange and 19 high-performance robust L-, M-, and W-estimates of location (combined, adaptive, trimmed estimates, etc.) (2) Scale. The standard deviation is calculated along with the H-spread/2 (≈ semi-interquartile range), the mean and median absolute deviations from both mean and median, and a biweight scale estimator. The 23 location and 6 scale estimators programmed cover all possible degrees of robustness. (3) Normality: Distributions are tested against the null hypothesis that they are normal, using the 3rd (√ h1) and 4th ( b 2) moments, Geary's ratio (mean deviation/standard deviation), Filliben's probability plot correlation coefficient, and a more robust test based on the biweight scale estimator. These statistics collectively are sensitive to most usual departures from normality. (4) Presence of outliers. The maximum and minimum values are assessed individually or jointly using Grubbs' maximum Studentized residuals, Harvey's and Dixon's criteria, and the Studentized range. For a single input variable, outliers can be either winsorized or eliminated and all estimates recalculated iteratively as desired. The following data-transformations also can be applied: linear, log 10, generalized Box Cox power (including log, reciprocal, and square root), exponentiation, and standardization. For more than one variable, all results are tabulated in a single run of ROBUST. Further options are incorporated to assess ratios (of two variables) as well as discrete variables, and be concerned with missing data. Cumulative S-plots (for assessing normality graphically) also can be generated. The mutual consistency or inconsistency of all these measures helps to detect errors in data as well as to assess data-distributions themselves.

  3. Muscle categorization using PDF estimation and Naive Bayes classification.

    PubMed

    Adel, Tameem M; Smith, Benn E; Stashuk, Daniel W

    2012-01-01

    The structure of motor unit potentials (MUPs) and their times of occurrence provide information about the motor units (MUs) that created them. As such, electromyographic (EMG) data can be used to categorize muscles as normal or suffering from a neuromuscular disease. Using pattern discovery (PD) allows clinicians to understand the rationale underlying a certain muscle characterization; i.e. it is transparent. Discretization is required in PD, which leads to some loss in accuracy. In this work, characterization techniques that are based on estimating probability density functions (PDFs) for each muscle category are implemented. Characterization probabilities of each motor unit potential train (MUPT) are obtained from these PDFs and then Bayes rule is used to aggregate the MUPT characterization probabilities to calculate muscle level probabilities. Even though this technique is not as transparent as PD, its accuracy is higher than the discrete PD. Ultimately, the goal is to use a technique that is based on both PDFs and PD and make it as transparent and as efficient as possible, but first it was necessary to thoroughly assess how accurate a fully continuous approach can be. Using gaussian PDF estimation achieved improvements in muscle categorization accuracy over PD and further improvements resulted from using feature value histograms to choose more representative PDFs; for instance, using log-normal distribution to represent skewed histograms.

  4. In-residence, multiple route exposures to chlorpyrifos and diazinon estimated by indirect method models

    NASA Astrophysics Data System (ADS)

    Moschandreas, D. J.; Kim, Y.; Karuchit, S.; Ari, H.; Lebowitz, M. D.; O'Rourke, M. K.; Gordon, S.; Robertson, G.

    One of the objectives of the National Human Exposure Assessment Survey (NHEXAS) is to estimate exposures to several pollutants in multiple media and determine their distributions for the population of Arizona. This paper presents modeling methods used to estimate exposure distributions of chlorpyrifos and diazinon in the residential microenvironment using the database generated in Arizona (NHEXAS-AZ). A four-stage probability sampling design was used for sample selection. Exposures to pesticides were estimated using the indirect method of exposure calculation by combining measured concentrations of the two pesticides in multiple media with questionnaire information such as time subjects spent indoors, dietary and non-dietary items they consumed, and areas they touched. Most distributions of in-residence exposure to chlorpyrifos and diazinon were log-normal or nearly log-normal. Exposures to chlorpyrifos and diazinon vary by pesticide and route as well as by various demographic characteristics of the subjects. Comparisons of exposure to pesticides were investigated among subgroups of demographic categories, including gender, age, minority status, education, family income, household dwelling type, year the dwelling was built, pesticide use, and carpeted areas within dwellings. Residents with large carpeted areas within their dwellings have higher exposures to both pesticides for all routes than those in less carpet-covered areas. Depending on the route, several other determinants of exposure to pesticides were identified, but a clear pattern could not be established regarding the exposure differences between several subpopulation groups.

  5. Interpreting the concordance statistic of a logistic regression model: relation to the variance and odds ratio of a continuous explanatory variable.

    PubMed

    Austin, Peter C; Steyerberg, Ewout W

    2012-06-20

    When outcomes are binary, the c-statistic (equivalent to the area under the Receiver Operating Characteristic curve) is a standard measure of the predictive accuracy of a logistic regression model. An analytical expression was derived under the assumption that a continuous explanatory variable follows a normal distribution in those with and without the condition. We then conducted an extensive set of Monte Carlo simulations to examine whether the expressions derived under the assumption of binormality allowed for accurate prediction of the empirical c-statistic when the explanatory variable followed a normal distribution in the combined sample of those with and without the condition. We also examine the accuracy of the predicted c-statistic when the explanatory variable followed a gamma, log-normal or uniform distribution in combined sample of those with and without the condition. Under the assumption of binormality with equality of variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the product of the standard deviation of the normal components (reflecting more heterogeneity) and the log-odds ratio (reflecting larger effects). Under the assumption of binormality with unequal variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the standardized difference of the explanatory variable in those with and without the condition. In our Monte Carlo simulations, we found that these expressions allowed for reasonably accurate prediction of the empirical c-statistic when the distribution of the explanatory variable was normal, gamma, log-normal, and uniform in the entire sample of those with and without the condition. The discriminative ability of a continuous explanatory variable cannot be judged by its odds ratio alone, but always needs to be considered in relation to the heterogeneity of the population.

  6. Statistical Significance of Periodicity and Log-Periodicity with Heavy-Tailed Correlated Noise

    NASA Astrophysics Data System (ADS)

    Zhou, Wei-Xing; Sornette, Didier

    We estimate the probability that random noise, of several plausible standard distributions, creates a false alarm that a periodicity (or log-periodicity) is found in a time series. The solution of this problem is already known for independent Gaussian distributed noise. We investigate more general situations with non-Gaussian correlated noises and present synthetic tests on the detectability and statistical significance of periodic components. A periodic component of a time series is usually detected by some sort of Fourier analysis. Here, we use the Lomb periodogram analysis, which is suitable and outperforms Fourier transforms for unevenly sampled time series. We examine the false-alarm probability of the largest spectral peak of the Lomb periodogram in the presence of power-law distributed noises, of short-range and of long-range fractional-Gaussian noises. Increasing heavy-tailness (respectively correlations describing persistence) tends to decrease (respectively increase) the false-alarm probability of finding a large spurious Lomb peak. Increasing anti-persistence tends to decrease the false-alarm probability. We also study the interplay between heavy-tailness and long-range correlations. In order to fully determine if a Lomb peak signals a genuine rather than a spurious periodicity, one should in principle characterize the Lomb peak height, its width and its relations to other peaks in the complete spectrum. As a step towards this full characterization, we construct the joint-distribution of the frequency position (relative to other peaks) and of the height of the highest peak of the power spectrum. We also provide the distributions of the ratio of the highest Lomb peak to the second highest one. Using the insight obtained by the present statistical study, we re-examine previously reported claims of ``log-periodicity'' and find that the credibility for log-periodicity in 2D-freely decaying turbulence is weakened while it is strengthened for fracture, for the ion-signature prior to the Kobe earthquake and for financial markets.

  7. On fitting the Pareto Levy distribution to stock market index data: Selecting a suitable cutoff value

    NASA Astrophysics Data System (ADS)

    Coronel-Brizio, H. F.; Hernández-Montoya, A. R.

    2005-08-01

    The so-called Pareto-Levy or power-law distribution has been successfully used as a model to describe probabilities associated to extreme variations of stock markets indexes worldwide. The selection of the threshold parameter from empirical data and consequently, the determination of the exponent of the distribution, is often done using a simple graphical method based on a log-log scale, where a power-law probability plot shows a straight line with slope equal to the exponent of the power-law distribution. This procedure can be considered subjective, particularly with regard to the choice of the threshold or cutoff parameter. In this work, a more objective procedure based on a statistical measure of discrepancy between the empirical and the Pareto-Levy distribution is presented. The technique is illustrated for data sets from the New York Stock Exchange (DJIA) and the Mexican Stock Market (IPC).

  8. Assessing cadmium exposure risks of vegetables with plant uptake factor and soil property.

    PubMed

    Yang, Yang; Chang, Andrew C; Wang, Meie; Chen, Weiping; Peng, Chi

    2018-07-01

    Plant uptake factors (PUFs) are of great importance in human cadmium (Cd) exposure risk assessment while it has been often treated in a generic way. We collected 1077 pairs of vegetable-soil samples from production fields to characterize Cd PUFs and demonstrated their utility in assessing Cd exposure risks to consumers of locally grown vegetables. The Cd PUFs varied with plant species and pH and organic matter content of soils. Once normalized PUFs against soil parameters, the PUFs distributions were log-normal in nature. In this manner, the PUFs were represented by definable probability distributions instead of a deterministic figure. The Cd exposure risks were then assessed using the normalized PUF based on the Monte Carlo simulation algorithm. Factors affecting the extent of Cd exposures were isolated through sensitivity analyses. Normalized PUF would illustrate the outcomes for uncontaminated and slightly contaminated soils. Among the vegetables, lettuce was potentially hazardous for residents due to its high Cd accumulation but low Zn concentration. To protect 95% of the lettuce production from causing excessive Cd exposure risks, pH of soils needed to be 5.9 and above. Copyright © 2018 Elsevier Ltd. All rights reserved.

  9. Log-amplitude statistics for Beck-Cohen superstatistics

    NASA Astrophysics Data System (ADS)

    Kiyono, Ken; Konno, Hidetoshi

    2013-05-01

    As a possible generalization of Beck-Cohen superstatistical processes, we study non-Gaussian processes with temporal heterogeneity of local variance. To characterize the variance heterogeneity, we define log-amplitude cumulants and log-amplitude autocovariance and derive closed-form expressions of the log-amplitude cumulants for χ2, inverse χ2, and log-normal superstatistical distributions. Furthermore, we show that χ2 and inverse χ2 superstatistics with degree 2 are closely related to an extreme value distribution, called the Gumbel distribution. In these cases, the corresponding superstatistical distributions result in the q-Gaussian distribution with q=5/3 and the bilateral exponential distribution, respectively. Thus, our finding provides a hypothesis that the asymptotic appearance of these two special distributions may be explained by a link with the asymptotic limit distributions involving extreme values. In addition, as an application of our approach, we demonstrated that non-Gaussian fluctuations observed in a stock index futures market can be well approximated by the χ2 superstatistical distribution with degree 2.

  10. Fluctuations in email size

    NASA Astrophysics Data System (ADS)

    Matsubara, Yoshitsugu; Musashi, Yasuo

    2017-12-01

    The purpose of this study is to explain fluctuations in email size. We have previously investigated the long-term correlations between email send requests and data flow in the system log of the primary staff email server at a university campus, finding that email size frequency follows a power-law distribution with two inflection points, and that the power-law property weakens the correlation of the data flow. However, the mechanism underlying this fluctuation is not completely understood. We collected new log data from both staff and students over six academic years and analyzed the frequency distribution thereof, focusing on the type of content contained in the emails. Furthermore, we obtained permission to collect "Content-Type" log data from the email headers. We therefore collected the staff log data from May 1, 2015 to July 31, 2015, creating two subdistributions. In this paper, we propose a model to explain these subdistributions, which follow log-normal-like distributions. In the log-normal-like model, email senders -consciously or unconsciously- regulate the size of new email sentences according to a normal distribution. The fitting of the model is acceptable for these subdistributions, and the model demonstrates power-law properties for large email sizes. An analysis of the length of new email sentences would be required for further discussion of our model; however, to protect user privacy at the participating organization, we left this analysis for future work. This study provides new knowledge on the properties of email sizes, and our model is expected to contribute to the decision on whether to establish upper size limits in the design of email services.

  11. Stochastic modelling of non-stationary financial assets

    NASA Astrophysics Data System (ADS)

    Estevens, Joana; Rocha, Paulo; Boto, João P.; Lind, Pedro G.

    2017-11-01

    We model non-stationary volume-price distributions with a log-normal distribution and collect the time series of its two parameters. The time series of the two parameters are shown to be stationary and Markov-like and consequently can be modelled with Langevin equations, which are derived directly from their series of values. Having the evolution equations of the log-normal parameters, we reconstruct the statistics of the first moments of volume-price distributions which fit well the empirical data. Finally, the proposed framework is general enough to study other non-stationary stochastic variables in other research fields, namely, biology, medicine, and geology.

  12. Relationship between the column density distribution and evolutionary class of molecular clouds as viewed by ATLASGAL

    NASA Astrophysics Data System (ADS)

    Abreu-Vicente, J.; Kainulainen, J.; Stutz, A.; Henning, Th.; Beuther, H.

    2015-09-01

    We present the first study of the relationship between the column density distribution of molecular clouds within nearby Galactic spiral arms and their evolutionary status as measured from their stellar content. We analyze a sample of 195 molecular clouds located at distances below 5.5 kpc, identified from the ATLASGAL 870 μm data. We define three evolutionary classes within this sample: starless clumps, star-forming clouds with associated young stellar objects, and clouds associated with H ii regions. We find that the N(H2) probability density functions (N-PDFs) of these three classes of objects are clearly different: the N-PDFs of starless clumps are narrowest and close to log-normal in shape, while star-forming clouds and H ii regions exhibit a power-law shape over a wide range of column densities and log-normal-like components only at low column densities. We use the N-PDFs to estimate the evolutionary time-scales of the three classes of objects based on a simple analytic model from literature. Finally, we show that the integral of the N-PDFs, the dense gas mass fraction, depends on the total mass of the regions as measured by ATLASGAL: more massive clouds contain greater relative amounts of dense gas across all evolutionary classes. Appendices are available in electronic form at http://www.aanda.org

  13. Uncertainty analysis of the radiological characteristics of radioactive waste using a method based on log-normal distributions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gigase, Yves

    2007-07-01

    Available in abstract form only. Full text of publication follows: The uncertainty on characteristics of radioactive LILW waste packages is difficult to determine and often very large. This results from a lack of knowledge of the constitution of the waste package and of the composition of the radioactive sources inside. To calculate a quantitative estimate of the uncertainty on a characteristic of a waste package one has to combine these various uncertainties. This paper discusses an approach to this problem, based on the use of the log-normal distribution, which is both elegant and easy to use. It can provide asmore » example quantitative estimates of uncertainty intervals that 'make sense'. The purpose is to develop a pragmatic approach that can be integrated into existing characterization methods. In this paper we show how our method can be applied to the scaling factor method. We also explain how it can be used when estimating other more complex characteristics such as the total uncertainty of a collection of waste packages. This method could have applications in radioactive waste management, more in particular in those decision processes where the uncertainty on the amount of activity is considered to be important such as in probability risk assessment or the definition of criteria for acceptance or categorization. (author)« less

  14. Methodological uncertainty in quantitative prediction of human hepatic clearance from in vitro experimental systems.

    PubMed

    Hallifax, D; Houston, J B

    2009-03-01

    Mechanistic prediction of unbound drug clearance from human hepatic microsomes and hepatocytes correlates with in vivo clearance but is both systematically low (10 - 20 % of in vivo clearance) and highly variable, based on detailed assessments of published studies. Metabolic capacity (Vmax) of commercially available human hepatic microsomes and cryopreserved hepatocytes is log-normally distributed within wide (30 - 150-fold) ranges; Km is also log-normally distributed and effectively independent of Vmax, implying considerable variability in intrinsic clearance. Despite wide overlap, average capacity is 2 - 20-fold (dependent on P450 enzyme) greater in microsomes than hepatocytes, when both are normalised (scaled to whole liver). The in vitro ranges contrast with relatively narrow ranges of clearance among clinical studies. The high in vitro variation probably reflects unresolved phenotypical variability among liver donors and practicalities in processing of human liver into in vitro systems. A significant contribution from the latter is supported by evidence of low reproducibility (several fold) of activity in cryopreserved hepatocytes and microsomes prepared from the same cells, between separate occasions of thawing of cells from the same liver. The large uncertainty which exists in human hepatic in vitro systems appears to dominate the overall uncertainty of in vitro-in vivo extrapolation, including uncertainties within scaling, modelling and drug dependent effects. As such, any notion of quantitative prediction of clearance appears severely challenged.

  15. Constructing inverse probability weights for continuous exposures: a comparison of methods.

    PubMed

    Naimi, Ashley I; Moodie, Erica E M; Auger, Nathalie; Kaufman, Jay S

    2014-03-01

    Inverse probability-weighted marginal structural models with binary exposures are common in epidemiology. Constructing inverse probability weights for a continuous exposure can be complicated by the presence of outliers, and the need to identify a parametric form for the exposure and account for nonconstant exposure variance. We explored the performance of various methods to construct inverse probability weights for continuous exposures using Monte Carlo simulation. We generated two continuous exposures and binary outcomes using data sampled from a large empirical cohort. The first exposure followed a normal distribution with homoscedastic variance. The second exposure followed a contaminated Poisson distribution, with heteroscedastic variance equal to the conditional mean. We assessed six methods to construct inverse probability weights using: a normal distribution, a normal distribution with heteroscedastic variance, a truncated normal distribution with heteroscedastic variance, a gamma distribution, a t distribution (1, 3, and 5 degrees of freedom), and a quantile binning approach (based on 10, 15, and 20 exposure categories). We estimated the marginal odds ratio for a single-unit increase in each simulated exposure in a regression model weighted by the inverse probability weights constructed using each approach, and then computed the bias and mean squared error for each method. For the homoscedastic exposure, the standard normal, gamma, and quantile binning approaches performed best. For the heteroscedastic exposure, the quantile binning, gamma, and heteroscedastic normal approaches performed best. Our results suggest that the quantile binning approach is a simple and versatile way to construct inverse probability weights for continuous exposures.

  16. Using type IV Pearson distribution to calculate the probabilities of underrun and overrun of lists of multiple cases.

    PubMed

    Wang, Jihan; Yang, Kai

    2014-07-01

    An efficient operating room needs both little underutilised and overutilised time to achieve optimal cost efficiency. The probabilities of underrun and overrun of lists of cases can be estimated by a well defined duration distribution of the lists. To propose a method of predicting the probabilities of underrun and overrun of lists of cases using Type IV Pearson distribution to support case scheduling. Six years of data were collected. The first 5 years of data were used to fit distributions and estimate parameters. The data from the last year were used as testing data to validate the proposed methods. The percentiles of the duration distribution of lists of cases were calculated by Type IV Pearson distribution and t-distribution. Monte Carlo simulation was conducted to verify the accuracy of percentiles defined by the proposed methods. Operating rooms in John D. Dingell VA Medical Center, United States, from January 2005 to December 2011. Differences between the proportion of lists of cases that were completed within the percentiles of the proposed duration distribution of the lists and the corresponding percentiles. Compared with the t-distribution, the proposed new distribution is 8.31% (0.38) more accurate on average and 14.16% (0.19) more accurate in calculating the probabilities at the 10th and 90th percentiles of the distribution, which is a major concern of operating room schedulers. The absolute deviations between the percentiles defined by Type IV Pearson distribution and those from Monte Carlo simulation varied from 0.20  min (0.01) to 0.43  min (0.03). Operating room schedulers can rely on the most recent 10 cases with the same combination of surgeon and procedure(s) for distribution parameter estimation to plan lists of cases. Values are mean (SEM). The proposed Type IV Pearson distribution is more accurate than t-distribution to estimate the probabilities of underrun and overrun of lists of cases. However, as not all the individual case durations followed log-normal distributions, there was some deviation from the true duration distribution of the lists.

  17. Determining prescription durations based on the parametric waiting time distribution.

    PubMed

    Støvring, Henrik; Pottegård, Anton; Hallas, Jesper

    2016-12-01

    The purpose of the study is to develop a method to estimate the duration of single prescriptions in pharmacoepidemiological studies when the single prescription duration is not available. We developed an estimation algorithm based on maximum likelihood estimation of a parametric two-component mixture model for the waiting time distribution (WTD). The distribution component for prevalent users estimates the forward recurrence density (FRD), which is related to the distribution of time between subsequent prescription redemptions, the inter-arrival density (IAD), for users in continued treatment. We exploited this to estimate percentiles of the IAD by inversion of the estimated FRD and defined the duration of a prescription as the time within which 80% of current users will have presented themselves again. Statistical properties were examined in simulation studies, and the method was applied to empirical data for four model drugs: non-steroidal anti-inflammatory drugs (NSAIDs), warfarin, bendroflumethiazide, and levothyroxine. Simulation studies found negligible bias when the data-generating model for the IAD coincided with the FRD used in the WTD estimation (Log-Normal). When the IAD consisted of a mixture of two Log-Normal distributions, but was analyzed with a single Log-Normal distribution, relative bias did not exceed 9%. Using a Log-Normal FRD, we estimated prescription durations of 117, 91, 137, and 118 days for NSAIDs, warfarin, bendroflumethiazide, and levothyroxine, respectively. Similar results were found with a Weibull FRD. The algorithm allows valid estimation of single prescription durations, especially when the WTD reliably separates current users from incident users, and may replace ad-hoc decision rules in automated implementations. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  18. Technical note: An improved approach to determining background aerosol concentrations with PILS sampling on aircraft

    NASA Astrophysics Data System (ADS)

    Fukami, Christine S.; Sullivan, Amy P.; Ryan Fulgham, S.; Murschell, Trey; Borch, Thomas; Smith, James N.; Farmer, Delphine K.

    2016-07-01

    Particle-into-Liquid Samplers (PILS) have become a standard aerosol collection technique, and are widely used in both ground and aircraft measurements in conjunction with off-line ion chromatography (IC) measurements. Accurate and precise background samples are essential to account for gas-phase components not efficiently removed and any interference in the instrument lines, collection vials or off-line analysis procedures. For aircraft sampling with PILS, backgrounds are typically taken with in-line filters to remove particles prior to sample collection once or twice per flight with more numerous backgrounds taken on the ground. Here, we use data collected during the Front Range Air Pollution and Photochemistry Éxperiment (FRAPPÉ) to demonstrate that not only are multiple background filter samples are essential to attain a representative background, but that the chemical background signals do not follow the Gaussian statistics typically assumed. Instead, the background signals for all chemical components analyzed from 137 background samples (taken from ∼78 total sampling hours over 18 flights) follow a log-normal distribution, meaning that the typical approaches of averaging background samples and/or assuming a Gaussian distribution cause an over-estimation of background samples - and thus an underestimation of sample concentrations. Our approach of deriving backgrounds from the peak of the log-normal distribution results in detection limits of 0.25, 0.32, 3.9, 0.17, 0.75 and 0.57 μg m-3 for sub-micron aerosol nitrate (NO3-), nitrite (NO2-), ammonium (NH4+), sulfate (SO42-), potassium (K+) and calcium (Ca2+), respectively. The difference in backgrounds calculated from assuming a Gaussian distribution versus a log-normal distribution were most extreme for NH4+, resulting in a background that was 1.58× that determined from fitting a log-normal distribution.

  19. Tsunami Size Distributions at Far-Field Locations from Aggregated Earthquake Sources

    NASA Astrophysics Data System (ADS)

    Geist, E. L.; Parsons, T.

    2015-12-01

    The distribution of tsunami amplitudes at far-field tide gauge stations is explained by aggregating the probability of tsunamis derived from individual subduction zones and scaled by their seismic moment. The observed tsunami amplitude distributions of both continental (e.g., San Francisco) and island (e.g., Hilo) stations distant from subduction zones are examined. Although the observed probability distributions nominally follow a Pareto (power-law) distribution, there are significant deviations. Some stations exhibit varying degrees of tapering of the distribution at high amplitudes and, in the case of the Hilo station, there is a prominent break in slope on log-log probability plots. There are also differences in the slopes of the observed distributions among stations that can be significant. To explain these differences we first estimate seismic moment distributions of observed earthquakes for major subduction zones. Second, regression models are developed that relate the tsunami amplitude at a station to seismic moment at a subduction zone, correcting for epicentral distance. The seismic moment distribution is then transformed to a site-specific tsunami amplitude distribution using the regression model. Finally, a mixture distribution is developed, aggregating the transformed tsunami distributions from all relevant subduction zones. This mixture distribution is compared to the observed distribution to assess the performance of the method described above. This method allows us to estimate the largest tsunami that can be expected in a given time period at a station.

  20. Optical and physical properties of stratospheric aerosols from balloon measurements in the visible and near-infrared domains. I. Analysis of aerosol extinction spectra from the AMON and SALOMON balloonborne spectrometers

    NASA Astrophysics Data System (ADS)

    Berthet, Gwenaël; Renard, Jean-Baptiste; Brogniez, Colette; Robert, Claude; Chartier, Michel; Pirre, Michel

    2002-12-01

    Aerosol extinction coefficients have been derived in the 375-700-nm spectral domain from measurements in the stratosphere since 1992, at night, at mid- and high latitudes from 15 to 40 km, by two balloonborne spectrometers, Absorption par les Minoritaires Ozone et NOx (AMON) and Spectroscopie d'Absorption Lunaire pour l'Observation des Minoritaires Ozone et NOx (SALOMON). Log-normal size distributions associated with the Mie-computed extinction spectra that best fit the measurements permit calculation of integrated properties of the distributions. Although measured extinction spectra that correspond to background aerosols can be reproduced by the Mie scattering model by use of monomodal log-normal size distributions, each flight reveals some large discrepancies between measurement and theory at several altitudes. The agreement between measured and Mie-calculated extinction spectra is significantly improved by use of bimodal log-normal distributions. Nevertheless, neither monomodal nor bimodal distributions permit correct reproduction of some of the measured extinction shapes, especially for the 26 February 1997 AMON flight, which exhibited spectral behavior attributed to particles from a polar stratospheric cloud event.

  1. Optimized lower leg injury probability curves from post-mortem human subject tests under axial impacts

    PubMed Central

    Yoganandan, Narayan; Arun, Mike W.J.; Pintar, Frank A.; Szabo, Aniko

    2015-01-01

    Objective Derive optimum injury probability curves to describe human tolerance of the lower leg using parametric survival analysis. Methods The study re-examined lower leg PMHS data from a large group of specimens. Briefly, axial loading experiments were conducted by impacting the plantar surface of the foot. Both injury and non-injury tests were included in the testing process. They were identified by pre- and posttest radiographic images and detailed dissection following the impact test. Fractures included injuries to the calcaneus and distal tibia-fibula complex (including pylon), representing severities at the Abbreviated Injury Score (AIS) level 2+. For the statistical analysis, peak force was chosen as the main explanatory variable and the age was chosen as the co-variable. Censoring statuses depended on experimental outcomes. Parameters from the parametric survival analysis were estimated using the maximum likelihood approach and the dfbetas statistic was used to identify overly influential samples. The best fit from the Weibull, log-normal and log-logistic distributions was based on the Akaike Information Criterion. Plus and minus 95% confidence intervals were obtained for the optimum injury probability distribution. The relative sizes of the interval were determined at predetermined risk levels. Quality indices were described at each of the selected probability levels. Results The mean age, stature and weight: 58.2 ± 15.1 years, 1.74 ± 0.08 m and 74.9 ± 13.8 kg. Excluding all overly influential tests resulted in the tightest confidence intervals. The Weibull distribution was the most optimum function compared to the other two distributions. A majority of quality indices were in the good category for this optimum distribution when results were extracted for 25-, 45- and 65-year-old at five, 25 and 50% risk levels age groups for lower leg fracture. For 25, 45 and 65 years, peak forces were 8.1, 6.5, and 5.1 kN at 5% risk; 9.6, 7.7, and 6.1 kN at 25% risk; and 10.4, 8.3, and 6.6 kN at 50% risk, respectively. Conclusions This study derived axial loading-induced injury risk curves based on survival analysis using peak force and specimen age; adopting different censoring schemes; considering overly influential samples in the analysis; and assessing the quality of the distribution at discrete probability levels. Because procedures used in the present survival analysis are accepted by international automotive communities, current optimum human injury probability distributions can be used at all risk levels with more confidence in future crashworthiness applications for automotive and other disciplines. PMID:25307381

  2. Vector wind and vector wind shear models 0 to 27 km altitude for Cape Kennedy, Florida, and Vandenberg AFB, California

    NASA Technical Reports Server (NTRS)

    Smith, O. E.

    1976-01-01

    The techniques are presented to derive several statistical wind models. The techniques are from the properties of the multivariate normal probability function. Assuming that the winds can be considered as bivariate normally distributed, then (1) the wind components and conditional wind components are univariate normally distributed, (2) the wind speed is Rayleigh distributed, (3) the conditional distribution of wind speed given a wind direction is Rayleigh distributed, and (4) the frequency of wind direction can be derived. All of these distributions are derived from the 5-sample parameter of wind for the bivariate normal distribution. By further assuming that the winds at two altitudes are quadravariate normally distributed, then the vector wind shear is bivariate normally distributed and the modulus of the vector wind shear is Rayleigh distributed. The conditional probability of wind component shears given a wind component is normally distributed. Examples of these and other properties of the multivariate normal probability distribution function as applied to Cape Kennedy, Florida, and Vandenberg AFB, California, wind data samples are given. A technique to develop a synthetic vector wind profile model of interest to aerospace vehicle applications is presented.

  3. Inferring local competition intensity from patch size distributions: a test using biological soil crusts

    USGS Publications Warehouse

    Bowker, Matthew A.; Maestre, Fernando T.

    2012-01-01

    Dryland vegetation is inherently patchy. This patchiness goes on to impact ecology, hydrology, and biogeochemistry. Recently, researchers have proposed that dryland vegetation patch sizes follow a power law which is due to local plant facilitation. It is unknown what patch size distribution prevails when competition predominates over facilitation, or if such a pattern could be used to detect competition. We investigated this question in an alternative vegetation type, mosses and lichens of biological soil crusts, which exhibit a smaller scale patch-interpatch configuration. This micro-vegetation is characterized by competition for space. We proposed that multiplicative effects of genetics, environment and competition should result in a log-normal patch size distribution. When testing the prevalence of log-normal versus power law patch size distributions, we found that the log-normal was the better distribution in 53% of cases and a reasonable fit in 83%. In contrast, the power law was better in 39% of cases, and in 8% of instances both distributions fit equally well. We further hypothesized that the log-normal distribution parameters would be predictably influenced by competition strength. There was qualitative agreement between one of the distribution's parameters (μ) and a novel intransitive (lacking a 'best' competitor) competition index, suggesting that as intransitivity increases, patch sizes decrease. The correlation of μ with other competition indicators based on spatial segregation of species (the C-score) depended on aridity. In less arid sites, μ was negatively correlated with the C-score (suggesting smaller patches under stronger competition), while positive correlations (suggesting larger patches under stronger competition) were observed at more arid sites. We propose that this is due to an increasing prevalence of competition transitivity as aridity increases. These findings broaden the emerging theory surrounding dryland patch size distributions and, with refinement, may help us infer cryptic ecological processes from easily observed spatial patterns in the field.

  4. Assessment of the hygienic performances of hamburger patty production processes.

    PubMed

    Gill, C O; Rahn, K; Sloan, K; McMullen, L M

    1997-05-20

    The hygienic conditions of the hamburger patties collected from three patty manufacturing plants and six retail outlets were examined. At each manufacturing plant a sample from newly formed, chilled patties and one from frozen patties were collected from each of 25 batches of patties selected at random. At three, two or one retail outlet, respectively, 25 samples from frozen, chilled or both frozen and chilled patties were collected at random. Each sample consisted of 30 g of meat obtained from five or six patties. Total aerobic, coliform and Escherichia coli counts per gram were enumerated for each sample. The mean log (x) and standard deviation (s) were calculated for the log10 values for each set of 25 counts, on the assumption that the distribution of counts approximated the log normal. A value for the log10 of the arithmetic mean (log A) was calculated for each set from the values of x and s. A chi2 statistic was calculated for each set as a test of the assumption of the log normal distribution. The chi2 statistic was calculable for 32 of the 39 sets. Four of the sets gave chi2 values indicative of gross deviation from log normality. On inspection of those sets, distributions obviously differing from the log normal were apparent in two. Log A values for total, coliform and E. coli counts for chilled patties from manufacturing plants ranged from 4.4 to 5.1, 1.7 to 2.3 and 0.9 to 1.5, respectively. Log A values for frozen patties from manufacturing plants were between < 0.1 and 0.5 log10 units less than the equivalent values for chilled patties. Log A values for total, coliform and E. coli counts for frozen patties on retail sale ranged from 3.8 to 8.5, < 0.5 to 3.6 and < 0 to 1.9, respectively. The equivalent ranges for chilled patties on retail sale were 4.8 to 8.5, 1.8 to 3.7 and 1.4 to 2.7, respectively. The findings indicate that the general hygienic condition of hamburgers patties could be improved by their being manufactured from only manufacturing beef of superior hygienic quality, and by the better management of chilled patties at retail outlets.

  5. A branching process model for the analysis of abortive colony size distributions in carbon ion-irradiated normal human fibroblasts.

    PubMed

    Sakashita, Tetsuya; Hamada, Nobuyuki; Kawaguchi, Isao; Hara, Takamitsu; Kobayashi, Yasuhiko; Saito, Kimiaki

    2014-05-01

    A single cell can form a colony, and ionizing irradiation has long been known to reduce such a cellular clonogenic potential. Analysis of abortive colonies unable to continue to grow should provide important information on the reproductive cell death (RCD) following irradiation. Our previous analysis with a branching process model showed that the RCD in normal human fibroblasts can persist over 16 generations following irradiation with low linear energy transfer (LET) γ-rays. Here we further set out to evaluate the RCD persistency in abortive colonies arising from normal human fibroblasts exposed to high-LET carbon ions (18.3 MeV/u, 108 keV/µm). We found that the abortive colony size distribution determined by biological experiments follows a linear relationship on the log-log plot, and that the Monte Carlo simulation using the RCD probability estimated from such a linear relationship well simulates the experimentally determined surviving fraction and the relative biological effectiveness (RBE). We identified the short-term phase and long-term phase for the persistent RCD following carbon-ion irradiation, which were similar to those previously identified following γ-irradiation. Taken together, our results suggest that subsequent secondary or tertiary colony formation would be invaluable for understanding the long-lasting RCD. All together, our framework for analysis with a branching process model and a colony formation assay is applicable to determination of cellular responses to low- and high-LET radiation, and suggests that the long-lasting RCD is a pivotal determinant of the surviving fraction and the RBE.

  6. Effects of a primordial magnetic field with log-normal distribution on the cosmic microwave background

    NASA Astrophysics Data System (ADS)

    Yamazaki, Dai G.; Ichiki, Kiyotomo; Takahashi, Keitaro

    2011-12-01

    We study the effect of primordial magnetic fields (PMFs) on the anisotropies of the cosmic microwave background (CMB). We assume the spectrum of PMFs is described by log-normal distribution which has a characteristic scale, rather than power-law spectrum. This scale is expected to reflect the generation mechanisms and our analysis is complementary to previous studies with power-law spectrum. We calculate power spectra of energy density and Lorentz force of the log-normal PMFs, and then calculate CMB temperature and polarization angular power spectra from scalar, vector, and tensor modes of perturbations generated from such PMFs. By comparing these spectra with WMAP7, QUaD, CBI, Boomerang, and ACBAR data sets, we find that the current CMB data set places the strongest constraint at k≃10-2.5Mpc-1 with the upper limit B≲3nG.

  7. Explorations in statistics: the log transformation.

    PubMed

    Curran-Everett, Douglas

    2018-06-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This thirteenth installment of Explorations in Statistics explores the log transformation, an established technique that rescales the actual observations from an experiment so that the assumptions of some statistical analysis are better met. A general assumption in statistics is that the variability of some response Y is homogeneous across groups or across some predictor variable X. If the variability-the standard deviation-varies in rough proportion to the mean value of Y, a log transformation can equalize the standard deviations. Moreover, if the actual observations from an experiment conform to a skewed distribution, then a log transformation can make the theoretical distribution of the sample mean more consistent with a normal distribution. This is important: the results of a one-sample t test are meaningful only if the theoretical distribution of the sample mean is roughly normal. If we log-transform our observations, then we want to confirm the transformation was useful. We can do this if we use the Box-Cox method, if we bootstrap the sample mean and the statistic t itself, and if we assess the residual plots from the statistical model of the actual and transformed sample observations.

  8. Universal Distribution of Litter Decay Rates

    NASA Astrophysics Data System (ADS)

    Forney, D. C.; Rothman, D. H.

    2008-12-01

    Degradation of litter is the result of many physical, chemical and biological processes. The high variability of these processes likely accounts for the progressive slowdown of decay with litter age. This age dependence is commonly thought to result from the superposition of processes with different decay rates k. Here we assume an underlying continuous yet unknown distribution p(k) of decay rates [1]. To seek its form, we analyze the mass-time history of 70 LIDET [2] litter data sets obtained under widely varying conditions. We construct a regularized inversion procedure to find the best fitting distribution p(k) with the least degrees of freedom. We find that the resulting p(k) is universally consistent with a lognormal distribution, i.e.~a Gaussian distribution of log k, characterized by a dataset-dependent mean and variance of log k. This result is supported by a recurring observation that microbial populations on leaves are log-normally distributed [3]. Simple biological processes cause the frequent appearance of the log-normal distribution in ecology [4]. Environmental factors, such as soil nitrate, soil aggregate size, soil hydraulic conductivity, total soil nitrogen, soil denitrification, soil respiration have been all observed to be log-normally distributed [5]. Litter degradation rates depend on many coupled, multiplicative factors, which provides a fundamental basis for the lognormal distribution. Using this insight, we systematically estimated the mean and variance of log k for 512 data sets from the LIDET study. We find the mean strongly correlates with temperature and precipitation, while the variance appears to be uncorrelated with main environmental factors and is thus likely more correlated with chemical composition and/or ecology. Results indicate the possibility that the distribution in rates reflects, at least in part, the distribution of microbial niches. [1] B. P. Boudreau, B.~R. Ruddick, American Journal of Science,291, 507, (1991). [2] M. Harmon, Forest Science Data Bank: TD023 [Database]. LTER Intersite Fine Litter Decomposition Experiment (LIDET): Long-Term Ecological Research, (2007). [3] G.~A. Beattie, S.~E. Lindow, Phytopathology 89, 353 (1999). [4] R.~A. May, Ecology and Evolution of Communities/, A pattern of Species Abundance and Diversity, 81 (1975). [5] T.~B. Parkin, J.~A. Robinson, Advances in Soil Science 20, Analysis of Lognormal Data, 194 (1992).

  9. Interpreting the concordance statistic of a logistic regression model: relation to the variance and odds ratio of a continuous explanatory variable

    PubMed Central

    2012-01-01

    Background When outcomes are binary, the c-statistic (equivalent to the area under the Receiver Operating Characteristic curve) is a standard measure of the predictive accuracy of a logistic regression model. Methods An analytical expression was derived under the assumption that a continuous explanatory variable follows a normal distribution in those with and without the condition. We then conducted an extensive set of Monte Carlo simulations to examine whether the expressions derived under the assumption of binormality allowed for accurate prediction of the empirical c-statistic when the explanatory variable followed a normal distribution in the combined sample of those with and without the condition. We also examine the accuracy of the predicted c-statistic when the explanatory variable followed a gamma, log-normal or uniform distribution in combined sample of those with and without the condition. Results Under the assumption of binormality with equality of variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the product of the standard deviation of the normal components (reflecting more heterogeneity) and the log-odds ratio (reflecting larger effects). Under the assumption of binormality with unequal variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the standardized difference of the explanatory variable in those with and without the condition. In our Monte Carlo simulations, we found that these expressions allowed for reasonably accurate prediction of the empirical c-statistic when the distribution of the explanatory variable was normal, gamma, log-normal, and uniform in the entire sample of those with and without the condition. Conclusions The discriminative ability of a continuous explanatory variable cannot be judged by its odds ratio alone, but always needs to be considered in relation to the heterogeneity of the population. PMID:22716998

  10. Analysis of in vitro evolution reveals the underlying distribution of catalytic activity among random sequences.

    PubMed

    Pressman, Abe; Moretti, Janina E; Campbell, Gregory W; Müller, Ulrich F; Chen, Irene A

    2017-08-21

    The emergence of catalytic RNA is believed to have been a key event during the origin of life. Understanding how catalytic activity is distributed across random sequences is fundamental to estimating the probability that catalytic sequences would emerge. Here, we analyze the in vitro evolution of triphosphorylating ribozymes and translate their fitnesses into absolute estimates of catalytic activity for hundreds of ribozyme families. The analysis efficiently identified highly active ribozymes and estimated catalytic activity with good accuracy. The evolutionary dynamics follow Fisher's Fundamental Theorem of Natural Selection and a corollary, permitting retrospective inference of the distribution of fitness and activity in the random sequence pool for the first time. The frequency distribution of rate constants appears to be log-normal, with a surprisingly steep dropoff at higher activity, consistent with a mechanism for the emergence of activity as the product of many independent contributions. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. Actual distribution of Cronobacter spp. in industrial batches of powdered infant formula and consequences for performance of sampling strategies.

    PubMed

    Jongenburger, I; Reij, M W; Boer, E P J; Gorris, L G M; Zwietering, M H

    2011-11-15

    The actual spatial distribution of microorganisms within a batch of food influences the results of sampling for microbiological testing when this distribution is non-homogeneous. In the case of pathogens being non-homogeneously distributed, it markedly influences public health risk. This study investigated the spatial distribution of Cronobacter spp. in powdered infant formula (PIF) on industrial batch-scale for both a recalled batch as well a reference batch. Additionally, local spatial occurrence of clusters of Cronobacter cells was assessed, as well as the performance of typical sampling strategies to determine the presence of the microorganisms. The concentration of Cronobacter spp. was assessed in the course of the filling time of each batch, by taking samples of 333 g using the most probable number (MPN) enrichment technique. The occurrence of clusters of Cronobacter spp. cells was investigated by plate counting. From the recalled batch, 415 MPN samples were drawn. The expected heterogeneous distribution of Cronobacter spp. could be quantified from these samples, which showed no detectable level (detection limit of -2.52 log CFU/g) in 58% of samples, whilst in the remainder concentrations were found to be between -2.52 and 2.75 log CFU/g. The estimated average concentration in the recalled batch was -2.78 log CFU/g and a standard deviation of 1.10 log CFU/g. The estimated average concentration in the reference batch was -4.41 log CFU/g, with 99% of the 93 samples being below the detection limit. In the recalled batch, clusters of cells occurred sporadically in 8 out of 2290 samples of 1g taken. The two largest clusters contained 123 (2.09 log CFU/g) and 560 (2.75 log CFU/g) cells. Various sampling strategies were evaluated for the recalled batch. Taking more and smaller samples and keeping the total sampling weight constant, considerably improved the performance of the sampling plans to detect such a type of contaminated batch. Compared to random sampling, stratified random sampling improved the probability to detect the heterogeneous contamination. Copyright © 2011 Elsevier B.V. All rights reserved.

  12. Best Statistical Distribution of flood variables for Johor River in Malaysia

    NASA Astrophysics Data System (ADS)

    Salarpour Goodarzi, M.; Yusop, Z.; Yusof, F.

    2012-12-01

    A complex flood event is always characterized by a few characteristics such as flood peak, flood volume, and flood duration, which might be mutually correlated. This study explored the statistical distribution of peakflow, flood duration and flood volume at Rantau Panjang gauging station on the Johor River in Malaysia. Hourly data were recorded for 45 years. The data were analysed based on water year (July - June). Five distributions namely, Log Normal, Generalize Pareto, Log Pearson, Normal and Generalize Extreme Value (GEV) were used to model the distribution of all the three variables. Anderson-Darling and Kolmogorov-Smirnov goodness-of-fit tests were used to evaluate the best fit. Goodness-of-fit tests at 5% level of significance indicate that all the models can be used to model the distribution of peakflow, flood duration and flood volume. However, Generalize Pareto distribution is found to be the most suitable model when tested with the Anderson-Darling test and the, Kolmogorov-Smirnov suggested that GEV is the best for peakflow. The result of this research can be used to improve flood frequency analysis. Comparison between Generalized Extreme Value, Generalized Pareto and Log Pearson distributions in the Cumulative Distribution Function of peakflow

  13. Box-Cox transformation of firm size data in statistical analysis

    NASA Astrophysics Data System (ADS)

    Chen, Ting Ting; Takaishi, Tetsuya

    2014-03-01

    Firm size data usually do not show the normality that is often assumed in statistical analysis such as regression analysis. In this study we focus on two firm size data: the number of employees and sale. Those data deviate considerably from a normal distribution. To improve the normality of those data we transform them by the Box-Cox transformation with appropriate parameters. The Box-Cox transformation parameters are determined so that the transformed data best show the kurtosis of a normal distribution. It is found that the two firm size data transformed by the Box-Cox transformation show strong linearity. This indicates that the number of employees and sale have the similar property as a firm size indicator. The Box-Cox parameters obtained for the firm size data are found to be very close to zero. In this case the Box-Cox transformations are approximately a log-transformation. This suggests that the firm size data we used are approximately log-normal distributions.

  14. A Feasibility Study of Expanding the F404 Aircraft Engine Repair Capability at the Aircraft Intermediate Maintenance Department

    DTIC Science & Technology

    1993-06-01

    1 A. OBJECTIVES ............. .... .................. 1 B. HISTORY ................... .................... 2 C...utilization, and any additional manpower requirements at the "selected" AIMD’s. B. HISTORY Until late 1991 both NADEP JAX and NADEP North Island (NORIS...TRIANGULAR OR ALL LOG NORMAL DISTRIBUTIONS FOR SERVICE TIMES AT AIND CECIL FIELD maintenance/ Triangular Log Normal MAZDA Difference Differe•ce Supply

  15. Erosion associated with cable and tractor logging in northwestern California

    Treesearch

    R. M. Rice; P. A. Datzman

    1981-01-01

    Abstract - Erosion and site conditions were measured at 102 logged plots in northwestern California. Erosion averaged 26.8 m 3 /ha. A log-normal distribution was a better fit to the data. The antilog of the mean of the logarithms of erosion was 3.2 m 3 /ha. The Coast District Erosion Hazard Rating was a poor predictor of erosion related to logging. In a new equation...

  16. The emergence of different tail exponents in the distributions of firm size variables

    NASA Astrophysics Data System (ADS)

    Ishikawa, Atushi; Fujimoto, Shouji; Watanabe, Tsutomu; Mizuno, Takayuki

    2013-05-01

    We discuss a mechanism through which inversion symmetry (i.e., invariance of a joint probability density function under the exchange of variables) and Gibrat’s law generate power-law distributions with different tail exponents. Using a dataset of firm size variables, that is, tangible fixed assets K, the number of workers L, and sales Y, we confirm that these variables have power-law tails with different exponents, and that inversion symmetry and Gibrat’s law hold. Based on these findings, we argue that there exists a plane in the three dimensional space (logK,logL,logY), with respect to which the joint probability density function for the three variables is invariant under the exchange of variables. We provide empirical evidence suggesting that this plane fits the data well, and argue that the plane can be interpreted as the Cobb-Douglas production function, which has been extensively used in various areas of economics since it was first introduced almost a century ago.

  17. Statistical characterization of thermal plumes in turbulent thermal convection

    NASA Astrophysics Data System (ADS)

    Zhou, Sheng-Qi; Xie, Yi-Chao; Sun, Chao; Xia, Ke-Qing

    2016-09-01

    We report an experimental study on the statistical properties of the thermal plumes in turbulent thermal convection. A method has been proposed to extract the basic characteristics of thermal plumes from temporal temperature measurement inside the convection cell. It has been found that both plume amplitude A and cap width w , in a time domain, are approximately in the log-normal distribution. In particular, the normalized most probable front width is found to be a characteristic scale of thermal plumes, which is much larger than the thermal boundary layer thickness. Over a wide range of the Rayleigh number, the statistical characterizations of the thermal fluctuations of plumes, and the turbulent background, the plume front width and plume spacing have been discussed and compared with the theoretical predictions and morphological observations. For the most part good agreements have been found with the direct observations.

  18. Analysis of porosity distribution of large-scale porous media and their reconstruction by Langevin equation.

    PubMed

    Jafari, G Reza; Sahimi, Muhammad; Rasaei, M Reza; Tabar, M Reza Rahimi

    2011-02-01

    Several methods have been developed in the past for analyzing the porosity and other types of well logs for large-scale porous media, such as oil reservoirs, as well as their permeability distributions. We developed a method for analyzing the porosity logs ϕ(h) (where h is the depth) and similar data that are often nonstationary stochastic series. In this method one first generates a new stationary series based on the original data, and then analyzes the resulting series. It is shown that the series based on the successive increments of the log y(h)=ϕ(h+δh)-ϕ(h) is a stationary and Markov process, characterized by a Markov length scale h(M). The coefficients of the Kramers-Moyal expansion for the conditional probability density function (PDF) P(y,h|y(0),h(0)) are then computed. The resulting PDFs satisfy a Fokker-Planck (FP) equation, which is equivalent to a Langevin equation for y(h) that provides probabilistic predictions for the porosity logs. We also show that the Hurst exponent H of the self-affine distributions, which have been used in the past to describe the porosity logs, is directly linked to the drift and diffusion coefficients that we compute for the FP equation. Also computed are the level-crossing probabilities that provide insight into identifying the high or low values of the porosity beyond the depth interval in which the data have been measured. ©2011 American Physical Society

  19. Double stars with wide separations in the AGK3 - II. The wide binaries and the multiple systems*

    NASA Astrophysics Data System (ADS)

    Halbwachs, J.-L.; Mayor, M.; Udry, S.

    2017-02-01

    A large observation programme was carried out to measure the radial velocities of the components of a selection of common proper motion (CPM) stars to select the physical binaries. 80 wide binaries (WBs) were detected, and 39 optical pairs were identified. By adding CPM stars with separations close enough to be almost certain that they are physical, a bias-controlled sample of 116 WBs was obtained, and used to derive the distribution of separations from 100 to 30 000 au. The distribution obtained does not match the log-constant distribution, but agrees with the log-normal distribution. The spectroscopic binaries detected among the WB components were used to derive statistical information about the multiple systems. The close binaries in WBs seem to be like those detected in other field stars. As for the WBs, they seem to obey the log-normal distribution of periods. The number of quadruple systems agrees with the no correlation hypothesis; this indicates that an environment conducive to the formation of WBs does not favour the formation of subsystems with periods shorter than 10 yr.

  20. An iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions

    NASA Technical Reports Server (NTRS)

    Peters, B. C., Jr.; Walker, H. F.

    1975-01-01

    A general iterative procedure is given for determining the consistent maximum likelihood estimates of normal distributions. In addition, a local maximum of the log-likelihood function, Newtons's method, a method of scoring, and modifications of these procedures are discussed.

  1. Proposal of a method for evaluating tsunami risk using response-surface methodology

    NASA Astrophysics Data System (ADS)

    Fukutani, Y.

    2017-12-01

    Information on probabilistic tsunami inundation hazards is needed to define and evaluate tsunami risk. Several methods for calculating these hazards have been proposed (e.g. Løvholt et al. (2012), Thio (2012), Fukutani et al. (2014), Goda et al. (2015)). However, these methods are inefficient, and their calculation cost is high, since they require multiple tsunami numerical simulations, therefore lacking versatility. In this study, we proposed a simpler method for tsunami risk evaluation using response-surface methodology. Kotani et al. (2016) proposed an evaluation method for the probabilistic distribution of tsunami wave-height using a response-surface methodology. We expanded their study and developed a probabilistic distribution of tsunami inundation depth. We set the depth (x1) and the slip (x2) of an earthquake fault as explanatory variables and tsunami inundation depth (y) as an object variable. Subsequently, tsunami risk could be evaluated by conducting a Monte Carlo simulation, assuming that the generation probability of an earthquake follows a Poisson distribution, the probability distribution of tsunami inundation depth follows the distribution derived from a response-surface, and the damage probability of a target follows a log normal distribution. We applied the proposed method to a wood building located on the coast of Tokyo Bay. We implemented a regression analysis based on the results of 25 tsunami numerical calculations and developed a response-surface, which was defined as y=ax1+bx2+c (a:0.2615, b:3.1763, c=-1.1802). We assumed proper probabilistic distribution for earthquake generation, inundation height, and vulnerability. Based on these probabilistic distributions, we conducted Monte Carlo simulations of 1,000,000 years. We clarified that the expected damage probability of the studied wood building is 22.5%, assuming that an earthquake occurs. The proposed method is therefore a useful and simple way to evaluate tsunami risk using a response-surface and Monte Carlo simulation without conducting multiple tsunami numerical simulations.

  2. Mixture EMOS model for calibrating ensemble forecasts of wind speed.

    PubMed

    Baran, S; Lerch, S

    2016-03-01

    Ensemble model output statistics (EMOS) is a statistical tool for post-processing forecast ensembles of weather variables obtained from multiple runs of numerical weather prediction models in order to produce calibrated predictive probability density functions. The EMOS predictive probability density function is given by a parametric distribution with parameters depending on the ensemble forecasts. We propose an EMOS model for calibrating wind speed forecasts based on weighted mixtures of truncated normal (TN) and log-normal (LN) distributions where model parameters and component weights are estimated by optimizing the values of proper scoring rules over a rolling training period. The new model is tested on wind speed forecasts of the 50 member European Centre for Medium-range Weather Forecasts ensemble, the 11 member Aire Limitée Adaptation dynamique Développement International-Hungary Ensemble Prediction System ensemble of the Hungarian Meteorological Service, and the eight-member University of Washington mesoscale ensemble, and its predictive performance is compared with that of various benchmark EMOS models based on single parametric families and combinations thereof. The results indicate improved calibration of probabilistic and accuracy of point forecasts in comparison with the raw ensemble and climatological forecasts. The mixture EMOS model significantly outperforms the TN and LN EMOS methods; moreover, it provides better calibrated forecasts than the TN-LN combination model and offers an increased flexibility while avoiding covariate selection problems. © 2016 The Authors Environmetrics Published by JohnWiley & Sons Ltd.

  3. A hybrid probabilistic/spectral model of scalar mixing

    NASA Astrophysics Data System (ADS)

    Vaithianathan, T.; Collins, Lance

    2002-11-01

    In the probability density function (PDF) description of a turbulent reacting flow, the local temperature and species concentration are replaced by a high-dimensional joint probability that describes the distribution of states in the fluid. The PDF has the great advantage of rendering the chemical reaction source terms closed, independent of their complexity. However, molecular mixing, which involves two-point information, must be modeled. Indeed, the qualitative shape of the PDF is sensitive to this modeling, hence the reliability of the model to predict even the closed chemical source terms rests heavily on the mixing model. We will present a new closure to the mixing based on a spectral representation of the scalar field. The model is implemented as an ensemble of stochastic particles, each carrying scalar concentrations at different wavenumbers. Scalar exchanges within a given particle represent ``transfer'' while scalar exchanges between particles represent ``mixing.'' The equations governing the scalar concentrations at each wavenumber are derived from the eddy damped quasi-normal Markovian (or EDQNM) theory. The model correctly predicts the evolution of an initial double delta function PDF into a Gaussian as seen in the numerical study by Eswaran & Pope (1988). Furthermore, the model predicts the scalar gradient distribution (which is available in this representation) approaches log normal at long times. Comparisons of the model with data derived from direct numerical simulations will be shown.

  4. Evaluation of statistical treatments of left-censored environmental data using coincident uncensored data sets. II. Group comparisons

    USGS Publications Warehouse

    Antweiler, Ronald C.

    2015-01-01

    The main classes of statistical treatments that have been used to determine if two groups of censored environmental data arise from the same distribution are substitution methods, maximum likelihood (MLE) techniques, and nonparametric methods. These treatments along with using all instrument-generated data (IN), even those less than the detection limit, were evaluated by examining 550 data sets in which the true values of the censored data were known, and therefore “true” probabilities could be calculated and used as a yardstick for comparison. It was found that technique “quality” was strongly dependent on the degree of censoring present in the groups. For low degrees of censoring (<25% in each group), the Generalized Wilcoxon (GW) technique and substitution of √2/2 times the detection limit gave overall the best results. For moderate degrees of censoring, MLE worked best, but only if the distribution could be estimated to be normal or log-normal prior to its application; otherwise, GW was a suitable alternative. For higher degrees of censoring (each group >40% censoring), no technique provided reliable estimates of the true probability. Group size did not appear to influence the quality of the result, and no technique appeared to become better or worse than other techniques relative to group size. Finally, IN appeared to do very well relative to the other techniques regardless of censoring or group size.

  5. Distribution of runup heights of the December 26, 2004 tsunami in the Indian Ocean

    NASA Astrophysics Data System (ADS)

    Choi, Byung Ho; Hong, Sung Jin; Pelinovsky, Efim

    2006-07-01

    A massive earthquake with magnitude 9.3 occurred on December 26, 2004 off the northern Sumatra generated huge tsunami waves affected many coastal countries in the Indian Ocean. A number of field surveys have been performed after this tsunami event; in particular, several surveys in the south/east coast of India, Andaman and Nicobar Islands, Sri Lanka, Sumatra, Malaysia, and Thailand have been organized by the Korean Society of Coastal and Ocean Engineers from January to August 2005. Spatial distribution of the tsunami runup is used to analyze the distribution function of the wave heights on different coasts. Theoretical interpretation of this distribution is associated with random coastal bathymetry and coastline led to the log-normal functions. Observed data also are in a very good agreement with log-normal distribution confirming the important role of the variable ocean bathymetry in the formation of the irregular wave height distribution along the coasts.

  6. An estimate of field size distributions for selected sites in the major grain producing countries

    NASA Technical Reports Server (NTRS)

    Podwysocki, M. H.

    1977-01-01

    The field size distributions for the major grain producing countries of the World were estimated. LANDSAT-1 and 2 images were evaluated for two areas each in the United States, People's Republic of China, and the USSR. One scene each was evaluated for France, Canada, and India. Grid sampling was done for representative sub-samples of each image, measuring the long and short axes of each field; area was then calculated. Each of the resulting data sets was computer analyzed for their frequency distributions. Nearly all frequency distributions were highly peaked and skewed (shifted) towards small values, approaching that of either a Poisson or log-normal distribution. The data were normalized by a log transformation, creating a Gaussian distribution which has moments readily interpretable and useful for estimating the total population of fields. Resultant predictors of the field size estimates are discussed.

  7. The missing impact craters on Venus

    NASA Technical Reports Server (NTRS)

    Speidel, D. H.

    1993-01-01

    The size-frequency pattern of the 842 impact craters on Venus measured to date can be well described (across four standard deviation units) as a single log normal distribution with a mean crater diameter of 14.5 km. This result was predicted in 1991 on examination of the initial Magellan analysis. If this observed distribution is close to the real distribution, the 'missing' 90 percent of the small craters and the 'anomalous' lack of surface splotches may thus be neither missing nor anomalous. I think that the missing craters and missing splotches can be satisfactorily explained by accepting that the observed distribution approximates the real one, that it is not craters that are missing but the impactors. What you see is what you got. The implication that Venus crossing impactors would have the same type of log normal distribution is consistent with recently described distribution for terrestrial craters and Earth crossing asteroids.

  8. CYP1A2 in a smoking and a non-smoking population; correlation of urinary and salivary phenotypic ratios.

    PubMed

    Woolridge, Helen; Williams, John; Cronin, Anna; Evans, Nicola; Steventon, Glyn B

    2004-01-01

    The use of caffeine as a probe for CYP1A2 phenotyping has been extensively investigated over the last 25 years. Numerous metabolic ratios have been employed and various biological fluids analysed for caffeine and its metabolites. These investigations have used non-smoking, smoking and numerous disease populations to investigate the role of CYP1A2 in possible disease aetiology and for induction and inhibition studies in vivo using dietary, environmental and pharmaceutical compounds. This investigation found that the 17X/137X CYP1A2 metabolic ratio in a 5 h saliva sample and 0-5 h urine collection was not normally distributed in both a non-smoking and a smoking population. The urinary and salivary CYP1A2 metabolic ratio was log normally distributed in the non-smoking population but the smoking population showed a bi- (or tri-)modal distribution on log transformation of both the urinary and salivary CYP1A2 metabolic ratios. The CYP1A2 metabolic ratios were significantly higher in the smoking population compared to the non-smoking population when both the urinary and salivary CYP1A2 metabolic ratios were analysed. These results indicate that urinary flow rate was not a factor in the variation in CYP1A2 phenotype in the non-smoking and smoking populations studied here. The increased CYP1A2 activity in the smoking population was probably due to induction of the CYP1A2 gene via the Ah receptor causing an increase in the concentration of CYP1A2 protein.

  9. A New Closed Form Approximation for BER for Optical Wireless Systems in Weak Atmospheric Turbulence

    NASA Astrophysics Data System (ADS)

    Kaushik, Rahul; Khandelwal, Vineet; Jain, R. C.

    2018-04-01

    Weak atmospheric turbulence condition in an optical wireless communication (OWC) is captured by log-normal distribution. The analytical evaluation of average bit error rate (BER) of an OWC system under weak turbulence is intractable as it involves the statistical averaging of Gaussian Q-function over log-normal distribution. In this paper, a simple closed form approximation for BER of OWC system under weak turbulence is given. Computation of BER for various modulation schemes is carried out using proposed expression. The results obtained using proposed expression compare favorably with those obtained using Gauss-Hermite quadrature approximation and Monte Carlo Simulations.

  10. A log-sinh transformation for data normalization and variance stabilization

    NASA Astrophysics Data System (ADS)

    Wang, Q. J.; Shrestha, D. L.; Robertson, D. E.; Pokhrel, P.

    2012-05-01

    When quantifying model prediction uncertainty, it is statistically convenient to represent model errors that are normally distributed with a constant variance. The Box-Cox transformation is the most widely used technique to normalize data and stabilize variance, but it is not without limitations. In this paper, a log-sinh transformation is derived based on a pattern of errors commonly seen in hydrological model predictions. It is suited to applications where prediction variables are positively skewed and the spread of errors is seen to first increase rapidly, then slowly, and eventually approach a constant as the prediction variable becomes greater. The log-sinh transformation is applied in two case studies, and the results are compared with one- and two-parameter Box-Cox transformations.

  11. The Adaptation of the Moth Pheromone Receptor Neuron to its Natural Stimulus

    NASA Astrophysics Data System (ADS)

    Kostal, Lubomir; Lansky, Petr; Rospars, Jean-Pierre

    2008-07-01

    We analyze the first phase of information transduction in the model of the olfactory receptor neuron of the male moth Antheraea polyphemus. We predict such stimulus characteristics that enable the system to perform optimally, i.e., to transfer as much information as possible. Few a priori constraints on the nature of stimulus and stimulus-to-signal transduction are assumed. The results are given in terms of stimulus distributions and intermittency factors which makes direct comparison with experimental data possible. Optimal stimulus is approximatelly described by exponential or log-normal probability density function which is in agreement with experiment and the predicted intermittency factors fall within the lowest range of observed values. The results are discussed with respect to electroantennogram measurements and behavioral observations.

  12. First Test of Stochastic Growth Theory for Langmuir Waves in Earth's Foreshock

    NASA Technical Reports Server (NTRS)

    Cairns, Iver H.; Robinson, P. A.

    1997-01-01

    This paper presents the first test of whether stochastic growth theory (SGT) can explain the detailed characteristics of Langmuir-like waves in Earth's foreshock. A period with unusually constant solar wind magnetic field is analyzed. The observed distributions P(logE) of wave fields E for two intervals with relatively constant spacecraft location (DIFF) are shown to agree well with the fundamental prediction of SGT, that P(logE) is Gaussian in log E. This stochastic growth can be accounted for semi-quantitatively in terms of standard foreshock beam parameters and a model developed for interplanetary type III bursts. Averaged over the entire period with large variations in DIFF, the P(logE) distribution is a power-law with index approximately -1; this is interpreted in terms of convolution of intrinsic, spatially varying P(logE) distributions with a probability function describing ISEE's residence time at a given DIFF. Wave data from this interval thus provide good observational evidence that SGT can sometimes explain the clumping, burstiness, persistence, and highly variable fields of the foreshock Langmuir-like waves.

  13. First test of stochastic growth theory for Langmuir waves in Earth's foreshock

    NASA Astrophysics Data System (ADS)

    Cairns, Iver H.; Robinson, P. A.

    This paper presents the first test of whether stochastic growth theory (SGT) can explain the detailed characteristics of Langmuir-like waves in Earth's foreshock. A period with unusually constant solar wind magnetic field is analyzed. The observed distributions P(log E) of wave fields E for two intervals with relatively constant spacecraft location (DIFF) are shown to agree well with the fundamental prediction of SGT, that P(log E) is Gaussian in log E. This stochastic growth can be accounted for semi-quantitatively in terms of standard foreshock beam parameters and a model developed for interplanetary type III bursts. Averaged over the entire period with large variations in DIFF, the P(log E) distribution is a power-law with index ˜ -1 this is interpreted in terms of convolution of intrinsic, spatially varying P(log E) distributions with a probability function describing ISEE's residence time at a given DIFF. Wave data from this interval thus provide good observational evidence that SGT can sometimes explain the clumping, burstiness, persistence, and highly variable fields of the foreshock Langmuir-like waves.

  14. Properties of the probability density function of the non-central chi-squared distribution

    NASA Astrophysics Data System (ADS)

    András, Szilárd; Baricz, Árpád

    2008-10-01

    In this paper we consider the probability density function (pdf) of a non-central [chi]2 distribution with arbitrary number of degrees of freedom. For this function we prove that can be represented as a finite sum and we deduce a partial derivative formula. Moreover, we show that the pdf is log-concave when the degrees of freedom is greater or equal than 2. At the end of this paper we present some Turán-type inequalities for this function and an elegant application of the monotone form of l'Hospital's rule in probability theory is given.

  15. Statistical characteristics of the spatial distribution of territorial contamination by radionuclides from the Chernobyl accident

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arutyunyan, R.V.; Bol`shov, L.A.; Vasil`ev, S.K.

    1994-06-01

    The objective of this study was to clarify a number of issues related to the spatial distribution of contaminants from the Chernobyl accident. The effects of local statistics were addressed by collecting and analyzing (for Cesium 137) soil samples from a number of regions, and it was found that sample activity differed by a factor of 3-5. The effect of local non-uniformity was estimated by modeling the distribution of the average activity of a set of five samples for each of the regions, with the spread in the activities for a {+-}2 range being equal to 25%. The statistical characteristicsmore » of the distribution of contamination were then analyzed and found to be a log-normal distribution with the standard deviation being a function of test area. All data for the Bryanskaya Oblast area were analyzed statistically and were adequately described by a log-normal function.« less

  16. Human Target Attainment Probabilities for Delafloxacin against Escherichia coli and Pseudomonas aeruginosa

    PubMed Central

    Hoover, Randall; Marra, Andrea; Duffy, Erin; Cammarata, Sue K

    2017-01-01

    Abstract Background Delafloxacin (DLX) is a broad-spectrum fluoroquinolone antibiotic under FDA review for the treatment of ABSSSI. Previous studies determined DLX bacterial stasis and 1-log10 bacterial reduction free AUC0-24 / MIC (fAUC0-24/MIC) targets for Escherichia coli (EC) and Pseudomonas aeruginosa (PA) in a mouse thigh infection model. The resulting PK/PD targets were used to predict DLX target attainment probabilities (TAP) in humans. Methods Monte Carlo simulations were used to estimate TAP with DLX 300 mg IV, q12hr. Human DLX plasma pharmacokinetics were determined in patients with ABSSSI in a Phase 3 clinical trial. Individual AUC values were analyzed and determined to be log-normally distributed. The parameters of the AUC distribution were used to simulate random values for fAUC24, which then were combined with random MIC values based on 2014–2015 US distributions of skin and soft tissue isolates of EC (n = 108) and PA (n = 40), to calculate PK/PD TAPs. Results DLX fAUC0-24/MIC targets for bacterial stasis and 1-log10 bacterial reduction for EC were 14.5 and 26.2, and for PA were 3.81 and 5.02, respectively. The Monte Carlo simulations for EC predicted TAPs of 98.7% for stasis at an MIC of 0.25 μg/mL, and 99.3% for 1-log10 bacterial reduction at an MIC of 0.12 μg/mL. The simulations for PA predicted TAPs of 97.3% for stasis and 86.5% for 1-log10 bacterial reduction at an MIC of 1 μg/mL. E. coli MIC (ug/mL) Target 0.008 0.015 0.03 0.06 0.12 0.25 0.5 1 Stasis 100 100 100 100 100 97.8 50.4 2.0 1-Log Kill 100 100 100 100 99.3 60.4 5.8 0.0 P. aeruginosa MIC (ug/mL) Target 0.03 0.06 0.12 0.25 0.5 1 2 4 5 Stasis 100 100 100 100 100 97.3 45.9 1.7 0.5 1-Log Kill 100 100 100 100 100 86.5 17.8 0.3 0.1 Conclusion DLX 300 mg IV, q12hr, should achieve fAUC24/MIC ratios that are adequate to treat ABSSSI caused by most contemporary isolates of EC and PA. For EC, isolates with DLX MICs ≤0.25 μg/mL comprised 73% of all isolates. For PA, isolates with DLX MICs ≤1 μg/mL comprised 88% of all isolates. Similar results would be expected for TAP with oral DLX 450 mg, q12hr. Disclosures R. Hoover, Melinta Therapeutics: Consultant, Consulting fee; A. Marra, Melinta Therapeutics: Employee, Salary; E. Duffy, Melinta Therapeutics: Employee, Salary; S. K. Cammarata, Melinta Therapeutics: Employee, Salary

  17. Stochastic Modeling Approach to the Incubation Time of Prionic Diseases

    NASA Astrophysics Data System (ADS)

    Ferreira, A. S.; da Silva, M. A.; Cressoni, J. C.

    2003-05-01

    Transmissible spongiform encephalopathies are neurodegenerative diseases for which prions are the attributed pathogenic agents. A widely accepted theory assumes that prion replication is due to a direct interaction between the pathologic (PrPSc) form and the host-encoded (PrPC) conformation, in a kind of autocatalytic process. Here we show that the overall features of the incubation time of prion diseases are readily obtained if the prion reaction is described by a simple mean-field model. An analytical expression for the incubation time distribution then follows by associating the rate constant to a stochastic variable log normally distributed. The incubation time distribution is then also shown to be log normal and fits the observed BSE (bovine spongiform encephalopathy) data very well. Computer simulation results also yield the correct BSE incubation time distribution at low PrPC densities.

  18. Parametric vs. non-parametric statistics of low resolution electromagnetic tomography (LORETA).

    PubMed

    Thatcher, R W; North, D; Biver, C

    2005-01-01

    This study compared the relative statistical sensitivity of non-parametric and parametric statistics of 3-dimensional current sources as estimated by the EEG inverse solution Low Resolution Electromagnetic Tomography (LORETA). One would expect approximately 5% false positives (classification of a normal as abnormal) at the P < .025 level of probability (two tailed test) and approximately 1% false positives at the P < .005 level. EEG digital samples (2 second intervals sampled 128 Hz, 1 to 2 minutes eyes closed) from 43 normal adult subjects were imported into the Key Institute's LORETA program. We then used the Key Institute's cross-spectrum and the Key Institute's LORETA output files (*.lor) as the 2,394 gray matter pixel representation of 3-dimensional currents at different frequencies. The mean and standard deviation *.lor files were computed for each of the 2,394 gray matter pixels for each of the 43 subjects. Tests of Gaussianity and different transforms were computed in order to best approximate a normal distribution for each frequency and gray matter pixel. The relative sensitivity of parametric vs. non-parametric statistics were compared using a "leave-one-out" cross validation method in which individual normal subjects were withdrawn and then statistically classified as being either normal or abnormal based on the remaining subjects. Log10 transforms approximated Gaussian distribution in the range of 95% to 99% accuracy. Parametric Z score tests at P < .05 cross-validation demonstrated an average misclassification rate of approximately 4.25%, and range over the 2,394 gray matter pixels was 27.66% to 0.11%. At P < .01 parametric Z score cross-validation false positives were 0.26% and ranged from 6.65% to 0% false positives. The non-parametric Key Institute's t-max statistic at P < .05 had an average misclassification error rate of 7.64% and ranged from 43.37% to 0.04% false positives. The nonparametric t-max at P < .01 had an average misclassification rate of 6.67% and ranged from 41.34% to 0% false positives of the 2,394 gray matter pixels for any cross-validated normal subject. In conclusion, adequate approximation to Gaussian distribution and high cross-validation can be achieved by the Key Institute's LORETA programs by using a log10 transform and parametric statistics, and parametric normative comparisons had lower false positive rates than the non-parametric tests.

  19. The Statistical Fermi Paradox

    NASA Astrophysics Data System (ADS)

    Maccone, C.

    In this paper is provided the statistical generalization of the Fermi paradox. The statistics of habitable planets may be based on a set of ten (and possibly more) astrobiological requirements first pointed out by Stephen H. Dole in his book Habitable planets for man (1964). The statistical generalization of the original and by now too simplistic Dole equation is provided by replacing a product of ten positive numbers by the product of ten positive random variables. This is denoted the SEH, an acronym standing for “Statistical Equation for Habitables”. The proof in this paper is based on the Central Limit Theorem (CLT) of Statistics, stating that the sum of any number of independent random variables, each of which may be ARBITRARILY distributed, approaches a Gaussian (i.e. normal) random variable (Lyapunov form of the CLT). It is then shown that: 1. The new random variable NHab, yielding the number of habitables (i.e. habitable planets) in the Galaxy, follows the log- normal distribution. By construction, the mean value of this log-normal distribution is the total number of habitable planets as given by the statistical Dole equation. 2. The ten (or more) astrobiological factors are now positive random variables. The probability distribution of each random variable may be arbitrary. The CLT in the so-called Lyapunov or Lindeberg forms (that both do not assume the factors to be identically distributed) allows for that. In other words, the CLT "translates" into the SEH by allowing an arbitrary probability distribution for each factor. This is both astrobiologically realistic and useful for any further investigations. 3. By applying the SEH it is shown that the (average) distance between any two nearby habitable planets in the Galaxy may be shown to be inversely proportional to the cubic root of NHab. This distance is denoted by new random variable D. The relevant probability density function is derived, which was named the "Maccone distribution" by Paul Davies in 2008. 4. A practical example is then given of how the SEH works numerically. Each of the ten random variables is uniformly distributed around its own mean value as given by Dole (1964) and a standard deviation of 10% is assumed. The conclusion is that the average number of habitable planets in the Galaxy should be around 100 million ±200 million, and the average distance in between any two nearby habitable planets should be about 88 light years ±40 light years. 5. The SEH results are matched against the results of the Statistical Drake Equation from reference 4. As expected, the number of currently communicating ET civilizations in the Galaxy turns out to be much smaller than the number of habitable planets (about 10,000 against 100 million, i.e. one ET civilization out of 10,000 habitable planets). The average distance between any two nearby habitable planets is much smaller that the average distance between any two neighbouring ET civilizations: 88 light years vs. 2000 light years, respectively. This means an ET average distance about 20 times higher than the average distance between any pair of adjacent habitable planets. 6. Finally, a statistical model of the Fermi Paradox is derived by applying the above results to the coral expansion model of Galactic colonization. The symbolic manipulator "Macsyma" is used to solve these difficult equations. A new random variable Tcol, representing the time needed to colonize a new planet is introduced, which follows the lognormal distribution, Then the new quotient random variable Tcol/D is studied and its probability density function is derived by Macsyma. Finally a linear transformation of random variables yields the overall time TGalaxy needed to colonize the whole Galaxy. We believe that our mathematical work in deriving this STATISTICAL Fermi Paradox is highly innovative and fruitful for the future.

  20. Spatiotemporal stick-slip phenomena in a coupled continuum-granular system

    NASA Astrophysics Data System (ADS)

    Ecke, Robert

    In sheared granular media, stick-slip behavior is ubiquitous, especially at very small shear rates and weak drive coupling. The resulting slips are characteristic of natural phenomena such as earthquakes and well as being a delicate probe of the collective dynamics of the granular system. In that spirit, we developed a laboratory experiment consisting of sheared elastic plates separated by a narrow gap filled with quasi-two-dimensional granular material (bi-dispersed nylon rods) . We directly determine the spatial and temporal distributions of strain displacements of the elastic continuum over 200 spatial points located adjacent to the gap. Slip events can be divided into large system-spanning events and spatially distributed smaller events. The small events have a probability distribution of event moment consistent with an M - 3 / 2 power law scaling and a Poisson distributed recurrence time distribution. Large events have a broad, log-normal moment distribution and a mean repetition time. As the applied normal force increases, there are fractionally more (less) large (small) events, and the large-event moment distribution broadens. The magnitude of the slip motion of the plates is well correlated with the root-mean-square displacements of the granular matter. Our results are consistent with mean field descriptions of statistical models of earthquakes and avalanches. We further explore the high-speed dynamics of system events and also discuss the effective granular friction of the sheared layer. We find that large events result from stored elastic energy in the plates in this coupled granular-continuum system.

  1. Stochastic approach to the derivation of emission limits for wastewater treatment plants.

    PubMed

    Stransky, D; Kabelkova, I; Bares, V

    2009-01-01

    Stochastic approach to the derivation of WWTP emission limits meeting probabilistically defined environmental quality standards (EQS) is presented. The stochastic model is based on the mixing equation with input data defined by probability density distributions and solved by Monte Carlo simulations. The approach was tested on a study catchment for total phosphorus (P(tot)). The model assumes input variables independency which was proved for the dry-weather situation. Discharges and P(tot) concentrations both in the study creek and WWTP effluent follow log-normal probability distribution. Variation coefficients of P(tot) concentrations differ considerably along the stream (c(v)=0.415-0.884). The selected value of the variation coefficient (c(v)=0.420) affects the derived mean value (C(mean)=0.13 mg/l) of the P(tot) EQS (C(90)=0.2 mg/l). Even after supposed improvement of water quality upstream of the WWTP to the level of the P(tot) EQS, the WWTP emission limits calculated would be lower than the values of the best available technology (BAT). Thus, minimum dilution ratios for the meaningful application of the combined approach to the derivation of P(tot) emission limits for Czech streams are discussed.

  2. Estimation of design floods in ungauged catchments using a regional index flood method. A case study of Lake Victoria Basin in Kenya

    NASA Astrophysics Data System (ADS)

    Nobert, Joel; Mugo, Margaret; Gadain, Hussein

    Reliable estimation of flood magnitudes corresponding to required return periods, vital for structural design purposes, is impacted by lack of hydrological data in the study area of Lake Victoria Basin in Kenya. Use of regional information, derived from data at gauged sites and regionalized for use at any location within a homogenous region, would improve the reliability of the design flood estimation. Therefore, the regional index flood method has been applied. Based on data from 14 gauged sites, a delineation of the basin into two homogenous regions was achieved using elevation variation (90-m DEM), spatial annual rainfall pattern and Principal Component Analysis of seasonal rainfall patterns (from 94 rainfall stations). At site annual maximum series were modelled using the Log normal (LN) (3P), Log Logistic Distribution (LLG), Generalized Extreme Value (GEV) and Log Pearson Type 3 (LP3) distributions. The parameters of the distributions were estimated using the method of probability weighted moments. Goodness of fit tests were applied and the GEV was identified as the most appropriate model for each site. Based on the GEV model, flood quantiles were estimated and regional frequency curves derived from the averaged at site growth curves. Using the least squares regression method, relationships were developed between the index flood, which is defined as the Mean Annual Flood (MAF) and catchment characteristics. The relationships indicated area, mean annual rainfall and altitude were the three significant variables that greatly influence the index flood. Thereafter, estimates of flood magnitudes in ungauged catchments within a homogenous region were estimated from the derived equations for index flood and quantiles from the regional curves. These estimates will improve flood risk estimation and to support water management and engineering decisions and actions.

  3. 3D modeling of effects of increased oxygenation and activity concentration in tumors treated with radionuclides and antiangiogenic drugs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lagerloef, Jakob H.; Kindblom, Jon; Bernhardt, Peter

    Purpose: Formation of new blood vessels (angiogenesis) in response to hypoxia is a fundamental event in the process of tumor growth and metastatic dissemination. However, abnormalities in tumor neovasculature often induce increased interstitial pressure (IP) and further reduce oxygenation (pO{sub 2}) of tumor cells. In radiotherapy, well-oxygenated tumors favor treatment. Antiangiogenic drugs may lower IP in the tumor, improving perfusion, pO{sub 2} and drug uptake, by reducing the number of malfunctioning vessels in the tissue. This study aims to create a model for quantifying the effects of altered pO{sub 2}-distribution due to antiangiogenic treatment in combination with radionuclide therapy. Methods:more » Based on experimental data, describing the effects of antiangiogenic agents on oxygenation of GlioblastomaMultiforme (GBM), a single cell based 3D model, including 10{sup 10} tumor cells, was developed, showing how radionuclide therapy response improves as tumor oxygenation approaches normal tissue levels. The nuclides studied were {sup 90}Y, {sup 131}I, {sup 177}Lu, and {sup 211}At. The absorbed dose levels required for a tumor control probability (TCP) of 0.990 are compared for three different log-normal pO{sub 2}-distributions: {mu}{sub 1} = 2.483, {sigma}{sub 1} = 0.711; {mu}{sub 2} = 2.946, {sigma}{sub 2} = 0.689; {mu}{sub 3} = 3.689, and {sigma}{sub 3} = 0.330. The normal tissue absorbed doses will, in turn, depend on this. These distributions were chosen to represent the expected oxygen levels in an untreated hypoxic tumor, a hypoxic tumor treated with an anti-VEGF agent, and in normal, fully-oxygenated tissue, respectively. The former two are fitted to experimental data. The geometric oxygen distributions are simulated using two different patterns: one Monte Carlo based and one radially increasing, while keeping the log-normal volumetric distributions intact. Oxygen and activity are distributed, according to the same pattern. Results: As tumor pO{sub 2} approaches normal tissue levels, the therapeutic effect is improved so that the normal tissue absorbed doses can be decreased by more than 95%, while retaining TCP, in the most favorable scenario and by up to about 80% with oxygen levels previously achieved in vivo, when the least favourable oxygenation case is used as starting point. The major difference occurs in poorly oxygenated cells. This is also where the pO{sub 2}-dependence of the oxygen enhancement ratio is maximal. Conclusions: Improved tumor oxygenation together with increased radionuclide uptake show great potential for optimising treatment strategies, leaving room for successive treatments, or lowering absorbed dose to normal tissues, due to increased tumor response. Further studies of the concomitant use of antiangiogenic drugs and radionuclide therapy therefore appear merited.« less

  4. Improvement of Reynolds-Stress and Triple-Product Lag Models

    NASA Technical Reports Server (NTRS)

    Olsen, Michael E.; Lillard, Randolph P.

    2017-01-01

    The Reynolds-stress and triple product Lag models were created with a normal stress distribution which was denied by a 4:3:2 distribution of streamwise, spanwise and wall normal stresses, and a ratio of r(sub w) = 0.3k in the log layer region of high Reynolds number flat plate flow, which implies R11(+)= [4/(9/2)*.3] approximately 2.96. More recent measurements show a more complex picture of the log layer region at high Reynolds numbers. The first cut at improving these models along with the direction for future refinements is described. Comparison with recent high Reynolds number data shows areas where further work is needed, but also shows inclusion of the modeled turbulent transport terms improve the prediction where they influence the solution. Additional work is needed to make the model better match experiment, but there is significant improvement in many of the details of the log layer behavior.

  5. 210Po Log-normal distribution in human urines: Survey from Central Italy people

    PubMed Central

    Sisti, D.; Rocchi, M. B. L.; Meli, M. A.; Desideri, D.

    2009-01-01

    The death in London of the former secret service agent Alexander Livtinenko on 23 November 2006 generally attracted the attention of the public to the rather unknown radionuclide 210Po. This paper presents the results of a monitoring programme of 210Po background levels in the urines of noncontaminated people living in Central Italy (near the Republic of S. Marino). The relationship between age, sex, years of smoking, number of cigarettes per day, and 210Po concentration was also studied. The results indicated that the urinary 210Po concentration follows a surprisingly perfect Log-normal distribution. Log 210Po concentrations were positively correlated to age (p < 0.0001), number of daily smoked cigarettes (p = 0.006), and years of smoking (p = 0.021), and associated to sex (p = 0.019). Consequently, this study provides upper reference limits for each sub-group identified by significantly predictive variables. PMID:19750019

  6. Mean Excess Function as a method of identifying sub-exponential tails: Application to extreme daily rainfall

    NASA Astrophysics Data System (ADS)

    Nerantzaki, Sofia; Papalexiou, Simon Michael

    2017-04-01

    Identifying precisely the distribution tail of a geophysical variable is tough, or, even impossible. First, the tail is the part of the distribution for which we have the less empirical information available; second, a universally accepted definition of tail does not and cannot exist; and third, a tail may change over time due to long-term changes. Unfortunately, the tail is the most important part of the distribution as it dictates the estimates of exceedance probabilities or return periods. Fortunately, based on their tail behavior, probability distributions can be generally categorized into two major families, i.e., sub-exponentials (heavy-tailed) and hyper-exponentials (light-tailed). This study aims to update the Mean Excess Function (MEF), providing a useful tool in order to asses which type of tail better describes empirical data. The MEF is based on the mean value of a variable over a threshold and results in a zero slope regression line when applied for the Exponential distribution. Here, we construct slope confidence intervals for the Exponential distribution as functions of sample size. The validation of the method using Monte Carlo techniques on four theoretical distributions covering major tail cases (Pareto type II, Log-normal, Weibull and Gamma) revealed that it performs well especially for large samples. Finally, the method is used to investigate the behavior of daily rainfall extremes; thousands of rainfall records were examined, from all over the world and with sample size over 100 years, revealing that heavy-tailed distributions can describe more accurately rainfall extremes.

  7. Multiplicative processes in visual cognition

    NASA Astrophysics Data System (ADS)

    Credidio, H. F.; Teixeira, E. N.; Reis, S. D. S.; Moreira, A. A.; Andrade, J. S.

    2014-03-01

    The Central Limit Theorem (CLT) is certainly one of the most important results in the field of statistics. The simple fact that the addition of many random variables can generate the same probability curve, elucidated the underlying process for a broad spectrum of natural systems, ranging from the statistical distribution of human heights to the distribution of measurement errors, to mention a few. An extension of the CLT can be applied to multiplicative processes, where a given measure is the result of the product of many random variables. The statistical signature of these processes is rather ubiquitous, appearing in a diverse range of natural phenomena, including the distributions of incomes, body weights, rainfall, and fragment sizes in a rock crushing process. Here we corroborate results from previous studies which indicate the presence of multiplicative processes in a particular type of visual cognition task, namely, the visual search for hidden objects. Precisely, our results from eye-tracking experiments show that the distribution of fixation times during visual search obeys a log-normal pattern, while the fixational radii of gyration follow a power-law behavior.

  8. Study on probability distribution of prices in electricity market: A case study of zhejiang province, china

    NASA Astrophysics Data System (ADS)

    Zhou, H.; Chen, B.; Han, Z. X.; Zhang, F. Q.

    2009-05-01

    The study on probability density function and distribution function of electricity prices contributes to the power suppliers and purchasers to estimate their own management accurately, and helps the regulator monitor the periods deviating from normal distribution. Based on the assumption of normal distribution load and non-linear characteristic of the aggregate supply curve, this paper has derived the distribution of electricity prices as the function of random variable of load. The conclusion has been validated with the electricity price data of Zhejiang market. The results show that electricity prices obey normal distribution approximately only when supply-demand relationship is loose, whereas the prices deviate from normal distribution and present strong right-skewness characteristic. Finally, the real electricity markets also display the narrow-peak characteristic when undersupply occurs.

  9. Application of at-site peak-streamflow frequency analyses for very low annual exceedance probabilities

    USGS Publications Warehouse

    Asquith, William H.; Kiang, Julie E.; Cohn, Timothy A.

    2017-07-17

    The U.S. Geological Survey (USGS), in cooperation with the U.S. Nuclear Regulatory Commission, has investigated statistical methods for probabilistic flood hazard assessment to provide guidance on very low annual exceedance probability (AEP) estimation of peak-streamflow frequency and the quantification of corresponding uncertainties using streamgage-specific data. The term “very low AEP” implies exceptionally rare events defined as those having AEPs less than about 0.001 (or 1 × 10–3 in scientific notation or for brevity 10–3). Such low AEPs are of great interest to those involved with peak-streamflow frequency analyses for critical infrastructure, such as nuclear power plants. Flood frequency analyses at streamgages are most commonly based on annual instantaneous peak streamflow data and a probability distribution fit to these data. The fitted distribution provides a means to extrapolate to very low AEPs. Within the United States, the Pearson type III probability distribution, when fit to the base-10 logarithms of streamflow, is widely used, but other distribution choices exist. The USGS-PeakFQ software, implementing the Pearson type III within the Federal agency guidelines of Bulletin 17B (method of moments) and updates to the expected moments algorithm (EMA), was specially adapted for an “Extended Output” user option to provide estimates at selected AEPs from 10–3 to 10–6. Parameter estimation methods, in addition to product moments and EMA, include L-moments, maximum likelihood, and maximum product of spacings (maximum spacing estimation). This study comprehensively investigates multiple distributions and parameter estimation methods for two USGS streamgages (01400500 Raritan River at Manville, New Jersey, and 01638500 Potomac River at Point of Rocks, Maryland). The results of this study specifically involve the four methods for parameter estimation and up to nine probability distributions, including the generalized extreme value, generalized log-normal, generalized Pareto, and Weibull. Uncertainties in streamflow estimates for corresponding AEP are depicted and quantified as two primary forms: quantile (aleatoric [random sampling] uncertainty) and distribution-choice (epistemic [model] uncertainty). Sampling uncertainties of a given distribution are relatively straightforward to compute from analytical or Monte Carlo-based approaches. Distribution-choice uncertainty stems from choices of potentially applicable probability distributions for which divergence among the choices increases as AEP decreases. Conventional goodness-of-fit statistics, such as Cramér-von Mises, and L-moment ratio diagrams are demonstrated in order to hone distribution choice. The results generally show that distribution choice uncertainty is larger than sampling uncertainty for very low AEP values.

  10. Empirical study of the tails of mutual fund size

    NASA Astrophysics Data System (ADS)

    Schwarzkopf, Yonathan; Farmer, J. Doyne

    2010-06-01

    The mutual fund industry manages about a quarter of the assets in the U.S. stock market and thus plays an important role in the U.S. economy. The question of how much control is concentrated in the hands of the largest players is best quantitatively discussed in terms of the tail behavior of the mutual fund size distribution. We study the distribution empirically and show that the tail is much better described by a log-normal than a power law, indicating less concentration than, for example, personal income. The results are highly statistically significant and are consistent across fifteen years. This contradicts a recent theory concerning the origin of the power law tails of the trading volume distribution. Based on the analysis in a companion paper, the log-normality is to be expected, and indicates that the distribution of mutual funds remains perpetually out of equilibrium.

  11. An evaluation of procedures to estimate monthly precipitation probabilities

    NASA Astrophysics Data System (ADS)

    Legates, David R.

    1991-01-01

    Many frequency distributions have been used to evaluate monthly precipitation probabilities. Eight of these distributions (including Pearson type III, extreme value, and transform normal probability density functions) are comparatively examined to determine their ability to represent accurately variations in monthly precipitation totals for global hydroclimatological analyses. Results indicate that a modified version of the Box-Cox transform-normal distribution more adequately describes the 'true' precipitation distribution than does any of the other methods. This assessment was made using a cross-validation procedure for a global network of 253 stations for which at least 100 years of monthly precipitation totals were available.

  12. On the log-normality of historical magnetic-storm intensity statistics: implications for extreme-event probabilities

    USGS Publications Warehouse

    Love, Jeffrey J.; Rigler, E. Joshua; Pulkkinen, Antti; Riley, Pete

    2015-01-01

    An examination is made of the hypothesis that the statistics of magnetic-storm-maximum intensities are the realization of a log-normal stochastic process. Weighted least-squares and maximum-likelihood methods are used to fit log-normal functions to −Dst storm-time maxima for years 1957-2012; bootstrap analysis is used to established confidence limits on forecasts. Both methods provide fits that are reasonably consistent with the data; both methods also provide fits that are superior to those that can be made with a power-law function. In general, the maximum-likelihood method provides forecasts having tighter confidence intervals than those provided by weighted least-squares. From extrapolation of maximum-likelihood fits: a magnetic storm with intensity exceeding that of the 1859 Carrington event, −Dst≥850 nT, occurs about 1.13 times per century and a wide 95% confidence interval of [0.42,2.41] times per century; a 100-yr magnetic storm is identified as having a −Dst≥880 nT (greater than Carrington) but a wide 95% confidence interval of [490,1187] nT.

  13. Are CO Observations of Interstellar Clouds Tracing the H2?

    NASA Astrophysics Data System (ADS)

    Federrath, Christoph; Glover, S. C. O.; Klessen, R. S.; Mac Low, M.

    2010-01-01

    Interstellar clouds are commonly observed through the emission of rotational transitions from carbon monoxide (CO). However, the abundance ratio of CO to molecular hydrogen (H2), which is the most abundant molecule in molecular clouds is only about 10-4. This raises the important question of whether the observed CO emission is actually tracing the bulk of the gas in these clouds, and whether it can be used to derive quantities like the total mass of the cloud, the gas density distribution function, the fractal dimension, and the velocity dispersion--size relation. To evaluate the usability and accuracy of CO as a tracer for H2 gas, we generate synthetic observations of hydrodynamical models that include a detailed chemical network to follow the formation and photo-dissociation of H2 and CO. These three-dimensional models of turbulent interstellar cloud formation self-consistently follow the coupled thermal, dynamical and chemical evolution of 32 species, with a particular focus on H2 and CO (Glover et al. 2009). We find that CO primarily traces the dense gas in the clouds, however, with a significant scatter due to turbulent mixing and self-shielding of H2 and CO. The H2 probability distribution function (PDF) is well-described by a log-normal distribution. In contrast, the CO column density PDF has a strongly non-Gaussian low-density wing, not at all consistent with a log-normal distribution. Centroid velocity statistics show that CO is more intermittent than H2, leading to an overestimate of the velocity scaling exponent in the velocity dispersion--size relation. With our systematic comparison of H2 and CO data from the numerical models, we hope to provide a statistical formula to correct for the bias of CO observations. CF acknowledges financial support from a Kade Fellowship of the American Museum of Natural History.

  14. Exact Extremal Statistics in the Classical 1D Coulomb Gas

    NASA Astrophysics Data System (ADS)

    Dhar, Abhishek; Kundu, Anupam; Majumdar, Satya N.; Sabhapandit, Sanjib; Schehr, Grégory

    2017-08-01

    We consider a one-dimensional classical Coulomb gas of N -like charges in a harmonic potential—also known as the one-dimensional one-component plasma. We compute, analytically, the probability distribution of the position xmax of the rightmost charge in the limit of large N . We show that the typical fluctuations of xmax around its mean are described by a nontrivial scaling function, with asymmetric tails. This distribution is different from the Tracy-Widom distribution of xmax for Dyson's log gas. We also compute the large deviation functions of xmax explicitly and show that the system exhibits a third-order phase transition, as in the log gas. Our theoretical predictions are verified numerically.

  15. Clinical impact of dosimetric changes for volumetric modulated arc therapy in log file-based patient dose calculations.

    PubMed

    Katsuta, Yoshiyuki; Kadoya, Noriyuki; Fujita, Yukio; Shimizu, Eiji; Matsunaga, Kenichi; Matsushita, Haruo; Majima, Kazuhiro; Jingu, Keiichi

    2017-10-01

    A log file-based method cannot detect dosimetric changes due to linac component miscalibration because log files are insensitive to miscalibration. Herein, clinical impacts of dosimetric changes on a log file-based method were determined. Five head-and-neck and five prostate plans were applied. Miscalibration-simulated log files were generated by inducing a linac component miscalibration into the log file. Miscalibration magnitudes for leaf, gantry, and collimator at the general tolerance level were ±0.5mm, ±1°, and ±1°, respectively, and at a tighter tolerance level achievable on current linac were ±0.3mm, ±0.5°, and ±0.5°, respectively. Re-calculations were performed on patient anatomy using log file data. Changes in tumor control probability/normal tissue complication probability from treatment planning system dose to re-calculated dose at the general tolerance level was 1.8% on planning target volume (PTV) and 2.4% on organs at risk (OARs) in both plans. These changes at the tighter tolerance level were improved to 1.0% on PTV and to 1.5% on OARs, with a statistically significant difference. We determined the clinical impacts of dosimetric changes on a log file-based method using a general tolerance level and a tighter tolerance level for linac miscalibration and found that a tighter tolerance level significantly improved the accuracy of the log file-based method. Copyright © 2017 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.

  16. Estimation of distributional parameters for censored trace level water quality data: 1. Estimation techniques

    USGS Publications Warehouse

    Gilliom, Robert J.; Helsel, Dennis R.

    1986-01-01

    A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.

  17. Estimation of distributional parameters for censored trace level water quality data. 1. Estimation Techniques

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilliom, R.J.; Helsel, D.R.

    1986-02-01

    A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensoredmore » observations, for determining the best performing parameter estimation method for any particular data det. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.« less

  18. Estimation of distributional parameters for censored trace-level water-quality data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilliom, R.J.; Helsel, D.R.

    1984-01-01

    A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water-sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations,more » for determining the best-performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least-squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification. 6 figs., 6 tabs.« less

  19. Multinomial mixture model with heterogeneous classification probabilities

    USGS Publications Warehouse

    Holland, M.D.; Gray, B.R.

    2011-01-01

    Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct classification probability estimates when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. ?? 2010 Springer Science+Business Media, LLC.

  20. A comparison of reliability and conventional estimation of safe fatigue life and safe inspection intervals

    NASA Technical Reports Server (NTRS)

    Hooke, F. H.

    1972-01-01

    Both the conventional and reliability analyses for determining safe fatigue life are predicted on a population having a specified (usually log normal) distribution of life to collapse under a fatigue test load. Under a random service load spectrum, random occurrences of load larger than the fatigue test load may confront and cause collapse of structures which are weakened, though not yet to the fatigue test load. These collapses are included in reliability but excluded in conventional analysis. The theory of risk determination by each method is given, and several reasonably typical examples have been worked out, in which it transpires that if one excludes collapse through exceedance of the uncracked strength, the reliability and conventional analyses gave virtually identical probabilities of failure or survival.

  1. SIMULATED HUMAN ERROR PROBABILITY AND ITS APPLICATION TO DYNAMIC HUMAN FAILURE EVENTS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Herberger, Sarah M.; Boring, Ronald L.

    Abstract Objectives: Human reliability analysis (HRA) methods typically analyze human failure events (HFEs) at the overall task level. For dynamic HRA, it is important to model human activities at the subtask level. There exists a disconnect between dynamic subtask level and static task level that presents issues when modeling dynamic scenarios. For example, the SPAR-H method is typically used to calculate the human error probability (HEP) at the task level. As demonstrated in this paper, quantification in SPAR-H does not translate to the subtask level. Methods: Two different discrete distributions were generated for each SPAR-H Performance Shaping Factor (PSF) tomore » define the frequency of PSF levels. The first distribution was a uniform, or uninformed distribution that assumed the frequency of each PSF level was equally likely. The second non-continuous distribution took the frequency of PSF level as identified from an assessment of the HERA database. These two different approaches were created to identify the resulting distribution of the HEP. The resulting HEP that appears closer to the known distribution, a log-normal centered on 1E-3, is the more desirable. Each approach then has median, average and maximum HFE calculations applied. To calculate these three values, three events, A, B and C are generated from the PSF level frequencies comprised of subtasks. The median HFE selects the median PSF level from each PSF and calculates HEP. The average HFE takes the mean PSF level, and the maximum takes the maximum PSF level. The same data set of subtask HEPs yields starkly different HEPs when aggregated to the HFE level in SPAR-H. Results: Assuming that each PSF level in each HFE is equally likely creates an unrealistic distribution of the HEP that is centered at 1. Next the observed frequency of PSF levels was applied with the resulting HEP behaving log-normally with a majority of the values under 2.5% HEP. The median, average and maximum HFE calculations did yield different answers for the HFE. The HFE maximum grossly over estimates the HFE, while the HFE distribution occurs less than HFE median, and greater than HFE average. Conclusions: Dynamic task modeling can be perused through the framework of SPAR-H. Identification of distributions associated with each PSF needs to be defined, and may change depending upon the scenario. However it is very unlikely that each PSF level is equally likely as the resulting HEP distribution is strongly centered at 100%, which is unrealistic. Other distributions may need to be identified for PSFs, to facilitate the transition to dynamic task modeling. Additionally discrete distributions need to be exchanged for continuous so that simulations for the HFE can further advance. This paper provides a method to explore dynamic subtask to task translation and provides examples of the process using the SPAR-H method.« less

  2. Log file-based patient dose calculations of double-arc VMAT for head-and-neck radiotherapy.

    PubMed

    Katsuta, Yoshiyuki; Kadoya, Noriyuki; Fujita, Yukio; Shimizu, Eiji; Majima, Kazuhiro; Matsushita, Haruo; Takeda, Ken; Jingu, Keiichi

    2018-04-01

    The log file-based method cannot display dosimetric changes due to linac component miscalibration because of the insensitivity of log files to linac component miscalibration. The purpose of this study was to supply dosimetric changes in log file-based patient dose calculations for double-arc volumetric-modulated arc therapy (VMAT) in head-and-neck cases. Fifteen head-and-neck cases participated in this study. For each case, treatment planning system (TPS) doses were produced by double-arc and single-arc VMAT. Miscalibration-simulated log files were generated by inducing a leaf miscalibration of ±0.5 mm into the log files that were acquired during VMAT irradiation. Subsequently, patient doses were estimated using the miscalibration-simulated log files. For double-arc VMAT, regarding planning target volume (PTV), the change from TPS dose to miscalibration-simulated log file dose in D mean was 0.9 Gy and that for tumor control probability was 1.4%. As for organ-at-risks (OARs), the change in D mean was <0.7 Gy and normal tissue complication probability was <1.8%. A comparison between double-arc and single-arc VMAT for PTV showed statistically significant differences in the changes evaluated by D mean and radiobiological metrics (P < 0.01), even though the magnitude of these differences was small. Similarly, for OARs, the magnitude of these changes was found to be small. Using the log file-based method for PTV and OARs, the log file-based method estimate of patient dose using the double-arc VMAT has accuracy comparable to that obtained using the single-arc VMAT. Copyright © 2018 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.

  3. Tangled nature model of evolutionary dynamics reconsidered: Structural and dynamical effects of trait inheritance

    NASA Astrophysics Data System (ADS)

    Andersen, Christian Walther; Sibani, Paolo

    2016-05-01

    Based on the stochastic dynamics of interacting agents which reproduce, mutate, and die, the tangled nature model (TNM) describes key emergent features of biological and cultural ecosystems' evolution. While trait inheritance is not included in many applications, i.e., the interactions of an agent and those of its mutated offspring are taken to be uncorrelated, in the family of TNMs introduced in this work correlations of varying strength are parametrized by a positive integer K . We first show that the interactions generated by our rule are nearly independent of K . Consequently, the structural and dynamical effects of trait inheritance can be studied independently of effects related to the form of the interactions. We then show that changing K strengthens the core structure of the ecology, leads to population abundance distributions better approximated by log-normal probability densities, and increases the probability that a species extant at time tw also survives at t >tw . Finally, survival probabilities of species are shown to decay as powers of the ratio t /tw , a so-called pure aging behavior usually seen in glassy systems of physical origin. We find a quantitative dynamical effect of trait inheritance, namely, that increasing the value of K numerically decreases the decay exponent of the species survival probability.

  4. Tangled nature model of evolutionary dynamics reconsidered: Structural and dynamical effects of trait inheritance.

    PubMed

    Andersen, Christian Walther; Sibani, Paolo

    2016-05-01

    Based on the stochastic dynamics of interacting agents which reproduce, mutate, and die, the tangled nature model (TNM) describes key emergent features of biological and cultural ecosystems' evolution. While trait inheritance is not included in many applications, i.e., the interactions of an agent and those of its mutated offspring are taken to be uncorrelated, in the family of TNMs introduced in this work correlations of varying strength are parametrized by a positive integer K. We first show that the interactions generated by our rule are nearly independent of K. Consequently, the structural and dynamical effects of trait inheritance can be studied independently of effects related to the form of the interactions. We then show that changing K strengthens the core structure of the ecology, leads to population abundance distributions better approximated by log-normal probability densities, and increases the probability that a species extant at time t_{w} also survives at t>t_{w}. Finally, survival probabilities of species are shown to decay as powers of the ratio t/t_{w}, a so-called pure aging behavior usually seen in glassy systems of physical origin. We find a quantitative dynamical effect of trait inheritance, namely, that increasing the value of K numerically decreases the decay exponent of the species survival probability.

  5. Statistical distributions of ultra-low dose CT sinograms and their fundamental limits

    NASA Astrophysics Data System (ADS)

    Lee, Tzu-Cheng; Zhang, Ruoqiao; Alessio, Adam M.; Fu, Lin; De Man, Bruno; Kinahan, Paul E.

    2017-03-01

    Low dose CT imaging is typically constrained to be diagnostic. However, there are applications for even lowerdose CT imaging, including image registration across multi-frame CT images and attenuation correction for PET/CT imaging. We define this as the ultra-low-dose (ULD) CT regime where the exposure level is a factor of 10 lower than current low-dose CT technique levels. In the ULD regime it is possible to use statistically-principled image reconstruction methods that make full use of the raw data information. Since most statistical based iterative reconstruction methods are based on the assumption of that post-log noise distribution is close to Poisson or Gaussian, our goal is to understand the statistical distribution of ULD CT data with different non-positivity correction methods, and to understand when iterative reconstruction methods may be effective in producing images that are useful for image registration or attenuation correction in PET/CT imaging. We first used phantom measurement and calibrated simulation to reveal how the noise distribution deviate from normal assumption under the ULD CT flux environment. In summary, our results indicate that there are three general regimes: (1) Diagnostic CT, where post-log data are well modeled by normal distribution. (2) Lowdose CT, where normal distribution remains a reasonable approximation and statistically-principled (post-log) methods that assume a normal distribution have an advantage. (3) An ULD regime that is photon-starved and the quadratic approximation is no longer effective. For instance, a total integral density of 4.8 (ideal pi for 24 cm of water) for 120kVp, 0.5mAs of radiation source is the maximum pi value where a definitive maximum likelihood value could be found. This leads to fundamental limits in the estimation of ULD CT data when using a standard data processing stream

  6. Analyzing repeated measures semi-continuous data, with application to an alcohol dependence study.

    PubMed

    Liu, Lei; Strawderman, Robert L; Johnson, Bankole A; O'Quigley, John M

    2016-02-01

    Two-part random effects models (Olsen and Schafer,(1) Tooze et al.(2)) have been applied to repeated measures of semi-continuous data, characterized by a mixture of a substantial proportion of zero values and a skewed distribution of positive values. In the original formulation of this model, the natural logarithm of the positive values is assumed to follow a normal distribution with a constant variance parameter. In this article, we review and consider three extensions of this model, allowing the positive values to follow (a) a generalized gamma distribution, (b) a log-skew-normal distribution, and (c) a normal distribution after the Box-Cox transformation. We allow for the possibility of heteroscedasticity. Maximum likelihood estimation is shown to be conveniently implemented in SAS Proc NLMIXED. The performance of the methods is compared through applications to daily drinking records in a secondary data analysis from a randomized controlled trial of topiramate for alcohol dependence treatment. We find that all three models provide a significantly better fit than the log-normal model, and there exists strong evidence for heteroscedasticity. We also compare the three models by the likelihood ratio tests for non-nested hypotheses (Vuong(3)). The results suggest that the generalized gamma distribution provides the best fit, though no statistically significant differences are found in pairwise model comparisons. © The Author(s) 2012.

  7. Coordination of gaze and speech in communication between children with hearing impairment and normal-hearing peers.

    PubMed

    Sandgren, Olof; Andersson, Richard; van de Weijer, Joost; Hansson, Kristina; Sahlén, Birgitta

    2014-06-01

    To investigate gaze behavior during communication between children with hearing impairment (HI) and normal-hearing (NH) peers. Ten HI-NH and 10 NH-NH dyads performed a referential communication task requiring description of faces. During task performance, eye movements and speech were tracked. Using verbal event (questions, statements, back channeling, and silence) as the predictor variable, group characteristics in gaze behavior were expressed with Kaplan-Meier survival functions (estimating time to gaze-to-partner) and odds ratios (comparing number of verbal events with and without gaze-to-partner). Analyses compared the listeners in each dyad (HI: n = 10, mean age = 12;6 years, mean better ear pure-tone average = 33.0 dB HL; NH: n = 10, mean age = 13;7 years). Log-rank tests revealed significant group differences in survival distributions for all verbal events, reflecting a higher probability of gaze to the partner's face for participants with HI. Expressed as odds ratios (OR), participants with HI displayed greater odds for gaze-to-partner (ORs ranging between 1.2 and 2.1) during all verbal events. The results show an increased probability for listeners with HI to gaze at the speaker's face in association with verbal events. Several explanations for the finding are possible, and implications for further research are discussed.

  8. Design and characterization of a cough simulator.

    PubMed

    Zhang, Bo; Zhu, Chao; Ji, Zhiming; Lin, Chao-Hsin

    2017-02-23

    Expiratory droplets from human coughing have always been considered as potential carriers of pathogens, responsible for respiratory infectious disease transmission. To study the transmission of disease by human coughing, a transient repeatable cough simulator has been designed and built. Cough droplets are generated by different mechanisms, such as the breaking of mucus, condensation and high-speed atomization from different depths of the respiratory tract. These mechanisms in coughing produce droplets of different sizes, represented by a bimodal distribution of 'fine' and 'coarse' droplets. A cough simulator is hence designed to generate transient sprays with such bimodal characteristics. It consists of a pressurized gas tank, a nebulizer and an ejector, connected in series, which are controlled by computerized solenoid valves. The bimodal droplet size distribution is characterized for the coarse droplets and fine droplets, by fibrous collection and laser diffraction, respectively. The measured size distributions of coarse and fine droplets are reasonably represented by the Rosin-Rammler and log-normal distributions in probability density function, which leads to a bimodal distribution. To assess the hydrodynamic consequences of coughing including droplet vaporization and polydispersion, a Lagrangian model of droplet trajectories is established, with its ambient flow field predetermined from a computational fluid dynamics simulation.

  9. Characterizing Topology of Probabilistic Biological Networks.

    PubMed

    Todor, Andrei; Dobra, Alin; Kahveci, Tamer

    2013-09-06

    Biological interactions are often uncertain events, that may or may not take place with some probability. Existing studies analyze the degree distribution of biological networks by assuming that all the given interactions take place under all circumstances. This strong and often incorrect assumption can lead to misleading results. Here, we address this problem and develop a sound mathematical basis to characterize networks in the presence of uncertain interactions. We develop a method that accurately describes the degree distribution of such networks. We also extend our method to accurately compute the joint degree distributions of node pairs connected by edges. The number of possible network topologies grows exponentially with the number of uncertain interactions. However, the mathematical model we develop allows us to compute these degree distributions in polynomial time in the number of interactions. It also helps us find an adequate mathematical model using maximum likelihood estimation. Our results demonstrate that power law and log-normal models best describe degree distributions for probabilistic networks. The inverse correlation of degrees of neighboring nodes shows that, in probabilistic networks, nodes with large number of interactions prefer to interact with those with small number of interactions more frequently than expected.

  10. 43 CFR 2812.0-6 - Statement of policy.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... the O. and C. lands presents peculiar problems of management which require for their solution the... significant part by the cost of transporting the logs to the mill. Where there is an existing road which is... capacity to accommodate the probable normal requirements both of the applicant and of the Government and...

  11. Bayesian alternative to the ISO-GUM's use of the Welch Satterthwaite formula

    NASA Astrophysics Data System (ADS)

    Kacker, Raghu N.

    2006-02-01

    In certain disciplines, uncertainty is traditionally expressed as an interval about an estimate for the value of the measurand. Development of such uncertainty intervals with a stated coverage probability based on the International Organization for Standardization (ISO) Guide to the Expression of Uncertainty in Measurement (GUM) requires a description of the probability distribution for the value of the measurand. The ISO-GUM propagates the estimates and their associated standard uncertainties for various input quantities through a linear approximation of the measurement equation to determine an estimate and its associated standard uncertainty for the value of the measurand. This procedure does not yield a probability distribution for the value of the measurand. The ISO-GUM suggests that under certain conditions motivated by the central limit theorem the distribution for the value of the measurand may be approximated by a scaled-and-shifted t-distribution with effective degrees of freedom obtained from the Welch-Satterthwaite (W-S) formula. The approximate t-distribution may then be used to develop an uncertainty interval with a stated coverage probability for the value of the measurand. We propose an approximate normal distribution based on a Bayesian uncertainty as an alternative to the t-distribution based on the W-S formula. A benefit of the approximate normal distribution based on a Bayesian uncertainty is that it greatly simplifies the expression of uncertainty by eliminating altogether the need for calculating effective degrees of freedom from the W-S formula. In the special case where the measurand is the difference between two means, each evaluated from statistical analyses of independent normally distributed measurements with unknown and possibly unequal variances, the probability distribution for the value of the measurand is known to be a Behrens-Fisher distribution. We compare the performance of the approximate normal distribution based on a Bayesian uncertainty and the approximate t-distribution based on the W-S formula with respect to the Behrens-Fisher distribution. The approximate normal distribution is simpler and better in this case. A thorough investigation of the relative performance of the two approximate distributions would require comparison for a range of measurement equations by numerical methods.

  12. Possible Statistics of Two Coupled Random Fields: Application to Passive Scalar

    NASA Technical Reports Server (NTRS)

    Dubrulle, B.; He, Guo-Wei; Bushnell, Dennis M. (Technical Monitor)

    2000-01-01

    We use the relativity postulate of scale invariance to derive the similarity transformations between two coupled scale-invariant random elds at different scales. We nd the equations leading to the scaling exponents. This formulation is applied to the case of passive scalars advected i) by a random Gaussian velocity field; and ii) by a turbulent velocity field. In the Gaussian case, we show that the passive scalar increments follow a log-Levy distribution generalizing Kraichnan's solution and, in an appropriate limit, a log-normal distribution. In the turbulent case, we show that when the velocity increments follow a log-Poisson statistics, the passive scalar increments follow a statistics close to log-Poisson. This result explains the experimental observations of Ruiz et al. about the temperature increments.

  13. Simulations of large acoustic scintillations in the straits of Florida.

    PubMed

    Tang, Xin; Tappert, F D; Creamer, Dennis B

    2006-12-01

    Using a full-wave acoustic model, Monte Carlo numerical studies of intensity fluctuations in a realistic shallow water environment that simulates the Straits of Florida, including internal wave fluctuations and bottom roughness, have been performed. Results show that the sound intensity at distant receivers scintillates dramatically. The acoustic scintillation index SI increases rapidly with propagation range and is significantly greater than unity at ranges beyond about 10 km. This result supports a theoretical prediction by one of the authors. Statistical analyses show that the distribution of intensity of the random wave field saturates to the expected Rayleigh distribution with SI= 1 at short range due to multipath interference effects, and then SI continues to increase to large values. This effect, which is denoted supersaturation, is universal at long ranges in waveguides having lossy boundaries (where there is differential mode attenuation). The intensity distribution approaches a log-normal distribution to an excellent approximation; it may not be a universal distribution and comparison is also made to a K distribution. The long tails of the log-normal distribution cause "acoustic intermittency" in which very high, but rare, intensities occur.

  14. Lognormal Distribution of Cellular Uptake of Radioactivity: Statistical Analysis of α-Particle Track Autoradiography

    PubMed Central

    Neti, Prasad V.S.V.; Howell, Roger W.

    2010-01-01

    Recently, the distribution of radioactivity among a population of cells labeled with 210Po was shown to be well described by a log-normal (LN) distribution function (J Nucl Med. 2006;47:1049–1058) with the aid of autoradiography. To ascertain the influence of Poisson statistics on the interpretation of the autoradiographic data, the present work reports on a detailed statistical analysis of these earlier data. Methods The measured distributions of α-particle tracks per cell were subjected to statistical tests with Poisson, LN, and Poisson-lognormal (P-LN) models. Results The LN distribution function best describes the distribution of radioactivity among cell populations exposed to 0.52 and 3.8 kBq/mL of 210Po-citrate. When cells were exposed to 67 kBq/mL, the P-LN distribution function gave a better fit; however, the underlying activity distribution remained log-normal. Conclusion The present analysis generally provides further support for the use of LN distributions to describe the cellular uptake of radioactivity. Care should be exercised when analyzing autoradiographic data on activity distributions to ensure that Poisson processes do not distort the underlying LN distribution. PMID:18483086

  15. CAN'T MISS--conquer any number task by making important statistics simple. Part 2. Probability, populations, samples, and normal distributions.

    PubMed

    Hansen, John P

    2003-01-01

    Healthcare quality improvement professionals need to understand and use inferential statistics to interpret sample data from their organizations. In quality improvement and healthcare research studies all the data from a population often are not available, so investigators take samples and make inferences about the population by using inferential statistics. This three-part series will give readers an understanding of the concepts of inferential statistics as well as the specific tools for calculating confidence intervals for samples of data. This article, Part 2, describes probability, populations, and samples. The uses of descriptive and inferential statistics are outlined. The article also discusses the properties and probability of normal distributions, including the standard normal distribution.

  16. Normal probability plots with confidence.

    PubMed

    Chantarangsi, Wanpen; Liu, Wei; Bretz, Frank; Kiatsupaibul, Seksan; Hayter, Anthony J; Wan, Fang

    2015-01-01

    Normal probability plots are widely used as a statistical tool for assessing whether an observed simple random sample is drawn from a normally distributed population. The users, however, have to judge subjectively, if no objective rule is provided, whether the plotted points fall close to a straight line. In this paper, we focus on how a normal probability plot can be augmented by intervals for all the points so that, if the population distribution is normal, then all the points should fall into the corresponding intervals simultaneously with probability 1-α. These simultaneous 1-α probability intervals provide therefore an objective mean to judge whether the plotted points fall close to the straight line: the plotted points fall close to the straight line if and only if all the points fall into the corresponding intervals. The powers of several normal probability plot based (graphical) tests and the most popular nongraphical Anderson-Darling and Shapiro-Wilk tests are compared by simulation. Based on this comparison, recommendations are given in Section 3 on which graphical tests should be used in what circumstances. An example is provided to illustrate the methods. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. A preliminary analysis of quantifying computer security vulnerability data in "the wild"

    NASA Astrophysics Data System (ADS)

    Farris, Katheryn A.; McNamara, Sean R.; Goldstein, Adam; Cybenko, George

    2016-05-01

    A system of computers, networks and software has some level of vulnerability exposure that puts it at risk to criminal hackers. Presently, most vulnerability research uses data from software vendors, and the National Vulnerability Database (NVD). We propose an alternative path forward through grounding our analysis in data from the operational information security community, i.e. vulnerability data from "the wild". In this paper, we propose a vulnerability data parsing algorithm and an in-depth univariate and multivariate analysis of the vulnerability arrival and deletion process (also referred to as the vulnerability birth-death process). We find that vulnerability arrivals are best characterized by the log-normal distribution and vulnerability deletions are best characterized by the exponential distribution. These distributions can serve as prior probabilities for future Bayesian analysis. We also find that over 22% of the deleted vulnerability data have a rate of zero, and that the arrival vulnerability data is always greater than zero. Finally, we quantify and visualize the dependencies between vulnerability arrivals and deletions through a bivariate scatterplot and statistical observations.

  18. Generating log-normal mock catalog of galaxies in redshift space

    NASA Astrophysics Data System (ADS)

    Agrawal, Aniket; Makiya, Ryu; Chiang, Chi-Ting; Jeong, Donghui; Saito, Shun; Komatsu, Eiichiro

    2017-10-01

    We present a public code to generate a mock galaxy catalog in redshift space assuming a log-normal probability density function (PDF) of galaxy and matter density fields. We draw galaxies by Poisson-sampling the log-normal field, and calculate the velocity field from the linearised continuity equation of matter fields, assuming zero vorticity. This procedure yields a PDF of the pairwise velocity fields that is qualitatively similar to that of N-body simulations. We check fidelity of the catalog, showing that the measured two-point correlation function and power spectrum in real space agree with the input precisely. We find that a linear bias relation in the power spectrum does not guarantee a linear bias relation in the density contrasts, leading to a cross-correlation coefficient of matter and galaxies deviating from unity on small scales. We also find that linearising the Jacobian of the real-to-redshift space mapping provides a poor model for the two-point statistics in redshift space. That is, non-linear redshift-space distortion is dominated by non-linearity in the Jacobian. The power spectrum in redshift space shows a damping on small scales that is qualitatively similar to that of the well-known Fingers-of-God (FoG) effect due to random velocities, except that the log-normal mock does not include random velocities. This damping is a consequence of non-linearity in the Jacobian, and thus attributing the damping of the power spectrum solely to FoG, as commonly done in the literature, is misleading.

  19. Foraging patterns in online searches.

    PubMed

    Wang, Xiangwen; Pleimling, Michel

    2017-03-01

    Nowadays online searches are undeniably the most common form of information gathering, as witnessed by billions of clicks generated each day on search engines. In this work we describe online searches as foraging processes that take place on the semi-infinite line. Using a variety of quantities like probability distributions and complementary cumulative distribution functions of step length and waiting time as well as mean square displacements and entropies, we analyze three different click-through logs that contain the detailed information of millions of queries submitted to search engines. Notable differences between the different logs reveal an increased efficiency of the search engines. In the language of foraging, the newer logs indicate that online searches overwhelmingly yield local searches (i.e., on one page of links provided by the search engines), whereas for the older logs the foraging processes are a combination of local searches and relocation phases that are power law distributed. Our investigation of click logs of search engines therefore highlights the presence of intermittent search processes (where phases of local explorations are separated by power law distributed relocation jumps) in online searches. It follows that good search engines enable the users to find the information they are looking for through a local exploration of a single page with search results, whereas for poor search engine users are often forced to do a broader exploration of different pages.

  20. Foraging patterns in online searches

    NASA Astrophysics Data System (ADS)

    Wang, Xiangwen; Pleimling, Michel

    2017-03-01

    Nowadays online searches are undeniably the most common form of information gathering, as witnessed by billions of clicks generated each day on search engines. In this work we describe online searches as foraging processes that take place on the semi-infinite line. Using a variety of quantities like probability distributions and complementary cumulative distribution functions of step length and waiting time as well as mean square displacements and entropies, we analyze three different click-through logs that contain the detailed information of millions of queries submitted to search engines. Notable differences between the different logs reveal an increased efficiency of the search engines. In the language of foraging, the newer logs indicate that online searches overwhelmingly yield local searches (i.e., on one page of links provided by the search engines), whereas for the older logs the foraging processes are a combination of local searches and relocation phases that are power law distributed. Our investigation of click logs of search engines therefore highlights the presence of intermittent search processes (where phases of local explorations are separated by power law distributed relocation jumps) in online searches. It follows that good search engines enable the users to find the information they are looking for through a local exploration of a single page with search results, whereas for poor search engine users are often forced to do a broader exploration of different pages.

  1. Detection of Person Misfit in Computerized Adaptive Tests with Polytomous Items.

    ERIC Educational Resources Information Center

    van Krimpen-Stoop, Edith M. L. A.; Meijer, Rob R.

    2002-01-01

    Compared the nominal and empirical null distributions of the standardized log-likelihood statistic for polytomous items for paper-and-pencil (P&P) and computerized adaptive tests (CATs). Results show that the empirical distribution of the statistic differed from the assumed standard normal distribution for both P&P tests and CATs. Also…

  2. Maximum likelihood density modification by pattern recognition of structural motifs

    DOEpatents

    Terwilliger, Thomas C.

    2004-04-13

    An electron density for a crystallographic structure having protein regions and solvent regions is improved by maximizing the log likelihood of a set of structures factors {F.sub.h } using a local log-likelihood function: (x)+p(.rho.(x).vertline.SOLV)p.sub.SOLV (x)+p(.rho.(x).vertline.H)p.sub.H (x)], where p.sub.PROT (x) is the probability that x is in the protein region, p(.rho.(x).vertline.PROT) is the conditional probability for .rho.(x) given that x is in the protein region, and p.sub.SOLV (x) and p(.rho.(x).vertline.SOLV) are the corresponding quantities for the solvent region, p.sub.H (x) refers to the probability that there is a structural motif at a known location, with a known orientation, in the vicinity of the point x; and p(.rho.(x).vertline.H) is the probability distribution for electron density at this point given that the structural motif actually is present. One appropriate structural motif is a helical structure within the crystallographic structure.

  3. A modified weighted function method for parameter estimation of Pearson type three distribution

    NASA Astrophysics Data System (ADS)

    Liang, Zhongmin; Hu, Yiming; Li, Binquan; Yu, Zhongbo

    2014-04-01

    In this paper, an unconventional method called Modified Weighted Function (MWF) is presented for the conventional moment estimation of a probability distribution function. The aim of MWF is to estimate the coefficient of variation (CV) and coefficient of skewness (CS) from the original higher moment computations to the first-order moment calculations. The estimators for CV and CS of Pearson type three distribution function (PE3) were derived by weighting the moments of the distribution with two weight functions, which were constructed by combining two negative exponential-type functions. The selection of these weight functions was based on two considerations: (1) to relate weight functions to sample size in order to reflect the relationship between the quantity of sample information and the role of weight function and (2) to allocate more weights to data close to medium-tail positions in a sample series ranked in an ascending order. A Monte-Carlo experiment was conducted to simulate a large number of samples upon which statistical properties of MWF were investigated. For the PE3 parent distribution, results of MWF were compared to those of the original Weighted Function (WF) and Linear Moments (L-M). The results indicate that MWF was superior to WF and slightly better than L-M, in terms of statistical unbiasness and effectiveness. In addition, the robustness of MWF, WF, and L-M were compared by designing the Monte-Carlo experiment that samples are obtained from Log-Pearson type three distribution (LPE3), three parameter Log-Normal distribution (LN3), and Generalized Extreme Value distribution (GEV), respectively, but all used as samples from the PE3 distribution. The results show that in terms of statistical unbiasness, no one method possesses the absolutely overwhelming advantage among MWF, WF, and L-M, while in terms of statistical effectiveness, the MWF is superior to WF and L-M.

  4. Evaluation of the best fit distribution for partial duration series of daily rainfall in Madinah, western Saudi Arabia

    NASA Astrophysics Data System (ADS)

    Alahmadi, F.; Rahman, N. A.; Abdulrazzak, M.

    2014-09-01

    Rainfall frequency analysis is an essential tool for the design of water related infrastructure. It can be used to predict future flood magnitudes for a given magnitude and frequency of extreme rainfall events. This study analyses the application of rainfall partial duration series (PDS) in the vast growing urban Madinah city located in the western part of Saudi Arabia. Different statistical distributions were applied (i.e. Normal, Log Normal, Extreme Value type I, Generalized Extreme Value, Pearson Type III, Log Pearson Type III) and their distribution parameters were estimated using L-moments methods. Also, different selection criteria models are applied, e.g. Akaike Information Criterion (AIC), Corrected Akaike Information Criterion (AICc), Bayesian Information Criterion (BIC) and Anderson-Darling Criterion (ADC). The analysis indicated the advantage of Generalized Extreme Value as the best fit statistical distribution for Madinah partial duration daily rainfall series. The outcome of such an evaluation can contribute toward better design criteria for flood management, especially flood protection measures.

  5. Algae Tile Data: 2004-2007, BPA-51; Preliminary Report, October 28, 2008.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Holderman, Charles

    Multiple files containing 2004 through 2007 Tile Chlorophyll data for the Kootenai River sites designated as: KR1, KR2, KR3, KR4 (Downriver) and KR6, KR7, KR9, KR9.1, KR10, KR11, KR12, KR13, KR14 (Upriver) were received by SCS. For a complete description of the sites covered, please refer to http://ktoi.scsnetw.com. To maintain consistency with the previous SCS algae reports, all analyses were carried out separately for the Upriver and Downriver categories, as defined in the aforementioned paragraph. The Upriver designation, however, now includes three additional sites, KR11, KR12, and the nutrient addition site, KR9.1. Summary statistics and information on the four responses,more » chlorophyll a, chlorophyll a Accrual Rate, Total Chlorophyll, and Total Chlorophyll Accrual Rate are presented in Print Out 2. Computations were carried out separately for each river position (Upriver and Downriver) and year. For example, the Downriver position in 2004 showed an average Chlorophyll a level of 25.5 mg with a standard deviation of 21.4 and minimum and maximum values of 3.1 and 196 mg, respectively. The Upriver data in 2004 showed a lower overall average chlorophyll a level at 2.23 mg with a lower standard deviation (3.6) and minimum and maximum values of (0.13 and 28.7, respectively). A more comprehensive summary of each variable and position is given in Print Out 3. This lists the information above as well as other summary information such as the variance, standard error, various percentiles and extreme values. Using the 2004 Downriver Chlorophyll a as an example again, the variance of this data was 459.3 and the standard error of the mean was 1.55. The median value or 50th percentile was 21.3, meaning 50% of the data fell above and below this value. It should be noted that this value is somewhat different than the mean of 25.5. This is an indication that the frequency distribution of the data is not symmetrical (skewed). The skewness statistic, listed as part of the first section of each analysis, quantifies this. In a symmetric distribution, such as a Normal distribution, the skewness value would be 0. The tile chlorophyll data, however, shows larger values. Chlorophyll a, in the 2004 Downriver example, has a skewness statistic of 3.54, which is quite high. In the last section of the summary analysis, the stem and leaf plot graphically demonstrates the asymmetry, showing most of the data centered around 25 with a large value at 196. The final plot is referred to as a normal probability plot and graphically compares the data to a theoretical normal distribution. For chlorophyll a, the data (asterisks) deviate substantially from the theoretical normal distribution (diagonal reference line of pluses), indicating that the data is non-normal. Other response variables in both the Downriver and Upriver categories also indicated skewed distributions. Because the sample size and mean comparison procedures below require symmetrical, normally distributed data, each response in the data set was logarithmically transformed. The logarithmic transformation, in this case, can help mitigate skewness problems. The summary statistics for the four transformed responses (log-ChlorA, log-TotChlor, and log-accrual ) are given in Print Out 4. For the 2004 Downriver Chlorophyll a data, the logarithmic transformation reduced the skewness value to -0.36 and produced a more bell-shaped symmetric frequency distribution. Similar improvements are shown for the remaining variables and river categories. Hence, all subsequent analyses given below are based on logarithmic transformations of the original responses.« less

  6. Gaussian Quadrature is an efficient method for the back-transformation in estimating the usual intake distribution when assessing dietary exposure.

    PubMed

    Dekkers, A L M; Slob, W

    2012-10-01

    In dietary exposure assessment, statistical methods exist for estimating the usual intake distribution from daily intake data. These methods transform the dietary intake data to normal observations, eliminate the within-person variance, and then back-transform the data to the original scale. We propose Gaussian Quadrature (GQ), a numerical integration method, as an efficient way of back-transformation. We compare GQ with six published methods. One method uses a log-transformation, while the other methods, including GQ, use a Box-Cox transformation. This study shows that, for various parameter choices, the methods with a Box-Cox transformation estimate the theoretical usual intake distributions quite well, although one method, a Taylor approximation, is less accurate. Two applications--on folate intake and fruit consumption--confirmed these results. In one extreme case, some methods, including GQ, could not be applied for low percentiles. We solved this problem by modifying GQ. One method is based on the assumption that the daily intakes are log-normally distributed. Even if this condition is not fulfilled, the log-transformation performs well as long as the within-individual variance is small compared to the mean. We conclude that the modified GQ is an efficient, fast and accurate method for estimating the usual intake distribution. Copyright © 2012 Elsevier Ltd. All rights reserved.

  7. Resistance of virus to extinction on bottleneck passages: study of a decaying and fluctuating pattern of fitness loss

    NASA Technical Reports Server (NTRS)

    Lazaro, Ester; Escarmis, Cristina; Perez-Mercader, Juan; Manrubia, Susanna C.; Domingo, Esteban

    2003-01-01

    RNA viruses display high mutation rates and their populations replicate as dynamic and complex mutant distributions, termed viral quasispecies. Repeated genetic bottlenecks, which experimentally are carried out through serial plaque-to-plaque transfers of the virus, lead to fitness decrease (measured here as diminished capacity to produce infectious progeny). Here we report an analysis of fitness evolution of several low fitness foot-and-mouth disease virus clones subjected to 50 plaque-to-plaque transfers. Unexpectedly, fitness decrease, rather than being continuous and monotonic, displayed a fluctuating pattern, which was influenced by both the virus and the state of the host cell as shown by effects of recent cell passage history. The amplitude of the fluctuations increased as fitness decreased, resulting in a remarkable resistance of virus to extinction. Whereas the frequency distribution of fitness in control (independent) experiments follows a log-normal distribution, the probability of fitness values in the evolving bottlenecked populations fitted a Weibull distribution. We suggest that multiple functions of viral genomic RNA and its encoded proteins, subjected to high mutational pressure, interact with cellular components to produce this nontrivial, fluctuating pattern.

  8. Money-center structures in dynamic banking systems

    NASA Astrophysics Data System (ADS)

    Li, Shouwei; Zhang, Minghui

    2016-10-01

    In this paper, we propose a dynamic model for banking systems based on the description of balance sheets. It generates some features identified through empirical analysis. Through simulation analysis of the model, we find that banking systems have the feature of money-center structures, that bank asset distributions are power-law distributions, and that contract size distributions are log-normal distributions.

  9. Effects of 1.9 MeV monoenergetic neutrons on Vicia faba chromosomes: microdosimetric considerations.

    PubMed

    Geard, C R

    1980-01-01

    Aerated Vicia faba root meristems were irradiated with 1.9 MeV monoenergetic neutrons. This source of neutrons optimally provides one class of particles (recoil protons) with ranges able to traverse cell nuclei at moderate to high-LET. The volumes of the Vicia faba nuclei were log-normally distributed with a mean of 1100 micrometer3. The yield of chromatid-type aberrations was linear against absorbed dose and near-constant over 5 collection periods (2-12 h), after irradiation. Energy deposition events (recoil protons) determined by microdosimetry were related to cytological changes with the finding that 19% of incident recoil protons initiate visible changes in Vicia faba chromosomes. It is probable that a substantial fraction of recoil proton track length and deposited energy is in insensitive (non-DNA containing) portions of the nuclear volume.

  10. Variance stabilization and normalization for one-color microarray data using a data-driven multiscale approach.

    PubMed

    Motakis, E S; Nason, G P; Fryzlewicz, P; Rutter, G A

    2006-10-15

    Many standard statistical techniques are effective on data that are normally distributed with constant variance. Microarray data typically violate these assumptions since they come from non-Gaussian distributions with a non-trivial mean-variance relationship. Several methods have been proposed that transform microarray data to stabilize variance and draw its distribution towards the Gaussian. Some methods, such as log or generalized log, rely on an underlying model for the data. Others, such as the spread-versus-level plot, do not. We propose an alternative data-driven multiscale approach, called the Data-Driven Haar-Fisz for microarrays (DDHFm) with replicates. DDHFm has the advantage of being 'distribution-free' in the sense that no parametric model for the underlying microarray data is required to be specified or estimated; hence, DDHFm can be applied very generally, not just to microarray data. DDHFm achieves very good variance stabilization of microarray data with replicates and produces transformed intensities that are approximately normally distributed. Simulation studies show that it performs better than other existing methods. Application of DDHFm to real one-color cDNA data validates these results. The R package of the Data-Driven Haar-Fisz transform (DDHFm) for microarrays is available in Bioconductor and CRAN.

  11. Single-trial log transformation is optimal in frequency analysis of resting EEG alpha.

    PubMed

    Smulders, Fren T Y; Ten Oever, Sanne; Donkers, Franc C L; Quaedflieg, Conny W E M; van de Ven, Vincent

    2018-02-01

    The appropriate definition and scaling of the magnitude of electroencephalogram (EEG) oscillations is an underdeveloped area. The aim of this study was to optimize the analysis of resting EEG alpha magnitude, focusing on alpha peak frequency and nonlinear transformation of alpha power. A family of nonlinear transforms, Box-Cox transforms, were applied to find the transform that (a) maximized a non-disputed effect: the increase in alpha magnitude when the eyes are closed (Berger effect), and (b) made the distribution of alpha magnitude closest to normal across epochs within each participant, or across participants. The transformations were performed either at the single epoch level or at the epoch-average level. Alpha peak frequency showed large individual differences, yet good correspondence between various ways to estimate it in 2 min of eyes-closed and 2 min of eyes-open resting EEG data. Both alpha magnitude and the Berger effect were larger for individual alpha than for a generic (8-12 Hz) alpha band. The log-transform on single epochs (a) maximized the t-value of the contrast between the eyes-open and eyes-closed conditions when tested within each participant, and (b) rendered near-normally distributed alpha power across epochs and participants, thereby making further transformation of epoch averages superfluous. The results suggest that the log-normal distribution is a fundamental property of variations in alpha power across time in the order of seconds. Moreover, effects on alpha power appear to be multiplicative rather than additive. These findings support the use of the log-transform on single epochs to achieve appropriate scaling of alpha magnitude. © 2018 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.

  12. Exponential series approaches for nonparametric graphical models

    NASA Astrophysics Data System (ADS)

    Janofsky, Eric

    Markov Random Fields (MRFs) or undirected graphical models are parsimonious representations of joint probability distributions. This thesis studies high-dimensional, continuous-valued pairwise Markov Random Fields. We are particularly interested in approximating pairwise densities whose logarithm belongs to a Sobolev space. For this problem we propose the method of exponential series which approximates the log density by a finite-dimensional exponential family with the number of sufficient statistics increasing with the sample size. We consider two approaches to estimating these models. The first is regularized maximum likelihood. This involves optimizing the sum of the log-likelihood of the data and a sparsity-inducing regularizer. We then propose a variational approximation to the likelihood based on tree-reweighted, nonparametric message passing. This approximation allows for upper bounds on risk estimates, leverages parallelization and is scalable to densities on hundreds of nodes. We show how the regularized variational MLE may be estimated using a proximal gradient algorithm. We then consider estimation using regularized score matching. This approach uses an alternative scoring rule to the log-likelihood, which obviates the need to compute the normalizing constant of the distribution. For general continuous-valued exponential families, we provide parameter and edge consistency results. As a special case we detail a new approach to sparse precision matrix estimation which has statistical performance competitive with the graphical lasso and computational performance competitive with the state-of-the-art glasso algorithm. We then describe results for model selection in the nonparametric pairwise model using exponential series. The regularized score matching problem is shown to be a convex program; we provide scalable algorithms based on consensus alternating direction method of multipliers (ADMM) and coordinate-wise descent. We use simulations to compare our method to others in the literature as well as the aforementioned TRW estimator.

  13. Novel bayes factors that capture expert uncertainty in prior density specification in genetic association studies.

    PubMed

    Spencer, Amy V; Cox, Angela; Lin, Wei-Yu; Easton, Douglas F; Michailidou, Kyriaki; Walters, Kevin

    2015-05-01

    Bayes factors (BFs) are becoming increasingly important tools in genetic association studies, partly because they provide a natural framework for including prior information. The Wakefield BF (WBF) approximation is easy to calculate and assumes a normal prior on the log odds ratio (logOR) with a mean of zero. However, the prior variance (W) must be specified. Because of the potentially high sensitivity of the WBF to the choice of W, we propose several new BF approximations with logOR ∼N(0,W), but allow W to take a probability distribution rather than a fixed value. We provide several prior distributions for W which lead to BFs that can be calculated easily in freely available software packages. These priors allow a wide range of densities for W and provide considerable flexibility. We examine some properties of the priors and BFs and show how to determine the most appropriate prior based on elicited quantiles of the prior odds ratio (OR). We show by simulation that our novel BFs have superior true-positive rates at low false-positive rates compared to those from both P-value and WBF analyses across a range of sample sizes and ORs. We give an example of utilizing our BFs to fine-map the CASP8 region using genotype data on approximately 46,000 breast cancer case and 43,000 healthy control samples from the Collaborative Oncological Gene-environment Study (COGS) Consortium, and compare the single-nucleotide polymorphism ranks to those obtained using WBFs and P-values from univariate logistic regression. © 2015 The Authors. *Genetic Epidemiology published by Wiley Periodicals, Inc.

  14. 2D modelling of the light distribution of early-type galaxies in a volume-limited sample - II. Results for real galaxies

    NASA Astrophysics Data System (ADS)

    D'Onofrio, M.

    2001-10-01

    In this paper we analyse the results of the two-dimensional (2D) fit of the light distribution of 73 early-type galaxies belonging to the Virgo and Fornax clusters, a sample volume- and magnitude-limited down to MB=-17.3, and highly homogeneous. In our previous paper (Paper I) we have presented the adopted 2D models of the surface-brightness distribution - namely the r1/n and (r1/n+exp) models - we have discussed the main sources of error affecting the structural parameters, and we have tested the ability of the chosen minimization algorithm (MINUIT) in determining the fitting parameters using a sample of artificial galaxies. We show that, with the exception of 11 low-luminosity E galaxies, the best fit of the real galaxy sample is always achieved with the two-component (r1/n+exp) model. The improvement in the χ2 due to the addition of the exponential component is found to be statistically significant. The best fit is obtained with the exponent n of the generalized r1/n Sersic law different from the classical de Vaucouleurs value of 4. Nearly 42 per cent of the sample have n<2, suggesting the presence of exponential `bulges' also in early-type galaxies. 20 luminous E galaxies are fitted by the two-component model, with a small central exponential structure (`disc') and an outer big spheroid with n>4. We believe that this is probably due to their resolved core. The resulting scalelengths Rh and Re of each component peak approximately at ~1 and ~2kpc, respectively, although with different variances in their distributions. The ratio Re/Rh peaks at ~0.5, a value typical for normal lenticular galaxies. The first component, represented by the r1/n law, is probably made of two distinct families, `ordinary' and `bright', on the basis of their distribution in the μe-log(Re) plane, a result already suggested by Capaccioli, Caon and D'Onofrio. The bulges of spirals and S0 galaxies belong to the `ordinary' family, while the large spheroids of luminous E galaxies form the `bright' family. The second component, represented by the exponential law, also shows a wide distribution in the μ0c-log(Rh) plane. Small discs (or cores) have short scalelengths and high central surface brightness, while normal lenticulars and spiral galaxies generally have scalelengths higher than 0.5kpc and central surface brightness brighter than 20magarcsec-2 (in the B band). The scalelengths Re and Rh of the `bulge' and `disc' components are probably correlated, indicating that a self-regulating mechanism of galaxy formation may be at work. Alternatively, two regions of the Re-Rh plane are avoided by galaxies due to dynamical instability effects. The bulge-to-disc (B/D) ratio seems to vary uniformly along the Hubble sequence, going from late-type spirals to E galaxies. At the end of the sequence the ratio between the large spheroidal component and the small inner core can reach B/D~100.

  15. Methods and results of peak-flow frequency analyses for streamgages in and bordering Minnesota, through water year 2011

    USGS Publications Warehouse

    Kessler, Erich W.; Lorenz, David L.; Sanocki, Christopher A.

    2013-01-01

    Peak-flow frequency analyses were completed for 409 streamgages in and bordering Minnesota having at least 10 systematic peak flows through water year 2011. Selected annual exceedance probabilities were determined by fitting a log-Pearson type III probability distribution to the recorded annual peak flows. A detailed explanation of the methods that were used to determine the annual exceedance probabilities, the historical period, acceptable low outliers, and analysis method for each streamgage are presented. The final results of the analyses are presented.

  16. A spatial scan statistic for survival data based on Weibull distribution.

    PubMed

    Bhatt, Vijaya; Tiwari, Neeraj

    2014-05-20

    The spatial scan statistic has been developed as a geographical cluster detection analysis tool for different types of data sets such as Bernoulli, Poisson, ordinal, normal and exponential. We propose a scan statistic for survival data based on Weibull distribution. It may also be used for other survival distributions, such as exponential, gamma, and log normal. The proposed method is applied on the survival data of tuberculosis patients for the years 2004-2005 in Nainital district of Uttarakhand, India. Simulation studies reveal that the proposed method performs well for different survival distribution functions. Copyright © 2013 John Wiley & Sons, Ltd.

  17. Generalized t-statistic for two-group classification.

    PubMed

    Komori, Osamu; Eguchi, Shinto; Copas, John B

    2015-06-01

    In the classic discriminant model of two multivariate normal distributions with equal variance matrices, the linear discriminant function is optimal both in terms of the log likelihood ratio and in terms of maximizing the standardized difference (the t-statistic) between the means of the two distributions. In a typical case-control study, normality may be sensible for the control sample but heterogeneity and uncertainty in diagnosis may suggest that a more flexible model is needed for the cases. We generalize the t-statistic approach by finding the linear function which maximizes a standardized difference but with data from one of the groups (the cases) filtered by a possibly nonlinear function U. We study conditions for consistency of the method and find the function U which is optimal in the sense of asymptotic efficiency. Optimality may also extend to other measures of discriminatory efficiency such as the area under the receiver operating characteristic curve. The optimal function U depends on a scalar probability density function which can be estimated non-parametrically using a standard numerical algorithm. A lasso-like version for variable selection is implemented by adding L1-regularization to the generalized t-statistic. Two microarray data sets in the study of asthma and various cancers are used as motivating examples. © 2014, The International Biometric Society.

  18. Generating log-normal mock catalog of galaxies in redshift space

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Agrawal, Aniket; Makiya, Ryu; Saito, Shun

    We present a public code to generate a mock galaxy catalog in redshift space assuming a log-normal probability density function (PDF) of galaxy and matter density fields. We draw galaxies by Poisson-sampling the log-normal field, and calculate the velocity field from the linearised continuity equation of matter fields, assuming zero vorticity. This procedure yields a PDF of the pairwise velocity fields that is qualitatively similar to that of N-body simulations. We check fidelity of the catalog, showing that the measured two-point correlation function and power spectrum in real space agree with the input precisely. We find that a linear biasmore » relation in the power spectrum does not guarantee a linear bias relation in the density contrasts, leading to a cross-correlation coefficient of matter and galaxies deviating from unity on small scales. We also find that linearising the Jacobian of the real-to-redshift space mapping provides a poor model for the two-point statistics in redshift space. That is, non-linear redshift-space distortion is dominated by non-linearity in the Jacobian. The power spectrum in redshift space shows a damping on small scales that is qualitatively similar to that of the well-known Fingers-of-God (FoG) effect due to random velocities, except that the log-normal mock does not include random velocities. This damping is a consequence of non-linearity in the Jacobian, and thus attributing the damping of the power spectrum solely to FoG, as commonly done in the literature, is misleading.« less

  19. Parametric modelling of cost data in medical studies.

    PubMed

    Nixon, R M; Thompson, S G

    2004-04-30

    The cost of medical resources used is often recorded for each patient in clinical studies in order to inform decision-making. Although cost data are generally skewed to the right, interest is in making inferences about the population mean cost. Common methods for non-normal data, such as data transformation, assuming asymptotic normality of the sample mean or non-parametric bootstrapping, are not ideal. This paper describes possible parametric models for analysing cost data. Four example data sets are considered, which have different sample sizes and degrees of skewness. Normal, gamma, log-normal, and log-logistic distributions are fitted, together with three-parameter versions of the latter three distributions. Maximum likelihood estimates of the population mean are found; confidence intervals are derived by a parametric BC(a) bootstrap and checked by MCMC methods. Differences between model fits and inferences are explored.Skewed parametric distributions fit cost data better than the normal distribution, and should in principle be preferred for estimating the population mean cost. However for some data sets, we find that models that fit badly can give similar inferences to those that fit well. Conversely, particularly when sample sizes are not large, different parametric models that fit the data equally well can lead to substantially different inferences. We conclude that inferences are sensitive to choice of statistical model, which itself can remain uncertain unless there is enough data to model the tail of the distribution accurately. Investigating the sensitivity of conclusions to choice of model should thus be an essential component of analysing cost data in practice. Copyright 2004 John Wiley & Sons, Ltd.

  20. Bidisperse and polydisperse suspension rheology at large solid fraction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pednekar, Sidhant; Chun, Jaehun; Morris, Jeffrey F.

    At the same solid volume fraction, bidisperse and polydisperse suspensions display lower viscosities, and weaker normal stress response, compared to monodisperse suspensions. The reduction of viscosity associated with size distribution can be explained by an increase of the maximum flowable, or jamming, solid fraction. In this work, concentrated or "dense" suspensions are simulated under strong shearing, where thermal motion and repulsive forces are negligible, but we allow for particle contact with a mild frictional interaction with interparticle friction coefficient of 0.2. Aspects of bidisperse suspension rheology are first revisited to establish that the approach reproduces established trends; the study ofmore » bidisperse suspensions at size ratios of large to small particle radii (2 to 4) shows that a minimum in the viscosity occurs for zeta slightly above 0.5, where zeta=phi_{large}/phi is the fraction of the total solid volume occupied by the large particles. The simple shear flows of polydisperse suspensions with truncated normal and log normal size distributions, and bidisperse suspensions which are statistically equivalent with these polydisperse cases up to third moment of the size distribution, are simulated and the rheologies are extracted. Prior work shows that such distributions with equivalent low-order moments have similar phi_{m}, and the rheological behaviors of normal, log normal and bidisperse cases are shown to be in close agreement for a wide range of standard deviation in particle size, with standard correlations which are functionally dependent on phi/phi_{m} providing excellent agreement with the rheology found in simulation. The close agreement of both viscosity and normal stress response between bi- and polydisperse suspensions demonstrates the controlling in influence of the maximum packing fraction in noncolloidal suspensions. Microstructural investigations and the stress distribution according to particle size are also presented.« less

  1. Randomized path optimization for thevMitigated counter detection of UAVS

    DTIC Science & Technology

    2017-06-01

    using Bayesian filtering . The KL divergence is used to compare the probability density of aircraft termination to a normal distribution around the...Bayesian filtering . The KL divergence is used to compare the probability density of aircraft termination to a normal distribution around the true terminal...algorithm’s success. A recursive Bayesian filtering scheme is used to assimilate noisy measurements of the UAVs position to predict its terminal location. We

  2. Channel characterization and empirical model for ergodic capacity of free-space optical communication link

    NASA Astrophysics Data System (ADS)

    Alimi, Isiaka; Shahpari, Ali; Ribeiro, Vítor; Sousa, Artur; Monteiro, Paulo; Teixeira, António

    2017-05-01

    In this paper, we present experimental results on channel characterization of single input single output (SISO) free-space optical (FSO) communication link that is based on channel measurements. The histograms of the FSO channel samples and the log-normal distribution fittings are presented along with the measured scintillation index. Furthermore, we extend our studies to diversity schemes and propose a closed-form expression for determining ergodic channel capacity of multiple input multiple output (MIMO) FSO communication systems over atmospheric turbulence fading channels. The proposed empirical model is based on SISO FSO channel characterization. Also, the scintillation effects on the system performance are analyzed and results for different turbulence conditions are presented. Moreover, we observed that the histograms of the FSO channel samples that we collected from a 1548.51 nm link have good fits with log-normal distributions and the proposed model for MIMO FSO channel capacity is in conformity with the simulation results in terms of normalized mean-square error (NMSE).

  3. On the scaling of the distribution of daily price fluctuations in the Mexican financial market index

    NASA Astrophysics Data System (ADS)

    Alfonso, Léster; Mansilla, Ricardo; Terrero-Escalante, César A.

    2012-05-01

    In this paper, a statistical analysis of log-return fluctuations of the IPC, the Mexican Stock Market Index is presented. A sample of daily data covering the period from 04/09/2000-04/09/2010 was analyzed, and fitted to different distributions. Tests of the goodness of fit were performed in order to quantitatively asses the quality of the estimation. Special attention was paid to the impact of the size of the sample on the estimated decay of the distributions tail. In this study a forceful rejection of normality was obtained. On the other hand, the null hypothesis that the log-fluctuations are fitted to a α-stable Lévy distribution cannot be rejected at the 5% significance level.

  4. Growth models and the expected distribution of fluctuating asymmetry

    USGS Publications Warehouse

    Graham, John H.; Shimizu, Kunio; Emlen, John M.; Freeman, D. Carl; Merkel, John

    2003-01-01

    Multiplicative error accounts for much of the size-scaling and leptokurtosis in fluctuating asymmetry. It arises when growth involves the addition of tissue to that which is already present. Such errors are lognormally distributed. The distribution of the difference between two lognormal variates is leptokurtic. If those two variates are correlated, then the asymmetry variance will scale with size. Inert tissues typically exhibit additive error and have a gamma distribution. Although their asymmetry variance does not exhibit size-scaling, the distribution of the difference between two gamma variates is nevertheless leptokurtic. Measurement error is also additive, but has a normal distribution. Thus, the measurement of fluctuating asymmetry may involve the mixing of additive and multiplicative error. When errors are multiplicative, we recommend computing log E(l) − log E(r), the difference between the logarithms of the expected values of left and right sides, even when size-scaling is not obvious. If l and r are lognormally distributed, and measurement error is nil, the resulting distribution will be normal, and multiplicative error will not confound size-related changes in asymmetry. When errors are additive, such a transformation to remove size-scaling is unnecessary. Nevertheless, the distribution of l − r may still be leptokurtic.

  5. Statistical analysis of variability properties of the Kepler blazar W2R 1926+42

    NASA Astrophysics Data System (ADS)

    Li, Yutong; Hu, Shaoming; Wiita, Paul J.; Gupta, Alok C.

    2018-04-01

    We analyzed Kepler light curves of the blazar W2R 1926+42 that provided nearly continuous coverage from quarter 11 through quarter 17 (589 days between 2011 and 2013) and examined some of their flux variability properties. We investigate the possibility that the light curve is dominated by a large number of individual flares and adopt exponential rise and decay models to investigate the symmetry properties of flares. We found that those variations of W2R 1926+42 are predominantly asymmetric with weak tendencies toward positive asymmetry (rapid rise and slow decay). The durations (D) and the amplitudes (F0) of flares can be fit with log-normal distributions. The energy (E) of each flare is also estimated for the first time. There are positive correlations between logD and logE with a slope of 1.36, and between logF0 and logE with a slope of 1.12. Lomb-Scargle periodograms are used to estimate the power spectral density (PSD) shape. It is well described by a power law with an index ranging between -1.1 and -1.5. The sizes of the emission regions, R, are estimated to be in the range of 1.1 × 1015cm - 6.6 × 1016cm. The flare asymmetry is difficult to explain by a light travel time effect but may be caused by differences between the timescales for acceleration and dissipation of high-energy particles in the relativistic jet. A jet-in-jet model also could produce the observed log-normal distributions.

  6. On the null distribution of Bayes factors in linear regression

    USDA-ARS?s Scientific Manuscript database

    We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...

  7. Species-abundance distribution patterns of soil fungi: contribution to the ecological understanding of their response to experimental fire in Mediterranean maquis (southern Italy).

    PubMed

    Persiani, Anna Maria; Maggi, Oriana

    2013-01-01

    Experimental fires, of both low and high intensity, were lit during summer 2000 and the following 2 y in the Castel Volturno Nature Reserve, southern Italy. Soil samples were collected Jul 2000-Jul 2002 to analyze the soil fungal community dynamics. Species abundance distribution patterns (geometric, logarithmic, log normal, broken-stick) were compared. We plotted datasets with information both on species richness and abundance for total, xerotolerant and heat-stimulated soil microfungi. The xerotolerant fungi conformed to a broken-stick model for both the low- and high intensity fires at 7 and 84 d after the fire; their distribution subsequently followed logarithmic models in the 2 y following the fire. The distribution of the heat-stimulated fungi changed from broken-stick to logarithmic models and eventually to a log-normal model during the post-fire recovery. Xerotolerant and, to a far greater extent, heat-stimulated soil fungi acquire an important functional role following soil water stress and/or fire disturbance; these disturbances let them occupy unsaturated habitats and become increasingly abundant over time.

  8. On the Log-Normality of Historical Magnetic-Storm Intensity Statistics: Implications for Extreme-Event Probabilities

    NASA Astrophysics Data System (ADS)

    Love, J. J.; Rigler, E. J.; Pulkkinen, A. A.; Riley, P.

    2015-12-01

    An examination is made of the hypothesis that the statistics of magnetic-storm-maximum intensities are the realization of a log-normal stochastic process. Weighted least-squares and maximum-likelihood methods are used to fit log-normal functions to -Dst storm-time maxima for years 1957-2012; bootstrap analysis is used to established confidence limits on forecasts. Both methods provide fits that are reasonably consistent with the data; both methods also provide fits that are superior to those that can be made with a power-law function. In general, the maximum-likelihood method provides forecasts having tighter confidence intervals than those provided by weighted least-squares. From extrapolation of maximum-likelihood fits: a magnetic storm with intensity exceeding that of the 1859 Carrington event, -Dst > 850 nT, occurs about 1.13 times per century and a wide 95% confidence interval of [0.42, 2.41] times per century; a 100-yr magnetic storm is identified as having a -Dst > 880 nT (greater than Carrington) but a wide 95% confidence interval of [490, 1187] nT. This work is partially motivated by United States National Science and Technology Council and Committee on Space Research and International Living with a Star priorities and strategic plans for the assessment and mitigation of space-weather hazards.

  9. Design of a sampling plan to detect ochratoxin A in green coffee.

    PubMed

    Vargas, E A; Whitaker, T B; Dos Santos, E A; Slate, A B; Lima, F B; Franca, R C A

    2006-01-01

    The establishment of maximum limits for ochratoxin A (OTA) in coffee by importing countries requires that coffee-producing countries develop scientifically based sampling plans to assess OTA contents in lots of green coffee before coffee enters the market thus reducing consumer exposure to OTA, minimizing the number of lots rejected, and reducing financial loss for producing countries. A study was carried out to design an official sampling plan to determine OTA in green coffee produced in Brazil. Twenty-five lots of green coffee (type 7 - approximately 160 defects) were sampled according to an experimental protocol where 16 test samples were taken from each lot (total of 16 kg) resulting in a total of 800 OTA analyses. The total, sampling, sample preparation, and analytical variances were 10.75 (CV = 65.6%), 7.80 (CV = 55.8%), 2.84 (CV = 33.7%), and 0.11 (CV = 6.6%), respectively, assuming a regulatory limit of 5 microg kg(-1) OTA and using a 1 kg sample, Romer RAS mill, 25 g sub-samples, and high performance liquid chromatography. The observed OTA distribution among the 16 OTA sample results was compared to several theoretical distributions. The 2 parameter-log normal distribution was selected to model OTA test results for green coffee as it gave the best fit across all 25 lot distributions. Specific computer software was developed using the variance and distribution information to predict the probability of accepting or rejecting coffee lots at specific OTA concentrations. The acceptation probability was used to compute an operating characteristic (OC) curve specific to a sampling plan design. The OC curve was used to predict the rejection of good lots (sellers' or exporters' risk) and the acceptance of bad lots (buyers' or importers' risk).

  10. Thermal and log-normal distributions of plasma in laser driven Coulomb explosions of deuterium clusters

    NASA Astrophysics Data System (ADS)

    Barbarino, M.; Warrens, M.; Bonasera, A.; Lattuada, D.; Bang, W.; Quevedo, H. J.; Consoli, F.; de Angelis, R.; Andreoli, P.; Kimura, S.; Dyer, G.; Bernstein, A. C.; Hagel, K.; Barbui, M.; Schmidt, K.; Gaul, E.; Donovan, M. E.; Natowitz, J. B.; Ditmire, T.

    2016-08-01

    In this work, we explore the possibility that the motion of the deuterium ions emitted from Coulomb cluster explosions is highly disordered enough to resemble thermalization. We analyze the process of nuclear fusion reactions driven by laser-cluster interactions in experiments conducted at the Texas Petawatt laser facility using a mixture of D2+3He and CD4+3He cluster targets. When clusters explode by Coulomb repulsion, the emission of the energetic ions is “nearly” isotropic. In the framework of cluster Coulomb explosions, we analyze the energy distributions of the ions using a Maxwell-Boltzmann (MB) distribution, a shifted MB distribution (sMB), and the energy distribution derived from a log-normal (LN) size distribution of clusters. We show that the first two distributions reproduce well the experimentally measured ion energy distributions and the number of fusions from d-d and d-3He reactions. The LN distribution is a good representation of the ion kinetic energy distribution well up to high momenta where the noise becomes dominant, but overestimates both the neutron and the proton yields. If the parameters of the LN distributions are chosen to reproduce the fusion yields correctly, the experimentally measured high energy ion spectrum is not well represented. We conclude that the ion kinetic energy distribution is highly disordered and practically not distinguishable from a thermalized one.

  11. A brief introduction to probability.

    PubMed

    Di Paola, Gioacchino; Bertani, Alessandro; De Monte, Lavinia; Tuzzolino, Fabio

    2018-02-01

    The theory of probability has been debated for centuries: back in 1600, French mathematics used the rules of probability to place and win bets. Subsequently, the knowledge of probability has significantly evolved and is now an essential tool for statistics. In this paper, the basic theoretical principles of probability will be reviewed, with the aim of facilitating the comprehension of statistical inference. After a brief general introduction on probability, we will review the concept of the "probability distribution" that is a function providing the probabilities of occurrence of different possible outcomes of a categorical or continuous variable. Specific attention will be focused on normal distribution that is the most relevant distribution applied to statistical analysis.

  12. Analysis of field size distributions, LACIE test sites 5029, 5033, and 5039, Anhwei Province, People's Republic of China

    NASA Technical Reports Server (NTRS)

    Podwysocki, M. H.

    1976-01-01

    A study was made of the field size distributions for LACIE test sites 5029, 5033, and 5039, People's Republic of China. Field lengths and widths were measured from LANDSAT imagery, and field area was statistically modeled. Field size parameters have log-normal or Poisson frequency distributions. These were normalized to the Gaussian distribution and theoretical population curves were made. When compared to fields in other areas of the same country measured in the previous study, field lengths and widths in the three LACIE test sites were 2 to 3 times smaller and areas were smaller by an order of magnitude.

  13. Valid analytical performance specifications for combined analytical bias and imprecision for the use of common reference intervals.

    PubMed

    Hyltoft Petersen, Per; Lund, Flemming; Fraser, Callum G; Sandberg, Sverre; Sölétormos, György

    2018-01-01

    Background Many clinical decisions are based on comparison of patient results with reference intervals. Therefore, an estimation of the analytical performance specifications for the quality that would be required to allow sharing common reference intervals is needed. The International Federation of Clinical Chemistry (IFCC) recommended a minimum of 120 reference individuals to establish reference intervals. This number implies a certain level of quality, which could then be used for defining analytical performance specifications as the maximum combination of analytical bias and imprecision required for sharing common reference intervals, the aim of this investigation. Methods Two methods were investigated for defining the maximum combination of analytical bias and imprecision that would give the same quality of common reference intervals as the IFCC recommendation. Method 1 is based on a formula for the combination of analytical bias and imprecision and Method 2 is based on the Microsoft Excel formula NORMINV including the fractional probability of reference individuals outside each limit and the Gaussian variables of mean and standard deviation. The combinations of normalized bias and imprecision are illustrated for both methods. The formulae are identical for Gaussian and log-Gaussian distributions. Results Method 2 gives the correct results with a constant percentage of 4.4% for all combinations of bias and imprecision. Conclusion The Microsoft Excel formula NORMINV is useful for the estimation of analytical performance specifications for both Gaussian and log-Gaussian distributions of reference intervals.

  14. Problems with Using the Normal Distribution – and Ways to Improve Quality and Efficiency of Data Analysis

    PubMed Central

    Limpert, Eckhard; Stahel, Werner A.

    2011-01-01

    Background The Gaussian or normal distribution is the most established model to characterize quantitative variation of original data. Accordingly, data are summarized using the arithmetic mean and the standard deviation, by ± SD, or with the standard error of the mean, ± SEM. This, together with corresponding bars in graphical displays has become the standard to characterize variation. Methodology/Principal Findings Here we question the adequacy of this characterization, and of the model. The published literature provides numerous examples for which such descriptions appear inappropriate because, based on the “95% range check”, their distributions are obviously skewed. In these cases, the symmetric characterization is a poor description and may trigger wrong conclusions. To solve the problem, it is enlightening to regard causes of variation. Multiplicative causes are by far more important than additive ones, in general, and benefit from a multiplicative (or log-) normal approach. Fortunately, quite similar to the normal, the log-normal distribution can now be handled easily and characterized at the level of the original data with the help of both, a new sign, x/, times-divide, and notation. Analogous to ± SD, it connects the multiplicative (or geometric) mean * and the multiplicative standard deviation s* in the form * x/s*, that is advantageous and recommended. Conclusions/Significance The corresponding shift from the symmetric to the asymmetric view will substantially increase both, recognition of data distributions, and interpretation quality. It will allow for savings in sample size that can be considerable. Moreover, this is in line with ethical responsibility. Adequate models will improve concepts and theories, and provide deeper insight into science and life. PMID:21779325

  15. Problems with using the normal distribution--and ways to improve quality and efficiency of data analysis.

    PubMed

    Limpert, Eckhard; Stahel, Werner A

    2011-01-01

    The gaussian or normal distribution is the most established model to characterize quantitative variation of original data. Accordingly, data are summarized using the arithmetic mean and the standard deviation, by mean ± SD, or with the standard error of the mean, mean ± SEM. This, together with corresponding bars in graphical displays has become the standard to characterize variation. Here we question the adequacy of this characterization, and of the model. The published literature provides numerous examples for which such descriptions appear inappropriate because, based on the "95% range check", their distributions are obviously skewed. In these cases, the symmetric characterization is a poor description and may trigger wrong conclusions. To solve the problem, it is enlightening to regard causes of variation. Multiplicative causes are by far more important than additive ones, in general, and benefit from a multiplicative (or log-) normal approach. Fortunately, quite similar to the normal, the log-normal distribution can now be handled easily and characterized at the level of the original data with the help of both, a new sign, x/, times-divide, and notation. Analogous to mean ± SD, it connects the multiplicative (or geometric) mean mean * and the multiplicative standard deviation s* in the form mean * x/s*, that is advantageous and recommended. The corresponding shift from the symmetric to the asymmetric view will substantially increase both, recognition of data distributions, and interpretation quality. It will allow for savings in sample size that can be considerable. Moreover, this is in line with ethical responsibility. Adequate models will improve concepts and theories, and provide deeper insight into science and life.

  16. Occurrence of Toxic Cyanobacterial Blooms in Rio de la Plata Estuary, Argentina: Field Study and Data Analysis

    PubMed Central

    Giannuzzi, L.; Carvajal, G.; Corradini, M. G.; Araujo Andrade, C.; Echenique, R.; Andrinolo, D.

    2012-01-01

    Water samples were collected during 3 years (2004–2007) at three sampling sites in the Rio de la Plata estuary. Thirteen biological, physical, and chemical parameters were determined on the water samples. The presence of microcystin-LR in the reservoir samples, and also in domestic water samples, was confirmed and quantified. Microcystin-LR concentration ranged between 0.02 and 8.6 μg.L−1. Principal components analysis was used to identify the factors promoting cyanobacteria growth. The proliferation of cyanobacteria was accompanied by the presence of high total and fecal coliforms bacteria (>1500 MNP/100 mL), temperature ≥25°C, and total phosphorus content ≥1.24 mg·L−1. The observed fluctuating patterns of Microcystis aeruginosa, total coliforms, and Microcystin-LR were also described by probabilistic models based on the log-normal and extreme value distributions. The sampling sites were compared in terms of the distribution parameters and the probability of observing high concentrations for Microcystis aeruginosa, total coliforms, and microcystin-LR concentration. PMID:22523486

  17. Positive phase space distributions and uncertainty relations

    NASA Technical Reports Server (NTRS)

    Kruger, Jan

    1993-01-01

    In contrast to a widespread belief, Wigner's theorem allows the construction of true joint probabilities in phase space for distributions describing the object system as well as for distributions depending on the measurement apparatus. The fundamental role of Heisenberg's uncertainty relations in Schroedinger form (including correlations) is pointed out for these two possible interpretations of joint probability distributions. Hence, in order that a multivariate normal probability distribution in phase space may correspond to a Wigner distribution of a pure or a mixed state, it is necessary and sufficient that Heisenberg's uncertainty relation in Schroedinger form should be satisfied.

  18. Probability analysis for consecutive-day maximum rainfall for Tiruchirapalli City (south India, Asia)

    NASA Astrophysics Data System (ADS)

    Sabarish, R. Mani; Narasimhan, R.; Chandhru, A. R.; Suribabu, C. R.; Sudharsan, J.; Nithiyanantham, S.

    2017-05-01

    In the design of irrigation and other hydraulic structures, evaluating the magnitude of extreme rainfall for a specific probability of occurrence is of much importance. The capacity of such structures is usually designed to cater to the probability of occurrence of extreme rainfall during its lifetime. In this study, an extreme value analysis of rainfall for Tiruchirapalli City in Tamil Nadu was carried out using 100 years of rainfall data. Statistical methods were used in the analysis. The best-fit probability distribution was evaluated for 1, 2, 3, 4 and 5 days of continuous maximum rainfall. The goodness of fit was evaluated using Chi-square test. The results of the goodness-of-fit tests indicate that log-Pearson type III method is the overall best-fit probability distribution for 1-day maximum rainfall and consecutive 2-, 3-, 4-, 5- and 6-day maximum rainfall series of Tiruchirapalli. To be reliable, the forecasted maximum rainfalls for the selected return periods are evaluated in comparison with the results of the plotting position.

  19. Beyond the power law: Uncovering stylized facts in interbank networks

    NASA Astrophysics Data System (ADS)

    Vandermarliere, Benjamin; Karas, Alexei; Ryckebusch, Jan; Schoors, Koen

    2015-06-01

    We use daily data on bilateral interbank exposures and monthly bank balance sheets to study network characteristics of the Russian interbank market over August 1998-October 2004. Specifically, we examine the distributions of (un)directed (un)weighted degree, nodal attributes (bank assets, capital and capital-to-assets ratio) and edge weights (loan size and counterparty exposure). We search for the theoretical distribution that fits the data best and report the "best" fit parameters. We observe that all studied distributions are heavy tailed. The fat tail typically contains 20% of the data and can be mostly described well by a truncated power law. Also the power law, stretched exponential and log-normal provide reasonably good fits to the tails of the data. In most cases, however, separating the bulk and tail parts of the data is hard, so we proceed to study the full range of the events. We find that the stretched exponential and the log-normal distributions fit the full range of the data best. These conclusions are robust to (1) whether we aggregate the data over a week, month, quarter or year; (2) whether we look at the "growth" versus "maturity" phases of interbank market development; and (3) with minor exceptions, whether we look at the "normal" versus "crisis" operation periods. In line with prior research, we find that the network topology changes greatly as the interbank market moves from a "normal" to a "crisis" operation period.

  20. Estimation of Microbial Concentration in Food Products from Qualitative, Microbiological Test Data with the MPN Technique.

    PubMed

    Fujikawa, Hiroshi

    2017-01-01

    Microbial concentration in samples of a food product lot has been generally assumed to follow the log-normal distribution in food sampling, but this distribution cannot accommodate the concentration of zero. In the present study, first, a probabilistic study with the most probable number (MPN) technique was done for a target microbe present at a low (or zero) concentration in food products. Namely, based on the number of target pathogen-positive samples in the total samples of a product found by a qualitative, microbiological examination, the concentration of the pathogen in the product was estimated by means of the MPN technique. The effects of the sample size and the total sample number of a product were then examined. Second, operating characteristic (OC) curves for the concentration of a target microbe in a product lot were generated on the assumption that the concentration of a target microbe could be expressed with the Poisson distribution. OC curves for Salmonella and Cronobacter sakazakii in powdered formulae for infants and young children were successfully generated. The present study suggested that the MPN technique and the Poisson distribution would be useful for qualitative microbiological test data analysis for a target microbe whose concentration in a lot is expected to be low.

  1. Predicting the probability of slip in gait: methodology and distribution study.

    PubMed

    Gragg, Jared; Yang, James

    2016-01-01

    The likelihood of a slip is related to the available and required friction for a certain activity, here gait. Classical slip and fall analysis presumed that a walking surface was safe if the difference between the mean available and required friction coefficients exceeded a certain threshold. Previous research was dedicated to reformulating the classical slip and fall theory to include the stochastic variation of the available and required friction when predicting the probability of slip in gait. However, when predicting the probability of a slip, previous researchers have either ignored the variation in the required friction or assumed the available and required friction to be normally distributed. Also, there are no published results that actually give the probability of slip for various combinations of required and available frictions. This study proposes a modification to the equation for predicting the probability of slip, reducing the previous equation from a double-integral to a more convenient single-integral form. Also, a simple numerical integration technique is provided to predict the probability of slip in gait: the trapezoidal method. The effect of the random variable distributions on the probability of slip is also studied. It is shown that both the required and available friction distributions cannot automatically be assumed as being normally distributed. The proposed methods allow for any combination of distributions for the available and required friction, and numerical results are compared to analytical solutions for an error analysis. The trapezoidal method is shown to be highly accurate and efficient. The probability of slip is also shown to be sensitive to the input distributions of the required and available friction. Lastly, a critical value for the probability of slip is proposed based on the number of steps taken by an average person in a single day.

  2. Technology-enhanced Interactive Teaching of Marginal, Joint and Conditional Probabilities: The Special Case of Bivariate Normal Distribution

    PubMed Central

    Dinov, Ivo D.; Kamino, Scott; Bhakhrani, Bilal; Christou, Nicolas

    2014-01-01

    Summary Data analysis requires subtle probability reasoning to answer questions like What is the chance of event A occurring, given that event B was observed? This generic question arises in discussions of many intriguing scientific questions such as What is the probability that an adolescent weighs between 120 and 140 pounds given that they are of average height? and What is the probability of (monetary) inflation exceeding 4% and housing price index below 110? To address such problems, learning some applied, theoretical or cross-disciplinary probability concepts is necessary. Teaching such courses can be improved by utilizing modern information technology resources. Students’ understanding of multivariate distributions, conditional probabilities, correlation and causation can be significantly strengthened by employing interactive web-based science educational resources. Independent of the type of a probability course (e.g. majors, minors or service probability course, rigorous measure-theoretic, applied or statistics course) student motivation, learning experiences and knowledge retention may be enhanced by blending modern technological tools within the classical conceptual pedagogical models. We have designed, implemented and disseminated a portable open-source web-application for teaching multivariate distributions, marginal, joint and conditional probabilities using the special case of bivariate Normal distribution. A real adolescent height and weight dataset is used to demonstrate the classroom utilization of the new web-application to address problems of parameter estimation, univariate and multivariate inference. PMID:25419016

  3. Technology-enhanced Interactive Teaching of Marginal, Joint and Conditional Probabilities: The Special Case of Bivariate Normal Distribution.

    PubMed

    Dinov, Ivo D; Kamino, Scott; Bhakhrani, Bilal; Christou, Nicolas

    2013-01-01

    Data analysis requires subtle probability reasoning to answer questions like What is the chance of event A occurring, given that event B was observed? This generic question arises in discussions of many intriguing scientific questions such as What is the probability that an adolescent weighs between 120 and 140 pounds given that they are of average height? and What is the probability of (monetary) inflation exceeding 4% and housing price index below 110? To address such problems, learning some applied, theoretical or cross-disciplinary probability concepts is necessary. Teaching such courses can be improved by utilizing modern information technology resources. Students' understanding of multivariate distributions, conditional probabilities, correlation and causation can be significantly strengthened by employing interactive web-based science educational resources. Independent of the type of a probability course (e.g. majors, minors or service probability course, rigorous measure-theoretic, applied or statistics course) student motivation, learning experiences and knowledge retention may be enhanced by blending modern technological tools within the classical conceptual pedagogical models. We have designed, implemented and disseminated a portable open-source web-application for teaching multivariate distributions, marginal, joint and conditional probabilities using the special case of bivariate Normal distribution. A real adolescent height and weight dataset is used to demonstrate the classroom utilization of the new web-application to address problems of parameter estimation, univariate and multivariate inference.

  4. Including operational data in QMRA model: development and impact of model inputs.

    PubMed

    Jaidi, Kenza; Barbeau, Benoit; Carrière, Annie; Desjardins, Raymond; Prévost, Michèle

    2009-03-01

    A Monte Carlo model, based on the Quantitative Microbial Risk Analysis approach (QMRA), has been developed to assess the relative risks of infection associated with the presence of Cryptosporidium and Giardia in drinking water. The impact of various approaches for modelling the initial parameters of the model on the final risk assessments is evaluated. The Monte Carlo simulations that we performed showed that the occurrence of parasites in raw water was best described by a mixed distribution: log-Normal for concentrations > detection limit (DL), and a uniform distribution for concentrations < DL. The selection of process performance distributions for modelling the performance of treatment (filtration and ozonation) influences the estimated risks significantly. The mean annual risks for conventional treatment are: 1.97E-03 (removal credit adjusted by log parasite = log spores), 1.58E-05 (log parasite = 1.7 x log spores) or 9.33E-03 (regulatory credits based on the turbidity measurement in filtered water). Using full scale validated SCADA data, the simplified calculation of CT performed at the plant was shown to largely underestimate the risk relative to a more detailed CT calculation, which takes into consideration the downtime and system failure events identified at the plant (1.46E-03 vs. 3.93E-02 for the mean risk).

  5. Performance of statistical models to predict mental health and substance abuse cost.

    PubMed

    Montez-Rath, Maria; Christiansen, Cindy L; Ettner, Susan L; Loveland, Susan; Rosen, Amy K

    2006-10-26

    Providers use risk-adjustment systems to help manage healthcare costs. Typically, ordinary least squares (OLS) models on either untransformed or log-transformed cost are used. We examine the predictive ability of several statistical models, demonstrate how model choice depends on the goal for the predictive model, and examine whether building models on samples of the data affects model choice. Our sample consisted of 525,620 Veterans Health Administration patients with mental health (MH) or substance abuse (SA) diagnoses who incurred costs during fiscal year 1999. We tested two models on a transformation of cost: a Log Normal model and a Square-root Normal model, and three generalized linear models on untransformed cost, defined by distributional assumption and link function: Normal with identity link (OLS); Gamma with log link; and Gamma with square-root link. Risk-adjusters included age, sex, and 12 MH/SA categories. To determine the best model among the entire dataset, predictive ability was evaluated using root mean square error (RMSE), mean absolute prediction error (MAPE), and predictive ratios of predicted to observed cost (PR) among deciles of predicted cost, by comparing point estimates and 95% bias-corrected bootstrap confidence intervals. To study the effect of analyzing a random sample of the population on model choice, we re-computed these statistics using random samples beginning with 5,000 patients and ending with the entire sample. The Square-root Normal model had the lowest estimates of the RMSE and MAPE, with bootstrap confidence intervals that were always lower than those for the other models. The Gamma with square-root link was best as measured by the PRs. The choice of best model could vary if smaller samples were used and the Gamma with square-root link model had convergence problems with small samples. Models with square-root transformation or link fit the data best. This function (whether used as transformation or as a link) seems to help deal with the high comorbidity of this population by introducing a form of interaction. The Gamma distribution helps with the long tail of the distribution. However, the Normal distribution is suitable if the correct transformation of the outcome is used.

  6. Statistics of Advective Stretching in Three-dimensional Incompressible Flows

    NASA Astrophysics Data System (ADS)

    Subramanian, Natarajan; Kellogg, Louise H.; Turcotte, Donald L.

    2009-09-01

    We present a method to quantify kinematic stretching in incompressible, unsteady, isoviscous, three-dimensional flows. We extend the method of Kellogg and Turcotte (J. Geophys. Res. 95:421-432, 1990) to compute the axial stretching/thinning experienced by infinitesimal ellipsoidal strain markers in arbitrary three-dimensional incompressible flows and discuss the differences between our method and the computation of Finite Time Lyapunov Exponent (FTLE). We use the cellular flow model developed in Solomon and Mezic (Nature 425:376-380, 2003) to study the statistics of stretching in a three-dimensional unsteady cellular flow. We find that the probability density function of the logarithm of normalised cumulative stretching (log S) for a globally chaotic flow, with spatially heterogeneous stretching behavior, is not Gaussian and that the coefficient of variation of the Gaussian distribution does not decrease with time as t^{-1/2} . However, it is observed that stretching becomes exponential log S˜ t and the probability density function of log S becomes Gaussian when the time dependence of the flow and its three-dimensionality are increased to make the stretching behaviour of the flow more spatially uniform. We term these behaviors weak and strong chaotic mixing respectively. We find that for strongly chaotic mixing, the coefficient of variation of the Gaussian distribution decreases with time as t^{-1/2} . This behavior is consistent with a random multiplicative stretching process.

  7. Computer program determines exact two-sided tolerance limits for normal distributions

    NASA Technical Reports Server (NTRS)

    Friedman, H. A.; Webb, S. R.

    1968-01-01

    Computer program determines by numerical integration the exact statistical two-sided tolerance limits, when the proportion between the limits is at least a specified number. The program is limited to situations in which the underlying probability distribution for the population sampled is the normal distribution with unknown mean and variance.

  8. Spatial distribution of pH and organic matter in urban soils and its implications on site-specific land uses in Xuzhou, China.

    PubMed

    Mao, Yingming; Sang, Shuxun; Liu, Shiqi; Jia, Jinlong

    2014-05-01

    The spatial variation of soil pH and soil organic matter (SOM) in the urban area of Xuzhou, China, was investigated in this study. Conventional statistics, geostatistics, and a geographical information system (GIS) were used to produce spatial distribution maps and to provide information about land use types. A total of 172 soil samples were collected based on grid method in the study area. Soil pH ranged from 6.47 to 8.48, with an average of 7.62. SOM content was very variable, ranging from 3.51 g/kg to 17.12 g/kg, with an average of 8.26 g/kg. Soil pH followed a normal distribution, while SOM followed a log-normal distribution. The results of semi-variograms indicated that soil pH and SOM had strong (21%) and moderate (44%) spatial dependence, respectively. The variogram model was spherical for soil pH and exponential for SOM. The spatial distribution maps were achieved using kriging interpolation. The high pH and high SOM tended to occur in the mixed forest land cover areas such as those in the southwestern part of the urban area, while the low values were found in the eastern and the northern parts, probably due to the effect of industrial and human activities. In the central urban area, the soil pH was low, but the SOM content was high, which is mainly attributed to the disturbance of regional resident activities and urban transportation. Furthermore, anthropogenic organic particles are possible sources of organic matter after entering the soil ecosystem in urban areas. These maps provide useful information for urban planning and environmental management. Copyright © 2014 Académie des sciences. Published by Elsevier SAS. All rights reserved.

  9. A Fast Numerical Method for Max-Convolution and the Application to Efficient Max-Product Inference in Bayesian Networks.

    PubMed

    Serang, Oliver

    2015-08-01

    Observations depending on sums of random variables are common throughout many fields; however, no efficient solution is currently known for performing max-product inference on these sums of general discrete distributions (max-product inference can be used to obtain maximum a posteriori estimates). The limiting step to max-product inference is the max-convolution problem (sometimes presented in log-transformed form and denoted as "infimal convolution," "min-convolution," or "convolution on the tropical semiring"), for which no O(k log(k)) method is currently known. Presented here is an O(k log(k)) numerical method for estimating the max-convolution of two nonnegative vectors (e.g., two probability mass functions), where k is the length of the larger vector. This numerical max-convolution method is then demonstrated by performing fast max-product inference on a convolution tree, a data structure for performing fast inference given information on the sum of n discrete random variables in O(nk log(nk)log(n)) steps (where each random variable has an arbitrary prior distribution on k contiguous possible states). The numerical max-convolution method can be applied to specialized classes of hidden Markov models to reduce the runtime of computing the Viterbi path from nk(2) to nk log(k), and has potential application to the all-pairs shortest paths problem.

  10. [Establishment of the mathematic model of total quantum statistical moment standard similarity for application to medical theoretical research].

    PubMed

    He, Fu-yuan; Deng, Kai-wen; Huang, Sheng; Liu, Wen-long; Shi, Ji-lian

    2013-09-01

    The paper aims to elucidate and establish a new mathematic model: the total quantum statistical moment standard similarity (TQSMSS) on the base of the original total quantum statistical moment model and to illustrate the application of the model to medical theoretical research. The model was established combined with the statistical moment principle and the normal distribution probability density function properties, then validated and illustrated by the pharmacokinetics of three ingredients in Buyanghuanwu decoction and of three data analytical method for them, and by analysis of chromatographic fingerprint for various extracts with different solubility parameter solvents dissolving the Buyanghanwu-decoction extract. The established model consists of four mainly parameters: (1) total quantum statistical moment similarity as ST, an overlapped area by two normal distribution probability density curves in conversion of the two TQSM parameters; (2) total variability as DT, a confidence limit of standard normal accumulation probability which is equal to the absolute difference value between the two normal accumulation probabilities within integration of their curve nodical; (3) total variable probability as 1-Ss, standard normal distribution probability within interval of D(T); (4) total variable probability (1-beta)alpha and (5) stable confident probability beta(1-alpha): the correct probability to make positive and negative conclusions under confident coefficient alpha. With the model, we had analyzed the TQSMS similarities of pharmacokinetics of three ingredients in Buyanghuanwu decoction and of three data analytical methods for them were at range of 0.3852-0.9875 that illuminated different pharmacokinetic behaviors of each other; and the TQSMS similarities (ST) of chromatographic fingerprint for various extracts with different solubility parameter solvents dissolving Buyanghuanwu-decoction-extract were at range of 0.6842-0.999 2 that showed different constituents with various solvent extracts. The TQSMSS can characterize the sample similarity, by which we can quantitate the correct probability with the test of power under to make positive and negative conclusions no matter the samples come from same population under confident coefficient a or not, by which we can realize an analysis at both macroscopic and microcosmic levels, as an important similar analytical method for medical theoretical research.

  11. Income distribution dependence of poverty measure: A theoretical analysis

    NASA Astrophysics Data System (ADS)

    Chattopadhyay, Amit K.; Mallick, Sushanta K.

    2007-04-01

    Using a modified deprivation (or poverty) function, in this paper, we theoretically study the changes in poverty with respect to the ‘global’ mean and variance of the income distribution using Indian survey data. We show that when the income obeys a log-normal distribution, a rising mean income generally indicates a reduction in poverty while an increase in the variance of the income distribution increases poverty. This altruistic view for a developing economy, however, is not tenable anymore once the poverty index is found to follow a pareto distribution. Here although a rising mean income indicates a reduction in poverty, due to the presence of an inflexion point in the poverty function, there is a critical value of the variance below which poverty decreases with increasing variance while beyond this value, poverty undergoes a steep increase followed by a decrease with respect to higher variance. Identifying this inflexion point as the poverty line, we show that the pareto poverty function satisfies all three standard axioms of a poverty index [N.C. Kakwani, Econometrica 43 (1980) 437; A.K. Sen, Econometrica 44 (1976) 219] whereas the log-normal distribution falls short of this requisite. Following these results, we make quantitative predictions to correlate a developing with a developed economy.

  12. Application of Statistically Derived CPAS Parachute Parameters

    NASA Technical Reports Server (NTRS)

    Romero, Leah M.; Ray, Eric S.

    2013-01-01

    The Capsule Parachute Assembly System (CPAS) Analysis Team is responsible for determining parachute inflation parameters and dispersions that are ultimately used in verifying system requirements. A model memo is internally released semi-annually documenting parachute inflation and other key parameters reconstructed from flight test data. Dispersion probability distributions published in previous versions of the model memo were uniform because insufficient data were available for determination of statistical based distributions. Uniform distributions do not accurately represent the expected distributions since extreme parameter values are just as likely to occur as the nominal value. CPAS has taken incremental steps to move away from uniform distributions. Model Memo version 9 (MMv9) made the first use of non-uniform dispersions, but only for the reefing cutter timing, for which a large number of sample was available. In order to maximize the utility of the available flight test data, clusters of parachutes were reconstructed individually starting with Model Memo version 10. This allowed for statistical assessment for steady-state drag area (CDS) and parachute inflation parameters such as the canopy fill distance (n), profile shape exponent (expopen), over-inflation factor (C(sub k)), and ramp-down time (t(sub k)) distributions. Built-in MATLAB distributions were applied to the histograms, and parameters such as scale (sigma) and location (mu) were output. Engineering judgment was used to determine the "best fit" distribution based on the test data. Results include normal, log normal, and uniform (where available data remains insufficient) fits of nominal and failure (loss of parachute and skipped stage) cases for all CPAS parachutes. This paper discusses the uniform methodology that was previously used, the process and result of the statistical assessment, how the dispersions were incorporated into Monte Carlo analyses, and the application of the distributions in trajectory benchmark testing assessments with parachute inflation parameters, drag area, and reefing cutter timing used by CPAS.

  13. On the probability distribution function of the mass surface density of molecular clouds. I

    NASA Astrophysics Data System (ADS)

    Fischera, Jörg

    2014-05-01

    The probability distribution function (PDF) of the mass surface density is an essential characteristic of the structure of molecular clouds or the interstellar medium in general. Observations of the PDF of molecular clouds indicate a composition of a broad distribution around the maximum and a decreasing tail at high mass surface densities. The first component is attributed to the random distribution of gas which is modeled using a log-normal function while the second component is attributed to condensed structures modeled using a simple power-law. The aim of this paper is to provide an analytical model of the PDF of condensed structures which can be used by observers to extract information about the condensations. The condensed structures are considered to be either spheres or cylinders with a truncated radial density profile at cloud radius rcl. The assumed profile is of the form ρ(r) = ρc/ (1 + (r/r0)2)n/ 2 for arbitrary power n where ρc and r0 are the central density and the inner radius, respectively. An implicit function is obtained which either truncates (sphere) or has a pole (cylinder) at maximal mass surface density. The PDF of spherical condensations and the asymptotic PDF of cylinders in the limit of infinite overdensity ρc/ρ(rcl) flattens for steeper density profiles and has a power law asymptote at low and high mass surface densities and a well defined maximum. The power index of the asymptote Σ- γ of the logarithmic PDF (ΣP(Σ)) in the limit of high mass surface densities is given by γ = (n + 1)/(n - 1) - 1 (spheres) or by γ = n/ (n - 1) - 1 (cylinders in the limit of infinite overdensity). Appendices are available in electronic form at http://www.aanda.org

  14. Comparison of hypertabastic survival model with other unimodal hazard rate functions using a goodness-of-fit test.

    PubMed

    Tahir, M Ramzan; Tran, Quang X; Nikulin, Mikhail S

    2017-05-30

    We studied the problem of testing a hypothesized distribution in survival regression models when the data is right censored and survival times are influenced by covariates. A modified chi-squared type test, known as Nikulin-Rao-Robson statistic, is applied for the comparison of accelerated failure time models. This statistic is used to test the goodness-of-fit for hypertabastic survival model and four other unimodal hazard rate functions. The results of simulation study showed that the hypertabastic distribution can be used as an alternative to log-logistic and log-normal distribution. In statistical modeling, because of its flexible shape of hazard functions, this distribution can also be used as a competitor of Birnbaum-Saunders and inverse Gaussian distributions. The results for the real data application are shown. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  15. SMALL-SCALE AND GLOBAL DYNAMOS AND THE AREA AND FLUX DISTRIBUTIONS OF ACTIVE REGIONS, SUNSPOT GROUPS, AND SUNSPOTS: A MULTI-DATABASE STUDY

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Muñoz-Jaramillo, Andrés; Windmueller, John C.; Amouzou, Ernest C.

    2015-02-10

    In this work, we take advantage of 11 different sunspot group, sunspot, and active region databases to characterize the area and flux distributions of photospheric magnetic structures. We find that, when taken separately, different databases are better fitted by different distributions (as has been reported previously in the literature). However, we find that all our databases can be reconciled by the simple application of a proportionality constant, and that, in reality, different databases are sampling different parts of a composite distribution. This composite distribution is made up by linear combination of Weibull and log-normal distributions—where a pure Weibull (log-normal) characterizesmore » the distribution of structures with fluxes below (above) 10{sup 21}Mx (10{sup 22}Mx). Additionally, we demonstrate that the Weibull distribution shows the expected linear behavior of a power-law distribution (when extended to smaller fluxes), making our results compatible with the results of Parnell et al. We propose that this is evidence of two separate mechanisms giving rise to visible structures on the photosphere: one directly connected to the global component of the dynamo (and the generation of bipolar active regions), and the other with the small-scale component of the dynamo (and the fragmentation of magnetic structures due to their interaction with turbulent convection)« less

  16. Neyman Pearson detection of K-distributed random variables

    NASA Astrophysics Data System (ADS)

    Tucker, J. Derek; Azimi-Sadjadi, Mahmood R.

    2010-04-01

    In this paper a new detection method for sonar imagery is developed in K-distributed background clutter. The equation for the log-likelihood is derived and compared to the corresponding counterparts derived for the Gaussian and Rayleigh assumptions. Test results of the proposed method on a data set of synthetic underwater sonar images is also presented. This database contains images with targets of different shapes inserted into backgrounds generated using a correlated K-distributed model. Results illustrating the effectiveness of the K-distributed detector are presented in terms of probability of detection, false alarm, and correct classification rates for various bottom clutter scenarios.

  17. Inference of strata separation and gas emission paths in longwall overburden using continuous wavelet transform of well logs and geostatistical simulation

    NASA Astrophysics Data System (ADS)

    Karacan, C. Özgen; Olea, Ricardo A.

    2014-06-01

    Prediction of potential methane emission pathways from various sources into active mine workings or sealed gobs from longwall overburden is important for controlling methane and for improving mining safety. The aim of this paper is to infer strata separation intervals and thus gas emission pathways from standard well log data. The proposed technique was applied to well logs acquired through the Mary Lee/Blue Creek coal seam of the Upper Pottsville Formation in the Black Warrior Basin, Alabama, using well logs from a series of boreholes aligned along a nearly linear profile. For this purpose, continuous wavelet transform (CWT) of digitized gamma well logs was performed by using Mexican hat and Morlet, as the mother wavelets, to identify potential discontinuities in the signal. Pointwise Hölder exponents (PHE) of gamma logs were also computed using the generalized quadratic variations (GQV) method to identify the location and strength of singularities of well log signals as a complementary analysis. PHEs and wavelet coefficients were analyzed to find the locations of singularities along the logs. Using the well logs in this study, locations of predicted singularities were used as indicators in single normal equation simulation (SNESIM) to generate equi-probable realizations of potential strata separation intervals. Horizontal and vertical variograms of realizations were then analyzed and compared with those of indicator data and training image (TI) data using the Kruskal-Wallis test. A sum of squared differences was employed to select the most probable realization representing the locations of potential strata separations and methane flow paths. Results indicated that singularities located in well log signals reliably correlated with strata transitions or discontinuities within the strata. Geostatistical simulation of these discontinuities provided information about the location and extents of the continuous channels that may form during mining. If there is a gas source within their zone of influence, paths may develop and allow methane movement towards sealed or active gobs under pressure differentials. Knowledge gained from this research will better prepare mine operations for potential methane inflows, thus improving mine safety.

  18. Tools for Basic Statistical Analysis

    NASA Technical Reports Server (NTRS)

    Luz, Paul L.

    2005-01-01

    Statistical Analysis Toolset is a collection of eight Microsoft Excel spreadsheet programs, each of which performs calculations pertaining to an aspect of statistical analysis. These programs present input and output data in user-friendly, menu-driven formats, with automatic execution. The following types of calculations are performed: Descriptive statistics are computed for a set of data x(i) (i = 1, 2, 3 . . . ) entered by the user. Normal Distribution Estimates will calculate the statistical value that corresponds to cumulative probability values, given a sample mean and standard deviation of the normal distribution. Normal Distribution from two Data Points will extend and generate a cumulative normal distribution for the user, given two data points and their associated probability values. Two programs perform two-way analysis of variance (ANOVA) with no replication or generalized ANOVA for two factors with four levels and three repetitions. Linear Regression-ANOVA will curvefit data to the linear equation y=f(x) and will do an ANOVA to check its significance.

  19. Probabilistic structural analysis of a truss typical for space station

    NASA Technical Reports Server (NTRS)

    Pai, Shantaram S.

    1990-01-01

    A three-bay, space, cantilever truss is probabilistically evaluated using the computer code NESSUS (Numerical Evaluation of Stochastic Structures Under Stress) to identify and quantify the uncertainties and respective sensitivities associated with corresponding uncertainties in the primitive variables (structural, material, and loads parameters) that defines the truss. The distribution of each of these primitive variables is described in terms of one of several available distributions such as the Weibull, exponential, normal, log-normal, etc. The cumulative distribution function (CDF's) for the response functions considered and sensitivities associated with the primitive variables for given response are investigated. These sensitivities help in determining the dominating primitive variables for that response.

  20. X-ray binary formation in low-metallicity blue compact dwarf galaxies

    NASA Astrophysics Data System (ADS)

    Brorby, M.; Kaaret, P.; Prestwich, A.

    2014-07-01

    X-rays from binaries in small, metal-deficient galaxies may have contributed significantly to the heating and reionization of the early Universe. We investigate this claim by studying blue compact dwarfs (BCDs) as local analogues to these early galaxies. We constrain the relation of the X-ray luminosity function (XLF) to the star formation rate (SFR) using a Bayesian approach applied to a sample of 25 BCDs. The functional form of the XLF is fixed to that found for near-solar metallicity galaxies and is used to find the probability distribution of the normalization that relates X-ray luminosity to SFR. Our results suggest that the XLF normalization for low-metallicity BCDs (12+log(O/H) < 7.7) is not consistent with the XLF normalization for galaxies with near-solar metallicities, at a confidence level 1-5 × 10- 6. The XLF normalization for the BCDs is found to be 14.5± 4.8 ({M}_{⊙}^{-1} yr), a factor of 9.7 ± 3.2 higher than for near-solar metallicity galaxies. Simultaneous determination of the XLF normalization and power-law index result in estimates of q = 21.2^{+12.2}_{-8.8} ({M}_{⊙}^{-1} yr) and α = 1.89^{+0.41}_{-0.30}, respectively. Our results suggest a significant enhancement in the population of high-mass X-ray binaries in BCDs compared to the near-solar metallicity galaxies. This suggests that X-ray binaries could have been a significant source of heating in the early Universe.

  1. A model for the flux-r.m.s. correlation in blazar variability or the minijets-in-a-jet statistical model

    NASA Astrophysics Data System (ADS)

    Biteau, J.; Giebels, B.

    2012-12-01

    Very high energy gamma-ray variability of blazar emission remains of puzzling origin. Fast flux variations down to the minute time scale, as observed with H.E.S.S. during flares of the blazar PKS 2155-304, suggests that variability originates from the jet, where Doppler boosting can be invoked to relax causal constraints on the size of the emission region. The observation of log-normality in the flux distributions should rule out additive processes, such as those resulting from uncorrelated multiple-zone emission models, and favour an origin of the variability from multiplicative processes not unlike those observed in a broad class of accreting systems. We show, using a simple kinematic model, that Doppler boosting of randomly oriented emitting regions generates flux distributions following a Pareto law, that the linear flux-r.m.s. relation found for a single zone holds for a large number of emitting regions, and that the skewed distribution of the total flux is close to a log-normal, despite arising from an additive process.

  2. Coverage dependent molecular assembly of anthraquinone on Au(111)

    NASA Astrophysics Data System (ADS)

    DeLoach, Andrew S.; Conrad, Brad R.; Einstein, T. L.; Dougherty, Daniel B.

    2017-11-01

    A scanning tunneling microscopy study of anthraquinone (AQ) on the Au(111) surface shows that the molecules self-assemble into several structures depending on the local surface coverage. At high coverages, a close-packed saturated monolayer is observed, while at low coverages, mobile surface molecules coexist with stable chiral hexamer clusters. At intermediate coverages, a disordered 2D porous network interlinking close-packed islands is observed in contrast to the giant honeycomb networks observed for the same molecule on Cu(111). This difference verifies the predicted extreme sensitivity [J. Wyrick et al., Nano Lett. 11, 2944 (2011)] of the pore network to small changes in the surface electronic structure. Quantitative analysis of the 2D pore network reveals that the areas of the vacancy islands are distributed log-normally. Log-normal distributions are typically associated with the product of random variables (multiplicative noise), and we propose that the distribution of pore sizes for AQ on Au(111) originates from random linear rate constants for molecules to either desorb from the surface or detach from the region of a nucleated pore.

  3. Coverage dependent molecular assembly of anthraquinone on Au(111).

    PubMed

    DeLoach, Andrew S; Conrad, Brad R; Einstein, T L; Dougherty, Daniel B

    2017-11-14

    A scanning tunneling microscopy study of anthraquinone (AQ) on the Au(111) surface shows that the molecules self-assemble into several structures depending on the local surface coverage. At high coverages, a close-packed saturated monolayer is observed, while at low coverages, mobile surface molecules coexist with stable chiral hexamer clusters. At intermediate coverages, a disordered 2D porous network interlinking close-packed islands is observed in contrast to the giant honeycomb networks observed for the same molecule on Cu(111). This difference verifies the predicted extreme sensitivity [J. Wyrick et al., Nano Lett. 11, 2944 (2011)] of the pore network to small changes in the surface electronic structure. Quantitative analysis of the 2D pore network reveals that the areas of the vacancy islands are distributed log-normally. Log-normal distributions are typically associated with the product of random variables (multiplicative noise), and we propose that the distribution of pore sizes for AQ on Au(111) originates from random linear rate constants for molecules to either desorb from the surface or detach from the region of a nucleated pore.

  4. Ejected Particle Size Distributions from Shocked Metal Surfaces

    DOE PAGES

    Schauer, M. M.; Buttler, W. T.; Frayer, D. K.; ...

    2017-04-12

    Here, we present size distributions for particles ejected from features machined onto the surface of shocked Sn targets. The functional form of the size distributions is assumed to be log-normal, and the characteristic parameters of the distribution are extracted from the measured angular distribution of light scattered from a laser beam incident on the ejected particles. We also found strong evidence for a bimodal distribution of particle sizes with smaller particles evolved from features machined into the target surface and larger particles being produced at the edges of these features.

  5. Ejected Particle Size Distributions from Shocked Metal Surfaces

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schauer, M. M.; Buttler, W. T.; Frayer, D. K.

    Here, we present size distributions for particles ejected from features machined onto the surface of shocked Sn targets. The functional form of the size distributions is assumed to be log-normal, and the characteristic parameters of the distribution are extracted from the measured angular distribution of light scattered from a laser beam incident on the ejected particles. We also found strong evidence for a bimodal distribution of particle sizes with smaller particles evolved from features machined into the target surface and larger particles being produced at the edges of these features.

  6. Approximating Multivariate Normal Orthant Probabilities. ONR Technical Report. [Biometric Lab Report No. 90-1.

    ERIC Educational Resources Information Center

    Gibbons, Robert D.; And Others

    The probability integral of the multivariate normal distribution (ND) has received considerable attention since W. F. Sheppard's (1900) and K. Pearson's (1901) seminal work on the bivariate ND. This paper evaluates the formula that represents the "n x n" correlation matrix of the "chi(sub i)" and the standardized multivariate…

  7. Method of estimating flood-frequency parameters for streams in Idaho

    USGS Publications Warehouse

    Kjelstrom, L.C.; Moffatt, R.L.

    1981-01-01

    Skew coefficients for the log-Pearson type III distribution are generalized on the basis of some similarity of floods in the Snake River basin and other parts of Idaho. Generalized skew coefficients aid in shaping flood-frequency curves because skew coefficients computed from gaging stations having relatively short periods of peak flow records can be unreliable. Generalized skew coefficients can be obtained for a gaging station from one of three maps in this report. The map to be used depends on whether (1) snowmelt floods are domiant (generally when more than 20 percent of the drainage area is above 6,000 feet altitude), (2) rainstorm floods are dominant (generally when the mean altitude is less than 3,000 feet), or (3) either snowmelt or rainstorm floods can be the annual miximum discharge. For the latter case, frequency curves constructed using separate arrays of each type of runoff can be combined into one curve, which, for some stations, is significantly different than the frequency curve constructed using only annual maximum discharges. For 269 gaging stations, flood-frequency curves that include the generalized skew coefficients in the computation of the log-Pearson type III equation tend to fit the data better than previous analyses. Frequency curves for ungaged sites can be derived by estimating three statistics of the log-Pearson type III distribution. The mean and standard deviation of logarithms of annual maximum discharges are estimated by regression equations that use basin characteristics as independent variables. Skew coefficient estimates are the generalized skews. The log-Pearson type III equation is then applied with the three estimated statistics to compute the discharge at selected exceedance probabilities. Standard errors at the 2-percent exceedance probability range from 41 to 90 percent. (USGS)

  8. Methane Leaks from Natural Gas Systems Follow Extreme Distributions.

    PubMed

    Brandt, Adam R; Heath, Garvin A; Cooley, Daniel

    2016-11-15

    Future energy systems may rely on natural gas as a low-cost fuel to support variable renewable power. However, leaking natural gas causes climate damage because methane (CH 4 ) has a high global warming potential. In this study, we use extreme-value theory to explore the distribution of natural gas leak sizes. By analyzing ∼15 000 measurements from 18 prior studies, we show that all available natural gas leakage data sets are statistically heavy-tailed, and that gas leaks are more extremely distributed than other natural and social phenomena. A unifying result is that the largest 5% of leaks typically contribute over 50% of the total leakage volume. While prior studies used log-normal model distributions, we show that log-normal functions poorly represent tail behavior. Our results suggest that published uncertainty ranges of CH 4 emissions are too narrow, and that larger sample sizes are required in future studies to achieve targeted confidence intervals. Additionally, we find that cross-study aggregation of data sets to increase sample size is not recommended due to apparent deviation between sampled populations. Understanding the nature of leak distributions can improve emission estimates, better illustrate their uncertainty, allow prioritization of source categories, and improve sampling design. Also, these data can be used for more effective design of leak detection technologies.

  9. Probabilistic analysis of preload in the abutment screw of a dental implant complex.

    PubMed

    Guda, Teja; Ross, Thomas A; Lang, Lisa A; Millwater, Harry R

    2008-09-01

    Screw loosening is a problem for a percentage of implants. A probabilistic analysis to determine the cumulative probability distribution of the preload, the probability of obtaining an optimal preload, and the probabilistic sensitivities identifying important variables is lacking. The purpose of this study was to examine the inherent variability of material properties, surface interactions, and applied torque in an implant system to determine the probability of obtaining desired preload values and to identify the significant variables that affect the preload. Using software programs, an abutment screw was subjected to a tightening torque and the preload was determined from finite element (FE) analysis. The FE model was integrated with probabilistic analysis software. Two probabilistic analysis methods (advanced mean value and Monte Carlo sampling) were applied to determine the cumulative distribution function (CDF) of preload. The coefficient of friction, elastic moduli, Poisson's ratios, and applied torque were modeled as random variables and defined by probability distributions. Separate probability distributions were determined for the coefficient of friction in well-lubricated and dry environments. The probabilistic analyses were performed and the cumulative distribution of preload was determined for each environment. A distinct difference was seen between the preload probability distributions generated in a dry environment (normal distribution, mean (SD): 347 (61.9) N) compared to a well-lubricated environment (normal distribution, mean (SD): 616 (92.2) N). The probability of obtaining a preload value within the target range was approximately 54% for the well-lubricated environment and only 0.02% for the dry environment. The preload is predominately affected by the applied torque and coefficient of friction between the screw threads and implant bore at lower and middle values of the preload CDF, and by the applied torque and the elastic modulus of the abutment screw at high values of the preload CDF. Lubrication at the threaded surfaces between the abutment screw and implant bore affects the preload developed in the implant complex. For the well-lubricated surfaces, only approximately 50% of implants will have preload values within the generally accepted range. This probability can be improved by applying a higher torque than normally recommended or a more closely controlled torque than typically achieved. It is also suggested that materials with higher elastic moduli be used in the manufacture of the abutment screw to achieve a higher preload.

  10. Behavioral Analysis of Visitors to a Medical Institution's Website Using Markov Chain Monte Carlo Methods.

    PubMed

    Suzuki, Teppei; Tani, Yuji; Ogasawara, Katsuhiko

    2016-07-25

    Consistent with the "attention, interest, desire, memory, action" (AIDMA) model of consumer behavior, patients collect information about available medical institutions using the Internet to select information for their particular needs. Studies of consumer behavior may be found in areas other than medical institution websites. Such research uses Web access logs for visitor search behavior. At this time, research applying the patient searching behavior model to medical institution website visitors is lacking. We have developed a hospital website search behavior model using a Bayesian approach to clarify the behavior of medical institution website visitors and determine the probability of their visits, classified by search keyword. We used the website data access log of a clinic of internal medicine and gastroenterology in the Sapporo suburbs, collecting data from January 1 through June 31, 2011. The contents of the 6 website pages included the following: home, news, content introduction for medical examinations, mammography screening, holiday person-on-duty information, and other. The search keywords we identified as best expressing website visitor needs were listed as the top 4 headings from the access log: clinic name, clinic name + regional name, clinic name + medical examination, and mammography screening. Using the search keywords as the explaining variable, we built a binomial probit model that allows inspection of the contents of each purpose variable. Using this model, we determined a beta value and generated a posterior distribution. We performed the simulation using Markov Chain Monte Carlo methods with a noninformation prior distribution for this model and determined the visit probability classified by keyword for each category. In the case of the keyword "clinic name," the visit probability to the website, repeated visit to the website, and contents page for medical examination was positive. In the case of the keyword "clinic name and regional name," the probability for a repeated visit to the website and the mammography screening page was negative. In the case of the keyword "clinic name + medical examination," the visit probability to the website was positive, and the visit probability to the information page was negative. When visitors referred to the keywords "mammography screening," the visit probability to the mammography screening page was positive (95% highest posterior density interval = 3.38-26.66). Further analysis for not only the clinic website but also various other medical institution websites is necessary to build a general inspection model for medical institution websites; we want to consider this in future research. Additionally, we hope to use the results obtained in this study as a prior distribution for future work to conduct higher-precision analysis.

  11. Behavioral Analysis of Visitors to a Medical Institution’s Website Using Markov Chain Monte Carlo Methods

    PubMed Central

    Tani, Yuji

    2016-01-01

    Background Consistent with the “attention, interest, desire, memory, action” (AIDMA) model of consumer behavior, patients collect information about available medical institutions using the Internet to select information for their particular needs. Studies of consumer behavior may be found in areas other than medical institution websites. Such research uses Web access logs for visitor search behavior. At this time, research applying the patient searching behavior model to medical institution website visitors is lacking. Objective We have developed a hospital website search behavior model using a Bayesian approach to clarify the behavior of medical institution website visitors and determine the probability of their visits, classified by search keyword. Methods We used the website data access log of a clinic of internal medicine and gastroenterology in the Sapporo suburbs, collecting data from January 1 through June 31, 2011. The contents of the 6 website pages included the following: home, news, content introduction for medical examinations, mammography screening, holiday person-on-duty information, and other. The search keywords we identified as best expressing website visitor needs were listed as the top 4 headings from the access log: clinic name, clinic name + regional name, clinic name + medical examination, and mammography screening. Using the search keywords as the explaining variable, we built a binomial probit model that allows inspection of the contents of each purpose variable. Using this model, we determined a beta value and generated a posterior distribution. We performed the simulation using Markov Chain Monte Carlo methods with a noninformation prior distribution for this model and determined the visit probability classified by keyword for each category. Results In the case of the keyword “clinic name,” the visit probability to the website, repeated visit to the website, and contents page for medical examination was positive. In the case of the keyword “clinic name and regional name,” the probability for a repeated visit to the website and the mammography screening page was negative. In the case of the keyword “clinic name + medical examination,” the visit probability to the website was positive, and the visit probability to the information page was negative. When visitors referred to the keywords “mammography screening,” the visit probability to the mammography screening page was positive (95% highest posterior density interval = 3.38-26.66). Conclusions Further analysis for not only the clinic website but also various other medical institution websites is necessary to build a general inspection model for medical institution websites; we want to consider this in future research. Additionally, we hope to use the results obtained in this study as a prior distribution for future work to conduct higher-precision analysis. PMID:27457537

  12. Bayes classification of terrain cover using normalized polarimetric data

    NASA Technical Reports Server (NTRS)

    Yueh, H. A.; Swartz, A. A.; Kong, J. A.; Shin, R. T.; Novak, L. M.

    1988-01-01

    The normalized polarimetric classifier (NPC) which uses only the relative magnitudes and phases of the polarimetric data is proposed for discrimination of terrain elements. The probability density functions (PDFs) of polarimetric data are assumed to have a complex Gaussian distribution, and the marginal PDF of the normalized polarimetric data is derived by adopting the Euclidean norm as the normalization function. The general form of the distance measure for the NPC is also obtained. It is demonstrated that for polarimetric data with an arbitrary PDF, the distance measure of NPC will be independent of the normalization function selected even when the classifier is mistrained. A complex Gaussian distribution is assumed for the polarimetric data consisting of grass and tree regions. The probability of error for the NPC is compared with those of several other single-feature classifiers. The classification error of NPCs is shown to be independent of the normalization function.

  13. Analysis of meteorological droughts and dry spells in semiarid regions: a comparative analysis of probability distribution functions in the Segura Basin (SE Spain)

    NASA Astrophysics Data System (ADS)

    Pérez-Sánchez, Julio; Senent-Aparicio, Javier

    2017-08-01

    Dry spells are an essential concept of drought climatology that clearly defines the semiarid Mediterranean environment and whose consequences are a defining feature for an ecosystem, so vulnerable with regard to water. The present study was conducted to characterize rainfall drought in the Segura River basin located in eastern Spain, marked by the self seasonal nature of these latitudes. A daily precipitation set has been utilized for 29 weather stations during a period of 20 years (1993-2013). Furthermore, four sets of dry spell length (complete series, monthly maximum, seasonal maximum, and annual maximum) are used and simulated for all the weather stations with the following probability distribution functions: Burr, Dagum, error, generalized extreme value, generalized logistic, generalized Pareto, Gumbel Max, inverse Gaussian, Johnson SB, Log-Logistic, Log-Pearson 3, Triangular, Weibull, and Wakeby. Only the series of annual maximum spell offer a good adjustment for all the weather stations, thereby gaining the role of Wakeby as the best result, with a p value means of 0.9424 for the Kolmogorov-Smirnov test (0.2 significance level). Probability of dry spell duration for return periods of 2, 5, 10, and 25 years maps reveal the northeast-southeast gradient, increasing periods with annual rainfall of less than 0.1 mm in the eastern third of the basin, in the proximity of the Mediterranean slope.

  14. VLSI Design, Parallel Computation and Distributed Computing

    DTIC Science & Technology

    1991-09-30

    I U1 TA 3 Daniel Mleitman U. : C ..( -_. .. .s .. . . . . Tom Leighton David Shmoys . ........A ,~i ;.t , 77 Michael Sipser , Di.,t a-., Eva Tardos...Leighton and Plaxton on the construction of a sim- ple c log .- depth circuit (where c < 7.5) that sorts a random permutation with very high probability...puting iPOD( ). Aug-ust 1992. Vancouver. British Columbia (to appear). 20. B 1Xti~ c .. U(.ii. 1. Gopal. M. [Kaplan and S. Kutten, "Distributed Control for

  15. Partial knowledge, entropy, and estimation

    PubMed Central

    MacQueen, James; Marschak, Jacob

    1975-01-01

    In a growing body of literature, available partial knowledge is used to estimate the prior probability distribution p≡(p1,...,pn) by maximizing entropy H(p)≡-Σpi log pi, subject to constraints on p which express that partial knowledge. The method has been applied to distributions of income, of traffic, of stock-price changes, and of types of brand-article purchases. We shall respond to two justifications given for the method: (α) It is “conservative,” and therefore good, to maximize “uncertainty,” as (uniquely) represented by the entropy parameter. (β) One should apply the mathematics of statistical thermodynamics, which implies that the most probable distribution has highest entropy. Reason (α) is rejected. Reason (β) is valid when “complete ignorance” is defined in a particular way and both the constraint and the estimator's loss function are of certain kinds. PMID:16578733

  16. Predicting clicks of PubMed articles.

    PubMed

    Mao, Yuqing; Lu, Zhiyong

    2013-01-01

    Predicting the popularity or access usage of an article has the potential to improve the quality of PubMed searches. We can model the click trend of each article as its access changes over time by mining the PubMed query logs, which contain the previous access history for all articles. In this article, we examine the access patterns produced by PubMed users in two years (July 2009 to July 2011). We explore the time series of accesses for each article in the query logs, model the trends with regression approaches, and subsequently use the models for prediction. We show that the click trends of PubMed articles are best fitted with a log-normal regression model. This model allows the number of accesses an article receives and the time since it first becomes available in PubMed to be related via quadratic and logistic functions, with the model parameters to be estimated via maximum likelihood. Our experiments predicting the number of accesses for an article based on its past usage demonstrate that the mean absolute error and mean absolute percentage error of our model are 4.0% and 8.1% lower than the power-law regression model, respectively. The log-normal distribution is also shown to perform significantly better than a previous prediction method based on a human memory theory in cognitive science. This work warrants further investigation on the utility of such a log-normal regression approach towards improving information access in PubMed.

  17. Predicting clicks of PubMed articles

    PubMed Central

    Mao, Yuqing; Lu, Zhiyong

    2013-01-01

    Predicting the popularity or access usage of an article has the potential to improve the quality of PubMed searches. We can model the click trend of each article as its access changes over time by mining the PubMed query logs, which contain the previous access history for all articles. In this article, we examine the access patterns produced by PubMed users in two years (July 2009 to July 2011). We explore the time series of accesses for each article in the query logs, model the trends with regression approaches, and subsequently use the models for prediction. We show that the click trends of PubMed articles are best fitted with a log-normal regression model. This model allows the number of accesses an article receives and the time since it first becomes available in PubMed to be related via quadratic and logistic functions, with the model parameters to be estimated via maximum likelihood. Our experiments predicting the number of accesses for an article based on its past usage demonstrate that the mean absolute error and mean absolute percentage error of our model are 4.0% and 8.1% lower than the power-law regression model, respectively. The log-normal distribution is also shown to perform significantly better than a previous prediction method based on a human memory theory in cognitive science. This work warrants further investigation on the utility of such a log-normal regression approach towards improving information access in PubMed. PMID:24551386

  18. Flood Frequency Curves - Use of information on the likelihood of extreme floods

    NASA Astrophysics Data System (ADS)

    Faber, B.

    2011-12-01

    Investment in the infrastructure that reduces flood risk for flood-prone communities must incorporate information on the magnitude and frequency of flooding in that area. Traditionally, that information has been a probability distribution of annual maximum streamflows developed from the historical gaged record at a stream site. Practice in the United States fits a Log-Pearson type3 distribution to the annual maximum flows of an unimpaired streamflow record, using the method of moments to estimate distribution parameters. The procedure makes the assumptions that annual peak streamflow events are (1) independent, (2) identically distributed, and (3) form a representative sample of the overall probability distribution. Each of these assumptions can be challenged. We rarely have enough data to form a representative sample, and therefore must compute and display the uncertainty in the estimated flood distribution. But, is there a wet/dry cycle that makes precipitation less than independent between successive years? Are the peak flows caused by different types of events from different statistical populations? How does the watershed or climate changing over time (non-stationarity) affect the probability distribution floods? Potential approaches to avoid these assumptions vary from estimating trend and shift and removing them from early data (and so forming a homogeneous data set), to methods that estimate statistical parameters that vary with time. A further issue in estimating a probability distribution of flood magnitude (the flood frequency curve) is whether a purely statistical approach can accurately capture the range and frequency of floods that are of interest. A meteorologically-based analysis produces "probable maximum precipitation" (PMP) and subsequently a "probable maximum flood" (PMF) that attempts to describe an upper bound on flood magnitude in a particular watershed. This analysis can help constrain the upper tail of the probability distribution, well beyond the range of gaged data or even historical or paleo-flood data, which can be very important in risk analyses performed for flood risk management and dam and levee safety studies.

  19. Probability Weighting Functions Derived from Hyperbolic Time Discounting: Psychophysical Models and Their Individual Level Testing.

    PubMed

    Takemura, Kazuhisa; Murakami, Hajime

    2016-01-01

    A probability weighting function (w(p)) is considered to be a nonlinear function of probability (p) in behavioral decision theory. This study proposes a psychophysical model of probability weighting functions derived from a hyperbolic time discounting model and a geometric distribution. The aim of the study is to show probability weighting functions from the point of view of waiting time for a decision maker. Since the expected value of a geometrically distributed random variable X is 1/p, we formulized the probability weighting function of the expected value model for hyperbolic time discounting as w(p) = (1 - k log p)(-1). Moreover, the probability weighting function is derived from Loewenstein and Prelec's (1992) generalized hyperbolic time discounting model. The latter model is proved to be equivalent to the hyperbolic-logarithmic weighting function considered by Prelec (1998) and Luce (2001). In this study, we derive a model from the generalized hyperbolic time discounting model assuming Fechner's (1860) psychophysical law of time and a geometric distribution of trials. In addition, we develop median models of hyperbolic time discounting and generalized hyperbolic time discounting. To illustrate the fitness of each model, a psychological experiment was conducted to assess the probability weighting and value functions at the level of the individual participant. The participants were 50 university students. The results of individual analysis indicated that the expected value model of generalized hyperbolic discounting fitted better than previous probability weighting decision-making models. The theoretical implications of this finding are discussed.

  20. Statistical distribution of building lot frontage: application for Tokyo downtown districts

    NASA Astrophysics Data System (ADS)

    Usui, Hiroyuki

    2018-03-01

    The frontage of a building lot is the determinant factor of the residential environment. The statistical distribution of building lot frontages shows how the perimeters of urban blocks are shared by building lots for a given density of buildings and roads. For practitioners in urban planning, this is indispensable to identify potential districts which comprise a high percentage of building lots with narrow frontage after subdivision and to reconsider the appropriate criteria for the density of buildings and roads as residential environment indices. In the literature, however, the statistical distribution of building lot frontages and the density of buildings and roads has not been fully researched. In this paper, based on the empirical study in the downtown districts of Tokyo, it is found that (1) a log-normal distribution fits the observed distribution of building lot frontages better than a gamma distribution, which is the model of the size distribution of Poisson Voronoi cells on closed curves; (2) the statistical distribution of building lot frontages statistically follows a log-normal distribution, whose parameters are the gross building density, road density, average road width, the coefficient of variation of building lot frontage, and the ratio of the number of building lot frontages to the number of buildings; and (3) the values of the coefficient of variation of building lot frontages, and that of the ratio of the number of building lot frontages to that of buildings are approximately equal to 0.60 and 1.19, respectively.

  1. Fatigue shifts and scatters heart rate variability in elite endurance athletes.

    PubMed

    Schmitt, Laurent; Regnard, Jacques; Desmarets, Maxime; Mauny, Fréderic; Mourot, Laurent; Fouillot, Jean-Pierre; Coulmy, Nicolas; Millet, Grégoire

    2013-01-01

    This longitudinal study aimed at comparing heart rate variability (HRV) in elite athletes identified either in 'fatigue' or in 'no-fatigue' state in 'real life' conditions. 57 elite Nordic-skiers were surveyed over 4 years. R-R intervals were recorded supine (SU) and standing (ST). A fatigue state was quoted with a validated questionnaire. A multilevel linear regression model was used to analyze relationships between heart rate (HR) and HRV descriptors [total spectral power (TP), power in low (LF) and high frequency (HF) ranges expressed in ms(2) and normalized units (nu)] and the status without and with fatigue. The variables not distributed normally were transformed by taking their common logarithm (log10). 172 trials were identified as in a 'fatigue' and 891 as in 'no-fatigue' state. All supine HR and HRV parameters (Beta±SE) were significantly different (P<0.0001) between 'fatigue' and 'no-fatigue': HRSU (+6.27±0.61 bpm), logTPSU (-0.36±0.04), logLFSU (-0.27±0.04), logHFSU (-0.46±0.05), logLF/HFSU (+0.19±0.03), HFSU(nu) (-9.55±1.33). Differences were also significant (P<0.0001) in standing: HRST (+8.83±0.89), logTPST (-0.28±0.03), logLFST (-0.29±0.03), logHFST (-0.32±0.04). Also, intra-individual variance of HRV parameters was larger (P<0.05) in the 'fatigue' state (logTPSU: 0.26 vs. 0.07, logLFSU: 0.28 vs. 0.11, logHFSU: 0.32 vs. 0.08, logTPST: 0.13 vs. 0.07, logLFST: 0.16 vs. 0.07, logHFST: 0.25 vs. 0.14). HRV was significantly lower in 'fatigue' vs. 'no-fatigue' but accompanied with larger intra-individual variance of HRV parameters in 'fatigue'. The broader intra-individual variance of HRV parameters might encompass different changes from no-fatigue state, possibly reflecting different fatigue-induced alterations of HRV pattern.

  2. Log-normal spray drop distribution...analyzed by two new computer programs

    Treesearch

    Gerald S. Walton

    1968-01-01

    Results of U.S. Forest Service research on chemical insecticides suggest that large drops are not as effective as small drops in carrying insecticides to target insects. Two new computer programs have been written to analyze size distribution properties of drops from spray nozzles. Coded in Fortran IV, the programs have been tested on both the CDC 6400 and the IBM 7094...

  3. Radium-226 content of beverages

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kiefer, J.

    Radium contents of commercially obtained beer, wine, milk and mineral waters were measured. All distributions were log-normal with the following geometrical mean values: beer: 2.1 X 10(-2) Bq L-1; wine: 3.4 X 10(-2) Bq L-1; milk: 3 X 10(-3) Bq L-1; normal mineral water: 4.3 X 10(-2) L-1; medical mineral water: 9.4 X 10(-2) Bq L-1.

  4. Investigation into the performance of different models for predicting stutter.

    PubMed

    Bright, Jo-Anne; Curran, James M; Buckleton, John S

    2013-07-01

    In this paper we have examined five possible models for the behaviour of the stutter ratio, SR. These were two log-normal models, two gamma models, and a two-component normal mixture model. A two-component normal mixture model was chosen with different behaviours of variance; at each locus SR was described with two distributions, both with the same mean. The distributions have difference variances: one for the majority of the observations and a second for the less well-behaved ones. We apply each model to a set of known single source Identifiler™, NGM SElect™ and PowerPlex(®) 21 DNA profiles to show the applicability of our findings to different data sets. SR determined from the single source profiles were compared to the calculated SR after application of the models. The model performance was tested by calculating the log-likelihoods and comparing the difference in Akaike information criterion (AIC). The two-component normal mixture model systematically outperformed all others, despite the increase in the number of parameters. This model, as well as performing well statistically, has intuitive appeal for forensic biologists and could be implemented in an expert system with a continuous method for DNA interpretation. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  5. Rapid changes in ice core gas records - Part 1: On the accuracy of methane synchronisation of ice cores

    NASA Astrophysics Data System (ADS)

    Köhler, P.

    2010-08-01

    Methane synchronisation is a concept to align ice core records during rapid climate changes of the Dansgaard/Oeschger (D/O) events onto a common age scale. However, atmospheric gases are recorded in ice cores with a log-normal-shaped age distribution probability density function, whose exact shape depends mainly on the accumulation rate on the drilling site. This age distribution effectively shifts the mid-transition points of rapid changes in CH4 measured in situ in ice by about 58% of the width of the age distribution with respect to the atmospheric signal. A minimum dating uncertainty, or artefact, in the CH4 synchronisation is therefore embedded in the concept itself, which was not accounted for in previous error estimates. This synchronisation artefact between Greenland and Antarctic ice cores is for GRIP and Byrd less than 40 years, well within the dating uncertainty of CH4, and therefore does not calls the overall concept of the bipolar seesaw into question. However, if the EPICA Dome C ice core is aligned via CH4 to NGRIP this synchronisation artefact is in the most recent unified ice core age scale (Lemieux-Dudon et al., 2010) for LGM climate conditions of the order of three centuries and might need consideration in future gas chronologies.

  6. Bridging stylized facts in finance and data non-stationarities

    NASA Astrophysics Data System (ADS)

    Camargo, Sabrina; Duarte Queirós, Sílvio M.; Anteneodo, Celia

    2013-04-01

    Employing a recent technique which allows the representation of nonstationary data by means of a juxtaposition of locally stationary paths of different length, we introduce a comprehensive analysis of the key observables in a financial market: the trading volume and the price fluctuations. From the segmentation procedure we are able to introduce a quantitative description of statistical features of these two quantities, which are often named stylized facts, namely the tails of the distribution of trading volume and price fluctuations and a dynamics compatible with the U-shaped profile of the volume in a trading section and the slow decay of the autocorrelation function. The segmentation of the trading volume series provides evidence of slow evolution of the fluctuating parameters of each patch, pointing to the mixing scenario. Assuming that long-term features are the outcome of a statistical mixture of simple local forms, we test and compare different probability density functions to provide the long-term distribution of the trading volume, concluding that the log-normal gives the best agreement with the empirical distribution. Moreover, the segmentation of the magnitude price fluctuations are quite different from the results for the trading volume, indicating that changes in the statistics of price fluctuations occur at a faster scale than in the case of trading volume.

  7. PERFLUORINATED COMPOUNDS IN ARCHIVED HOUSE-DUST SAMPLES

    EPA Science Inventory

    Archived house-dust samples were analyzed for 13 perfluorinated compounds (PFCs). Results show that PFCs are found in house-dust samples, and the data are log-normally distributed. PFOS/PFOA were present in 94.6% and 96.4% of the samples respectively. Concentrations ranged fro...

  8. Probabilistic Tsunami Hazard Assessment along Nankai Trough (1) An assessment based on the information of the forthcoming earthquake that Earthquake Research Committee(2013) evaluated

    NASA Astrophysics Data System (ADS)

    Hirata, K.; Fujiwara, H.; Nakamura, H.; Osada, M.; Morikawa, N.; Kawai, S.; Ohsumi, T.; Aoi, S.; Yamamoto, N.; Matsuyama, H.; Toyama, N.; Kito, T.; Murashima, Y.; Murata, Y.; Inoue, T.; Saito, R.; Takayama, J.; Akiyama, S.; Korenaga, M.; Abe, Y.; Hashimoto, N.

    2015-12-01

    The Earthquake Research Committee(ERC)/HERP, Government of Japan (2013) revised their long-term evaluation of the forthcoming large earthquake along the Nankai Trough; the next earthquake is estimated M8 to 9 class, and the probability (P30) that the next earthquake will occur within the next 30 years (from Jan. 1, 2013) is 60% to 70%. In this study, we assess tsunami hazards (maximum coastal tsunami heights) in the near future, in terms of a probabilistic approach, from the next earthquake along Nankai Trough, on the basis of ERC(2013)'s report. The probabilistic tsunami hazard assessment that we applied is as follows; (1) Characterized earthquake fault models (CEFMs) are constructed on each of the 15 hypothetical source areas (HSA) that ERC(2013) showed. The characterization rule follows Toyama et al.(2015, JpGU). As results, we obtained total of 1441 CEFMs. (2) We calculate tsunamis due to CEFMs by solving nonlinear, finite-amplitude, long-wave equations with advection and bottom friction terms by finite-difference method. Run-up computation on land is included. (3) A time predictable model predicts the recurrent interval of the present seismic cycle is T=88.2 years (ERC,2013). We fix P30 = 67% by applying the renewal process based on BPT distribution with T and alpha=0.24 as its aperiodicity. (4) We divide the probability P30 into P30(i) for i-th subgroup consisting of the earthquakes occurring in each of 15 HSA by following a probability re-distribution concept (ERC,2014). Then each earthquake (CEFM) in i-th subgroup is assigned a probability P30(i)/N where N is the number of CEFMs in each sub-group. Note that such re-distribution concept of the probability is nothing but tentative because the present seismology cannot give deep knowledge enough to do it. Epistemic logic-tree approach may be required in future. (5) We synthesize a number of tsunami hazard curves at every evaluation points on coasts by integrating the information about 30 years occurrence probabilities P30(i) for all earthquakes (CEFMs) and calculated maximum coastal tsunami heights. In the synthesis, aleatory uncertainties relating to incompleteness of governing equations, CEFM modeling, bathymetry and topography data, etc, are modeled assuming a log-normal probabilistic distribution. Examples of tsunami hazard curves will be presented.

  9. Statistical Considerations of Data Processing in Giovanni Online Tool

    NASA Technical Reports Server (NTRS)

    Suhung, Shen; Leptoukh, G.; Acker, J.; Berrick, S.

    2005-01-01

    The GES DISC Interactive Online Visualization and Analysis Infrastructure (Giovanni) is a web-based interface for the rapid visualization and analysis of gridded data from a number of remote sensing instruments. The GES DISC currently employs several Giovanni instances to analyze various products, such as Ocean-Giovanni for ocean products from SeaWiFS and MODIS-Aqua; TOMS & OM1 Giovanni for atmospheric chemical trace gases from TOMS and OMI, and MOVAS for aerosols from MODIS, etc. (http://giovanni.gsfc.nasa.gov) Foremost among the Giovanni statistical functions is data averaging. Two aspects of this function are addressed here. The first deals with the accuracy of averaging gridded mapped products vs. averaging from the ungridded Level 2 data. Some mapped products contain mean values only; others contain additional statistics, such as number of pixels (NP) for each grid, standard deviation, etc. Since NP varies spatially and temporally, averaging with or without weighting by NP will be different. In this paper, we address differences of various weighting algorithms for some datasets utilized in Giovanni. The second aspect is related to different averaging methods affecting data quality and interpretation for data with non-normal distribution. The present study demonstrates results of different spatial averaging methods using gridded SeaWiFS Level 3 mapped monthly chlorophyll a data. Spatial averages were calculated using three different methods: arithmetic mean (AVG), geometric mean (GEO), and maximum likelihood estimator (MLE). Biogeochemical data, such as chlorophyll a, are usually considered to have a log-normal distribution. The study determined that differences between methods tend to increase with increasing size of a selected coastal area, with no significant differences in most open oceans. The GEO method consistently produces values lower than AVG and MLE. The AVG method produces values larger than MLE in some cases, but smaller in other cases. Further studies indicated that significant differences between AVG and MLE methods occurred in coastal areas where data have large spatial variations and a log-bimodal distribution instead of log-normal distribution.

  10. A novel gamma-fitting statistical method for anti-drug antibody assays to establish assay cut points for data with non-normal distribution.

    PubMed

    Schlain, Brian; Amaravadi, Lakshmi; Donley, Jean; Wickramasekera, Ananda; Bennett, Donald; Subramanyam, Meena

    2010-01-31

    In recent years there has been growing recognition of the impact of anti-drug or anti-therapeutic antibodies (ADAs, ATAs) on the pharmacokinetic and pharmacodynamic behavior of the drug, which ultimately affects drug exposure and activity. These anti-drug antibodies can also impact safety of the therapeutic by inducing a range of reactions from hypersensitivity to neutralization of the activity of an endogenous protein. Assessments of immunogenicity, therefore, are critically dependent on the bioanalytical method used to test samples, in which a positive versus negative reactivity is determined by a statistically derived cut point based on the distribution of drug naïve samples. For non-normally distributed data, a novel gamma-fitting method for obtaining assay cut points is presented. Non-normal immunogenicity data distributions, which tend to be unimodal and positively skewed, can often be modeled by 3-parameter gamma fits. Under a gamma regime, gamma based cut points were found to be more accurate (closer to their targeted false positive rates) compared to normal or log-normal methods and more precise (smaller standard errors of cut point estimators) compared with the nonparametric percentile method. Under a gamma regime, normal theory based methods for estimating cut points targeting a 5% false positive rate were found in computer simulation experiments to have, on average, false positive rates ranging from 6.2 to 8.3% (or positive biases between +1.2 and +3.3%) with bias decreasing with the magnitude of the gamma shape parameter. The log-normal fits tended, on average, to underestimate false positive rates with negative biases as large a -2.3% with absolute bias decreasing with the shape parameter. These results were consistent with the well known fact that gamma distributions become less skewed and closer to a normal distribution as their shape parameters increase. Inflated false positive rates, especially in a screening assay, shifts the emphasis to confirm test results in a subsequent test (confirmatory assay). On the other hand, deflated false positive rates in the case of screening immunogenicity assays will not meet the minimum 5% false positive target as proposed in the immunogenicity assay guidance white papers. Copyright 2009 Elsevier B.V. All rights reserved.

  11. Mechanism-based model for tumor drug resistance.

    PubMed

    Kuczek, T; Chan, T C

    1992-01-01

    The development of tumor resistance to cytotoxic agents has important implications in the treatment of cancer. If supported by experimental data, mathematical models of resistance can provide useful information on the underlying mechanisms and aid in the design of therapeutic regimens. We report on the development of a model of tumor-growth kinetics based on the assumption that the rates of cell growth in a tumor are normally distributed. We further assumed that the growth rate of each cell is proportional to its rate of total pyrimidine synthesis (de novo plus salvage). Using an ovarian carcinoma cell line (2008) and resistant variants selected for chronic exposure to a pyrimidine antimetabolite, N-phosphonacetyl-L-aspartate (PALA), we derived a simple and specific analytical form describing the growth curves generated in 72 h growth assays. The model assumes that the rate of de novo pyrimidine synthesis, denoted alpha, is shifted down by an amount proportional to the log10 PALA concentration and that cells whose rate of pyrimidine synthesis falls below a critical level, denoted alpha 0, can no longer grow. This is described by the equation: Probability (growth) = probability (alpha 0 less than alpha-constant x log10 [PALA]). This model predicts that when growth curves are plotted on probit paper, they will produce straight lines. This prediction is in agreement with the data we obtained for the 2008 cells. Another prediction of this model is that the same probit plots for the resistant variants should shift to the right in a parallel fashion. Probit plots of the dose-response data obtained for each resistant 2008 line following chronic exposure to PALA again confirmed this prediction. Correlation of the rightward shift of dose responses to uridine transport (r = 0.99) also suggests that salvage metabolism plays a key role in tumor-cell resistance to PALA. Furthermore, the slope of the regression lines enables the detection of synergy such as that observed between dipyridamole and PALA. Although the rate-normal model was used to study the rate of salvage metabolism in PALA resistance in the present study, it may be widely applicable to modeling of other resistance mechanisms such as gene amplification of target enzymes.

  12. Drought forecasting in Luanhe River basin involving climatic indices

    NASA Astrophysics Data System (ADS)

    Ren, Weinan; Wang, Yixuan; Li, Jianzhu; Feng, Ping; Smith, Ronald J.

    2017-11-01

    Drought is regarded as one of the most severe natural disasters globally. This is especially the case in Tianjin City, Northern China, where drought can affect economic development and people's livelihoods. Drought forecasting, the basis of drought management, is an important mitigation strategy. In this paper, we evolve a probabilistic forecasting model, which forecasts transition probabilities from a current Standardized Precipitation Index (SPI) value to a future SPI class, based on conditional distribution of multivariate normal distribution to involve two large-scale climatic indices at the same time, and apply the forecasting model to 26 rain gauges in the Luanhe River basin in North China. The establishment of the model and the derivation of the SPI are based on the hypothesis of aggregated monthly precipitation that is normally distributed. Pearson correlation and Shapiro-Wilk normality tests are used to select appropriate SPI time scale and large-scale climatic indices. Findings indicated that longer-term aggregated monthly precipitation, in general, was more likely to be considered normally distributed and forecasting models should be applied to each gauge, respectively, rather than to the whole basin. Taking Liying Gauge as an example, we illustrate the impact of the SPI time scale and lead time on transition probabilities. Then, the controlled climatic indices of every gauge are selected by Pearson correlation test and the multivariate normality of SPI, corresponding climatic indices for current month and SPI 1, 2, and 3 months later are demonstrated using Shapiro-Wilk normality test. Subsequently, we illustrate the impact of large-scale oceanic-atmospheric circulation patterns on transition probabilities. Finally, we use a score method to evaluate and compare the performance of the three forecasting models and compare them with two traditional models which forecast transition probabilities from a current to a future SPI class. The results show that the three proposed models outperform the two traditional models and involving large-scale climatic indices can improve the forecasting accuracy.

  13. Usefulness of the novel risk estimation software, Heart Risk View, for the prediction of cardiac events in patients with normal myocardial perfusion SPECT.

    PubMed

    Sakatani, Tomohiko; Shimoo, Satoshi; Takamatsu, Kazuaki; Kyodo, Atsushi; Tsuji, Yumika; Mera, Kayoko; Koide, Masahiro; Isodono, Koji; Tsubakimoto, Yoshinori; Matsuo, Akiko; Inoue, Keiji; Fujita, Hiroshi

    2016-12-01

    Myocardial perfusion single-photon emission-computed tomography (SPECT) can predict cardiac events in patients with coronary artery disease with high accuracy; however, pseudo-negative cases sometimes occur. Heart Risk View, which is based on the prospective cohort study (J-ACCESS), is a software for evaluating cardiac event probability. We examined whether Heart Risk View was useful to evaluate the cardiac risk in patients with normal myocardial perfusion SPECT (MPS). We studied 3461 consecutive patients who underwent MPS to detect myocardial ischemia and those who had normal MPS were enrolled in this study (n = 698). We calculated cardiac event probability by Heart Risk View and followed-up for 3.8 ± 2.4 years. The cardiac events were defined as cardiac death, non-fatal myocardial infarction, and heart failure requiring hospitalization. During the follow-up period, 21 patients (3.0 %) had cardiac events. The event probability calculated by Heart Risk View was higher in the event group (5.5 ± 2.6 vs. 2.9 ± 2.6 %, p < 0.001). According to the receiver-operating characteristics curve, the cut-off point of the event probability for predicting cardiac events was 3.4 % (sensitivity 0.76, specificity 0.72, and AUC 0.85). Kaplan-Meier curves revealed that a higher event rate was observed in the high-event probability group by the log-rank test (p < 0.001). Although myocardial perfusion SPECT is useful for the prediction of cardiac events, risk estimation by Heart Risk View adds more prognostic information, especially in patients with normal MPS.

  14. Confidence Intervals for True Scores Using the Skew-Normal Distribution

    ERIC Educational Resources Information Center

    Garcia-Perez, Miguel A.

    2010-01-01

    A recent comparative analysis of alternative interval estimation approaches and procedures has shown that confidence intervals (CIs) for true raw scores determined with the Score method--which uses the normal approximation to the binomial distribution--have actual coverage probabilities that are closest to their nominal level. It has also recently…

  15. A comparison of methods to handle skew distributed cost variables in the analysis of the resource consumption in schizophrenia treatment.

    PubMed

    Kilian, Reinhold; Matschinger, Herbert; Löeffler, Walter; Roick, Christiane; Angermeyer, Matthias C

    2002-03-01

    Transformation of the dependent cost variable is often used to solve the problems of heteroscedasticity and skewness in linear ordinary least square regression of health service cost data. However, transformation may cause difficulties in the interpretation of regression coefficients and the retransformation of predicted values. The study compares the advantages and disadvantages of different methods to estimate regression based cost functions using data on the annual costs of schizophrenia treatment. Annual costs of psychiatric service use and clinical and socio-demographic characteristics of the patients were assessed for a sample of 254 patients with a diagnosis of schizophrenia (ICD-10 F 20.0) living in Leipzig. The clinical characteristics of the participants were assessed by means of the BPRS 4.0, the GAF, and the CAN for service needs. Quality of life was measured by WHOQOL-BREF. A linear OLS regression model with non-parametric standard errors, a log-transformed OLS model and a generalized linear model with a log-link and a gamma distribution were used to estimate service costs. For the estimation of robust non-parametric standard errors, the variance estimator by White and a bootstrap estimator based on 2000 replications were employed. Models were evaluated by the comparison of the R2 and the root mean squared error (RMSE). RMSE of the log-transformed OLS model was computed with three different methods of bias-correction. The 95% confidence intervals for the differences between the RMSE were computed by means of bootstrapping. A split-sample-cross-validation procedure was used to forecast the costs for the one half of the sample on the basis of a regression equation computed for the other half of the sample. All three methods showed significant positive influences of psychiatric symptoms and met psychiatric service needs on service costs. Only the log- transformed OLS model showed a significant negative impact of age, and only the GLM shows a significant negative influences of employment status and partnership on costs. All three models provided a R2 of about.31. The Residuals of the linear OLS model revealed significant deviances from normality and homoscedasticity. The residuals of the log-transformed model are normally distributed but still heteroscedastic. The linear OLS model provided the lowest prediction error and the best forecast of the dependent cost variable. The log-transformed model provided the lowest RMSE if the heteroscedastic bias correction was used. The RMSE of the GLM with a log link and a gamma distribution was higher than those of the linear OLS model and the log-transformed OLS model. The difference between the RMSE of the linear OLS model and that of the log-transformed OLS model without bias correction was significant at the 95% level. As result of the cross-validation procedure, the linear OLS model provided the lowest RMSE followed by the log-transformed OLS model with a heteroscedastic bias correction. The GLM showed the weakest model fit again. None of the differences between the RMSE resulting form the cross- validation procedure were found to be significant. The comparison of the fit indices of the different regression models revealed that the linear OLS model provided a better fit than the log-transformed model and the GLM, but the differences between the models RMSE were not significant. Due to the small number of cases in the study the lack of significance does not sufficiently proof that the differences between the RSME for the different models are zero and the superiority of the linear OLS model can not be generalized. The lack of significant differences among the alternative estimators may reflect a lack of sample size adequate to detect important differences among the estimators employed. Further studies with larger case number are necessary to confirm the results. Specification of an adequate regression models requires a careful examination of the characteristics of the data. Estimation of standard errors and confidence intervals by nonparametric methods which are robust against deviations from the normal distribution and the homoscedasticity of residuals are suitable alternatives to the transformation of the skew distributed dependent variable. Further studies with more adequate case numbers are needed to confirm the results.

  16. On the issues of probability distribution of GPS carrier phase observations

    NASA Astrophysics Data System (ADS)

    Luo, X.; Mayer, M.; Heck, B.

    2009-04-01

    In common practice the observables related to Global Positioning System (GPS) are assumed to follow a Gauss-Laplace normal distribution. Actually, full knowledge of the observables' distribution is not required for parameter estimation by means of the least-squares algorithm based on the functional relation between observations and unknown parameters as well as the associated variance-covariance matrix. However, the probability distribution of GPS observations plays a key role in procedures for quality control (e.g. outlier and cycle slips detection, ambiguity resolution) and in reliability-related assessments of the estimation results. Under non-ideal observation conditions with respect to the factors impacting GPS data quality, for example multipath effects and atmospheric delays, the validity of the normal distribution postulate of GPS observations is in doubt. This paper presents a detailed analysis of the distribution properties of GPS carrier phase observations using double difference residuals. For this purpose 1-Hz observation data from the permanent SAPOS

  17. A statistical approach to estimate the 3D size distribution of spheres from 2D size distributions

    USGS Publications Warehouse

    Kong, M.; Bhattacharya, R.N.; James, C.; Basu, A.

    2005-01-01

    Size distribution of rigidly embedded spheres in a groundmass is usually determined from measurements of the radii of the two-dimensional (2D) circular cross sections of the spheres in random flat planes of a sample, such as in thin sections or polished slabs. Several methods have been devised to find a simple factor to convert the mean of such 2D size distributions to the actual 3D mean size of the spheres without a consensus. We derive an entirely theoretical solution based on well-established probability laws and not constrained by limitations of absolute size, which indicates that the ratio of the means of measured 2D and estimated 3D grain size distribution should be r/4 (=.785). Actual 2D size distribution of the radii of submicron sized, pure Fe0 globules in lunar agglutinitic glass, determined from backscattered electron images, is tested to fit the gamma size distribution model better than the log-normal model. Numerical analysis of 2D size distributions of Fe0 globules in 9 lunar soils shows that the average mean of 2D/3D ratio is 0.84, which is very close to the theoretical value. These results converge with the ratio 0.8 that Hughes (1978) determined for millimeter-sized chondrules from empirical measurements. We recommend that a factor of 1.273 (reciprocal of 0.785) be used to convert the determined 2D mean size (radius or diameter) of a population of spheres to estimate their actual 3D size. ?? 2005 Geological Society of America.

  18. Trending in Probability of Collision Measurements via a Bayesian Zero-Inflated Beta Mixed Model

    NASA Technical Reports Server (NTRS)

    Vallejo, Jonathon; Hejduk, Matt; Stamey, James

    2015-01-01

    We investigate the performance of a generalized linear mixed model in predicting the Probabilities of Collision (Pc) for conjunction events. Specifically, we apply this model to the log(sub 10) transformation of these probabilities and argue that this transformation yields values that can be considered bounded in practice. Additionally, this bounded random variable, after scaling, is zero-inflated. Consequently, we model these values using the zero-inflated Beta distribution, and utilize the Bayesian paradigm and the mixed model framework to borrow information from past and current events. This provides a natural way to model the data and provides a basis for answering questions of interest, such as what is the likelihood of observing a probability of collision equal to the effective value of zero on a subsequent observation.

  19. Learning Probabilities From Random Observables in High Dimensions: The Maximum Entropy Distribution and Others

    NASA Astrophysics Data System (ADS)

    Obuchi, Tomoyuki; Cocco, Simona; Monasson, Rémi

    2015-11-01

    We consider the problem of learning a target probability distribution over a set of N binary variables from the knowledge of the expectation values (with this target distribution) of M observables, drawn uniformly at random. The space of all probability distributions compatible with these M expectation values within some fixed accuracy, called version space, is studied. We introduce a biased measure over the version space, which gives a boost increasing exponentially with the entropy of the distributions and with an arbitrary inverse `temperature' Γ . The choice of Γ allows us to interpolate smoothly between the unbiased measure over all distributions in the version space (Γ =0) and the pointwise measure concentrated at the maximum entropy distribution (Γ → ∞ ). Using the replica method we compute the volume of the version space and other quantities of interest, such as the distance R between the target distribution and the center-of-mass distribution over the version space, as functions of α =(log M)/N and Γ for large N. Phase transitions at critical values of α are found, corresponding to qualitative improvements in the learning of the target distribution and to the decrease of the distance R. However, for fixed α the distance R does not vary with Γ which means that the maximum entropy distribution is not closer to the target distribution than any other distribution compatible with the observable values. Our results are confirmed by Monte Carlo sampling of the version space for small system sizes (N≤ 10).

  20. Distributed Storage Algorithm for Geospatial Image Data Based on Data Access Patterns.

    PubMed

    Pan, Shaoming; Li, Yongkai; Xu, Zhengquan; Chong, Yanwen

    2015-01-01

    Declustering techniques are widely used in distributed environments to reduce query response time through parallel I/O by splitting large files into several small blocks and then distributing those blocks among multiple storage nodes. Unfortunately, however, many small geospatial image data files cannot be further split for distributed storage. In this paper, we propose a complete theoretical system for the distributed storage of small geospatial image data files based on mining the access patterns of geospatial image data using their historical access log information. First, an algorithm is developed to construct an access correlation matrix based on the analysis of the log information, which reveals the patterns of access to the geospatial image data. Then, a practical heuristic algorithm is developed to determine a reasonable solution based on the access correlation matrix. Finally, a number of comparative experiments are presented, demonstrating that our algorithm displays a higher total parallel access probability than those of other algorithms by approximately 10-15% and that the performance can be further improved by more than 20% by simultaneously applying a copy storage strategy. These experiments show that the algorithm can be applied in distributed environments to help realize parallel I/O and thereby improve system performance.

  1. Understanding star formation in molecular clouds. II. Signatures of gravitational collapse of IRDCs

    NASA Astrophysics Data System (ADS)

    Schneider, N.; Csengeri, T.; Klessen, R. S.; Tremblin, P.; Ossenkopf, V.; Peretto, N.; Simon, R.; Bontemps, S.; Federrath, C.

    2015-06-01

    We analyse column density and temperature maps derived from Herschel dust continuum observations of a sample of prominent, massive infrared dark clouds (IRDCs) i.e. G11.11-0.12, G18.82-0.28, G28.37+0.07, and G28.53-0.25. We disentangle the velocity structure of the clouds using 13CO 1→0 and 12CO 3→2 data, showing that these IRDCs are the densest regions in massive giant molecular clouds (GMCs) and not isolated features. The probability distribution function (PDF) of column densities for all clouds have a power-law distribution over all (high) column densities, regardless of the evolutionary stage of the cloud: G11.11-0.12, G18.82-0.28, and G28.37+0.07 contain (proto)-stars, while G28.53-0.25 shows no signs of star formation. This is in contrast to the purely log-normal PDFs reported for near and/or mid-IR extinction maps. We only find a log-normal distribution for lower column densities, if we perform PDFs of the column density maps of the whole GMC in which the IRDCs are embedded. By comparing the PDF slope and the radial column density profile of three of our clouds, we attribute the power law to the effect of large-scale gravitational collapse and to local free-fall collapse of pre- and protostellar cores for the highest column densities. A significant impact on the cloud properties from radiative feedback is unlikely because the clouds are mostly devoid of star formation. Independent from the PDF analysis, we find infall signatures in the spectral profiles of 12CO for G28.37+0.07 and G11.11-0.12, supporting the scenario of gravitational collapse. Our results are in line with earlier interpretations that see massive IRDCs as the densest regions within GMCs, which may be the progenitors of massive stars or clusters. At least some of the IRDCs are probably the same features as ridges (high column density regions with N> 1023 cm-2 over small areas), which were defined for nearby IR-bright GMCs. Because IRDCs are only confined to the densest (gravity dominated) cloud regions, the PDF constructed from this kind of a clipped image does not represent the (turbulence dominated) low column density regime of the cloud. The column density maps (FITS files) are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/578/A29

  2. Performance of synchronous optical receivers using atmospheric compensation techniques.

    PubMed

    Belmonte, Aniceto; Khan, Joseph

    2008-09-01

    We model the impact of atmospheric turbulence-induced phase and amplitude fluctuations on free-space optical links using synchronous detection. We derive exact expressions for the probability density function of the signal-to-noise ratio in the presence of turbulence. We consider the effects of log-normal amplitude fluctuations and Gaussian phase fluctuations, in addition to local oscillator shot noise, for both passive receivers and those employing active modal compensation of wave-front phase distortion. We compute error probabilities for M-ary phase-shift keying, and evaluate the impact of various parameters, including the ratio of receiver aperture diameter to the wave-front coherence diameter, and the number of modes compensated.

  3. The Universal Statistical Distributions of the Affinity, Equilibrium Constants, Kinetics and Specificity in Biomolecular Recognition

    PubMed Central

    Zheng, Xiliang; Wang, Jin

    2015-01-01

    We uncovered the universal statistical laws for the biomolecular recognition/binding process. We quantified the statistical energy landscapes for binding, from which we can characterize the distributions of the binding free energy (affinity), the equilibrium constants, the kinetics and the specificity by exploring the different ligands binding with a particular receptor. The results of the analytical studies are confirmed by the microscopic flexible docking simulations. The distribution of binding affinity is Gaussian around the mean and becomes exponential near the tail. The equilibrium constants of the binding follow a log-normal distribution around the mean and a power law distribution in the tail. The intrinsic specificity for biomolecular recognition measures the degree of discrimination of native versus non-native binding and the optimization of which becomes the maximization of the ratio of the free energy gap between the native state and the average of non-native states versus the roughness measured by the variance of the free energy landscape around its mean. The intrinsic specificity obeys a Gaussian distribution near the mean and an exponential distribution near the tail. Furthermore, the kinetics of binding follows a log-normal distribution near the mean and a power law distribution at the tail. Our study provides new insights into the statistical nature of thermodynamics, kinetics and function from different ligands binding with a specific receptor or equivalently specific ligand binding with different receptors. The elucidation of distributions of the kinetics and free energy has guiding roles in studying biomolecular recognition and function through small-molecule evolution and chemical genetics. PMID:25885453

  4. ELISPOTs Produced by CD8 and CD4 Cells Follow Log Normal Size Distribution Permitting Objective Counting

    PubMed Central

    Karulin, Alexey Y.; Karacsony, Kinga; Zhang, Wenji; Targoni, Oleg S.; Moldovan, Ioana; Dittrich, Marcus; Sundararaman, Srividya; Lehmann, Paul V.

    2015-01-01

    Each positive well in ELISPOT assays contains spots of variable sizes that can range from tens of micrometers up to a millimeter in diameter. Therefore, when it comes to counting these spots the decision on setting the lower and the upper spot size thresholds to discriminate between non-specific background noise, spots produced by individual T cells, and spots formed by T cell clusters is critical. If the spot sizes follow a known statistical distribution, precise predictions on minimal and maximal spot sizes, belonging to a given T cell population, can be made. We studied the size distributional properties of IFN-γ, IL-2, IL-4, IL-5 and IL-17 spots elicited in ELISPOT assays with PBMC from 172 healthy donors, upon stimulation with 32 individual viral peptides representing defined HLA Class I-restricted epitopes for CD8 cells, and with protein antigens of CMV and EBV activating CD4 cells. A total of 334 CD8 and 80 CD4 positive T cell responses were analyzed. In 99.7% of the test cases, spot size distributions followed Log Normal function. These data formally demonstrate that it is possible to establish objective, statistically validated parameters for counting T cell ELISPOTs. PMID:25612115

  5. A cross-site comparison of methods used for hydrogeologic characterization of the Galena-Platteville aquifer in Illinois and Wisconsin, with examples from selected Superfund sites

    USGS Publications Warehouse

    Kay, Robert T.; Mills, Patrick C.; Dunning, Charles P.; Yeskis, Douglas J.; Ursic, James R.; Vendl, Mark

    2004-01-01

    The effectiveness of 28 methods used to characterize the fractured Galena-Platteville aquifer at eight sites in northern Illinois and Wisconsin is evaluated. Analysis of government databases, previous investigations, topographic maps, aerial photographs, and outcrops was essential to understanding the hydrogeology in the area to be investigated. The effectiveness of surface-geophysical methods depended on site geology. Lithologic logging provided essential information for site characterization. Cores were used for stratigraphy and geotechnical analysis. Natural-gamma logging helped identify the effect of lithology on the location of secondary- permeability features. Caliper logging identified large secondary-permeability features. Neutron logs identified trends in matrix porosity. Acoustic-televiewer logs identified numerous secondary-permeability features and their orientation. Borehole-camera logs also identified a number of secondary-permeability features. Borehole ground-penetrating radar identified lithologic and secondary-permeability features. However, the accuracy and completeness of this method is uncertain. Single-point-resistance, density, and normal resistivity logs were of limited use. Water-level and water-quality data identified flow directions and indicated the horizontal and vertical distribution of aquifer permeability and the depth of the permeable features. Temperature, spontaneous potential, and fluid-resistivity logging identified few secondary-permeability features at some sites and several features at others. Flowmeter logging was the most effective geophysical method for characterizing secondary-permeability features. Aquifer tests provided insight into the permeability distribution, identified hydraulically interconnected features, the presence of heterogeneity and anisotropy, and determined effective porosity. Aquifer heterogeneity prevented calculation of accurate hydraulic properties from some tests. Different methods, such as flowmeter logging and slug testing, occasionally produced different interpretations. Aquifer characterization improved with an increase in the number of data points, the period of data collection, and the number of methods used.

  6. Finite element model updating using the shadow hybrid Monte Carlo technique

    NASA Astrophysics Data System (ADS)

    Boulkaibet, I.; Mthembu, L.; Marwala, T.; Friswell, M. I.; Adhikari, S.

    2015-02-01

    Recent research in the field of finite element model updating (FEM) advocates the adoption of Bayesian analysis techniques to dealing with the uncertainties associated with these models. However, Bayesian formulations require the evaluation of the Posterior Distribution Function which may not be available in analytical form. This is the case in FEM updating. In such cases sampling methods can provide good approximations of the Posterior distribution when implemented in the Bayesian context. Markov Chain Monte Carlo (MCMC) algorithms are the most popular sampling tools used to sample probability distributions. However, the efficiency of these algorithms is affected by the complexity of the systems (the size of the parameter space). The Hybrid Monte Carlo (HMC) offers a very important MCMC approach to dealing with higher-dimensional complex problems. The HMC uses the molecular dynamics (MD) steps as the global Monte Carlo (MC) moves to reach areas of high probability where the gradient of the log-density of the Posterior acts as a guide during the search process. However, the acceptance rate of HMC is sensitive to the system size as well as the time step used to evaluate the MD trajectory. To overcome this limitation we propose the use of the Shadow Hybrid Monte Carlo (SHMC) algorithm. The SHMC algorithm is a modified version of the Hybrid Monte Carlo (HMC) and designed to improve sampling for large-system sizes and time steps. This is done by sampling from a modified Hamiltonian function instead of the normal Hamiltonian function. In this paper, the efficiency and accuracy of the SHMC method is tested on the updating of two real structures; an unsymmetrical H-shaped beam structure and a GARTEUR SM-AG19 structure and is compared to the application of the HMC algorithm on the same structures.

  7. Random Partition Distribution Indexed by Pairwise Information

    PubMed Central

    Dahl, David B.; Day, Ryan; Tsai, Jerry W.

    2017-01-01

    We propose a random partition distribution indexed by pairwise similarity information such that partitions compatible with the similarities are given more probability. The use of pairwise similarities, in the form of distances, is common in some clustering algorithms (e.g., hierarchical clustering), but we show how to use this type of information to define a prior partition distribution for flexible Bayesian modeling. A defining feature of the distribution is that it allocates probability among partitions within a given number of subsets, but it does not shift probability among sets of partitions with different numbers of subsets. Our distribution places more probability on partitions that group similar items yet keeps the total probability of partitions with a given number of subsets constant. The distribution of the number of subsets (and its moments) is available in closed-form and is not a function of the similarities. Our formulation has an explicit probability mass function (with a tractable normalizing constant) so the full suite of MCMC methods may be used for posterior inference. We compare our distribution with several existing partition distributions, showing that our formulation has attractive properties. We provide three demonstrations to highlight the features and relative performance of our distribution. PMID:29276318

  8. Transmitting Information by Propagation in an Ocean Waveguide: Computation of Acoustic Field Capacity

    DTIC Science & Technology

    2015-06-17

    progress, Eq. (4) is evaluated in terms of the differential entropy h. The integrals can be identified as differential entropy terms by expanding the log...all ran- dom vectors p with a given covariance matrix, the entropy of p is maximized when p is ZMCSCG since a normal distribution maximizes the... entropy over all distributions with the same covariance [9, 18], implying that this is the optimal distribution on s as well. In addition, of all the

  9. The influence of local majority opinions on the dynamics of the Sznajd model

    NASA Astrophysics Data System (ADS)

    Crokidakis, Nuno

    2014-03-01

    In this work we study a Sznajd-like opinion dynamics on a square lattice of linear size L. For this purpose, we consider that each agent has a convincing power C, that is a time-dependent quantity. Each high convincing power group of four agents sharing the same opinion may convince its neighbors to follow the group opinion, which induces an increase of the group's convincing power. In addition, we have considered that a group with a local majority opinion (3 up/1 down spins or 1 up/3 down spins) can persuade the agents neighboring the group with probability p, since the group's convincing power is high enough. The two mechanisms (convincing powers and probability p) lead to an increase of the competition among the opinions, which avoids dictatorship (full consensus, all spins parallel) for a wide range of model's parameters, and favors the occurrence of democratic states (partial order, the majority of spins pointing in one direction). We have found that the relaxation times of the model follow log-normal distributions, and that the average relaxation time τ grows with system size as τ ~ L5/2, independent of p. We also discuss the occurrence of the usual phase transition of the Sznajd model.

  10. Oil & Natural Gas Technology A new approach to understanding the occurrence and volume of natural gas hydrate in the northern Gulf of Mexico using petroleum industry well logs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cook, Ann; Majumdar, Urmi

    The northern Gulf of Mexico has been the target for the petroleum industry for exploration of conventional energy resource for decades. We have used the rich existing petroleum industry well logs to find the occurrences of natural gas hydrate in the northern Gulf of Mexico. We have identified 798 wells with well log data within the gas hydrate stability zone. Out of those 798 wells, we have found evidence of gas hydrate in well logs in 124 wells (15% of wells). We have built a dataset of gas hydrate providing information such as location, interval of hydrate occurrence (if any)more » and the overall quality of probable gas hydrate. Our dataset provides a wide, new perspective on the overall distribution of gas hydrate in the northern Gulf of Mexico and will be the key to future gas hydrate research and prospecting in the area.« less

  11. Assessment of Methane Emissions from Oil and Gas Production Pads using Mobile Measurements

    EPA Science Inventory

    Journal Article Abstract --- "A mobile source inspection approach called OTM 33A was used to quantify short-term methane emission rates from 218 oil and gas production pads in Texas, Colorado, and Wyoming from 2010 to 2013. The emission rates were log-normally distributed with ...

  12. Evaluation of waste mushroom logs as a potential biomass resource for the production of bioethanol.

    PubMed

    Lee, Jae-Won; Koo, Bon-Wook; Choi, Joon-Weon; Choi, Don-Ha; Choi, In-Gyu

    2008-05-01

    In order to investigate the possibility of using waste mushroom logs as a biomass resource for alternative energy production, the chemical and physical characteristics of normal wood and waste mushroom logs were examined. Size reduction of normal wood (145 kW h/tone) required significantly higher energy consumption than waste mushroom logs (70 kW h/tone). The crystallinity value of waste mushroom logs was dramatically lower (33%) than normal wood (49%) after cultivation by Lentinus edodes as spawn. Lignin, an enzymatic hydrolysis inhibitor in sugar production, decreased from 21.07% to 18.78% after inoculation of L. edodes. Total sugar yields obtained by enzyme and acid hydrolysis were higher in waste mushroom logs than in normal wood. After 24h fermentation, 12 g/L ethanol was produced on waste mushroom logs, while normal wood produced 8 g/L ethanol. These results indicate that waste mushroom logs are economically suitable lignocellulosic material for the production of fermentable sugars related to bioethanol production.

  13. Quantitative Microbial Risk Assessment for Clostridium perfringens in Natural and Processed Cheeses

    PubMed Central

    Lee, Heeyoung; Lee, Soomin; Kim, Sejeong; Lee, Jeeyeon; Ha, Jimyeong; Yoon, Yohan

    2016-01-01

    This study evaluated the risk of Clostridium perfringens (C. perfringens) foodborne illness from natural and processed cheeses. Microbial risk assessment in this study was conducted according to four steps: hazard identification, hazard characterization, exposure assessment, and risk characterization. The hazard identification of C. perfringens on cheese was identified through literature, and dose response models were utilized for hazard characterization of the pathogen. For exposure assessment, the prevalence of C. perfringens, storage temperatures, storage time, and annual amounts of cheese consumption were surveyed. Eventually, a simulation model was developed using the collected data and the simulation result was used to estimate the probability of C. perfringens foodborne illness by cheese consumption with @RISK. C. perfringens was determined to be low risk on cheese based on hazard identification, and the exponential model (r = 1.82×10−11) was deemed appropriate for hazard characterization. Annual amounts of natural and processed cheese consumption were 12.40±19.43 g and 19.46±14.39 g, respectively. Since the contamination levels of C. perfringens on natural (0.30 Log CFU/g) and processed cheeses (0.45 Log CFU/g) were below the detection limit, the initial contamination levels of natural and processed cheeses were estimated by beta distribution (α1 = 1, α2 = 91; α1 = 1, α2 = 309)×uniform distribution (a = 0, b = 2; a = 0, b = 2.8) to be −2.35 and −2.73 Log CFU/g, respectively. Moreover, no growth of C. perfringens was observed for exposure assessment to simulated conditions of distribution and storage. These data were used for risk characterization by a simulation model, and the mean values of the probability of C. perfringens foodborne illness by cheese consumption per person per day for natural and processed cheeses were 9.57×10−14 and 3.58×10−14, respectively. These results indicate that probability of C. perfringens foodborne illness by consumption cheese is low, and it can be used to establish microbial criteria for C. perfringens on natural and processed cheeses. PMID:26954204

  14. Exact Scheffé-type confidence intervals for output from groundwater flow models: 1. Use of hydrogeologic information

    USGS Publications Warehouse

    Cooley, Richard L.

    1993-01-01

    A new method is developed to efficiently compute exact Scheffé-type confidence intervals for output (or other function of parameters) g(β) derived from a groundwater flow model. The method is general in that parameter uncertainty can be specified by any statistical distribution having a log probability density function (log pdf) that can be expanded in a Taylor series. However, for this study parameter uncertainty is specified by a statistical multivariate beta distribution that incorporates hydrogeologic information in the form of the investigator's best estimates of parameters and a grouping of random variables representing possible parameter values so that each group is defined by maximum and minimum bounds and an ordering according to increasing value. The new method forms the confidence intervals from maximum and minimum limits of g(β) on a contour of a linear combination of (1) the quadratic form for the parameters used by Cooley and Vecchia (1987) and (2) the log pdf for the multivariate beta distribution. Three example problems are used to compare characteristics of the confidence intervals for hydraulic head obtained using different weights for the linear combination. Different weights generally produced similar confidence intervals, whereas the method of Cooley and Vecchia (1987) often produced much larger confidence intervals.

  15. High throughput nonparametric probability density estimation.

    PubMed

    Farmer, Jenny; Jacobs, Donald

    2018-01-01

    In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample size invariant universal scoring function. Then a probability density estimate is determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under and over fitting data as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic to visualize the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate the method has general applicability for high throughput statistical inference.

  16. High throughput nonparametric probability density estimation

    PubMed Central

    Farmer, Jenny

    2018-01-01

    In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample size invariant universal scoring function. Then a probability density estimate is determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under and over fitting data as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic to visualize the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate the method has general applicability for high throughput statistical inference. PMID:29750803

  17. The word frequency effect during sentence reading: A linear or nonlinear effect of log frequency?

    PubMed

    White, Sarah J; Drieghe, Denis; Liversedge, Simon P; Staub, Adrian

    2016-10-20

    The effect of word frequency on eye movement behaviour during reading has been reported in many experimental studies. However, the vast majority of these studies compared only two levels of word frequency (high and low). Here we assess whether the effect of log word frequency on eye movement measures is linear, in an experiment in which a critical target word in each sentence was at one of three approximately equally spaced log frequency levels. Separate analyses treated log frequency as a categorical or a continuous predictor. Both analyses showed only a linear effect of log frequency on the likelihood of skipping a word, and on first fixation duration. Ex-Gaussian analyses of first fixation duration showed similar effects on distributional parameters in comparing high- and medium-frequency words, and medium- and low-frequency words. Analyses of gaze duration and the probability of a refixation suggested a nonlinear pattern, with a larger effect at the lower end of the log frequency scale. However, the nonlinear effects were small, and Bayes Factor analyses favoured the simpler linear models for all measures. The possible roles of lexical and post-lexical factors in producing nonlinear effects of log word frequency during sentence reading are discussed.

  18. Multinomial Logistic Regression & Bootstrapping for Bayesian Estimation of Vertical Facies Prediction in Heterogeneous Sandstone Reservoirs

    NASA Astrophysics Data System (ADS)

    Al-Mudhafar, W. J.

    2013-12-01

    Precisely prediction of rock facies leads to adequate reservoir characterization by improving the porosity-permeability relationships to estimate the properties in non-cored intervals. It also helps to accurately identify the spatial facies distribution to perform an accurate reservoir model for optimal future reservoir performance. In this paper, the facies estimation has been done through Multinomial logistic regression (MLR) with respect to the well logs and core data in a well in upper sandstone formation of South Rumaila oil field. The entire independent variables are gamma rays, formation density, water saturation, shale volume, log porosity, core porosity, and core permeability. Firstly, Robust Sequential Imputation Algorithm has been considered to impute the missing data. This algorithm starts from a complete subset of the dataset and estimates sequentially the missing values in an incomplete observation by minimizing the determinant of the covariance of the augmented data matrix. Then, the observation is added to the complete data matrix and the algorithm continues with the next observation with missing values. The MLR has been chosen to estimate the maximum likelihood and minimize the standard error for the nonlinear relationships between facies & core and log data. The MLR is used to predict the probabilities of the different possible facies given each independent variable by constructing a linear predictor function having a set of weights that are linearly combined with the independent variables by using a dot product. Beta distribution of facies has been considered as prior knowledge and the resulted predicted probability (posterior) has been estimated from MLR based on Baye's theorem that represents the relationship between predicted probability (posterior) with the conditional probability and the prior knowledge. To assess the statistical accuracy of the model, the bootstrap should be carried out to estimate extra-sample prediction error by randomly drawing datasets with replacement from the training data. Each sample has the same size of the original training set and it can be conducted N times to produce N bootstrap datasets to re-fit the model accordingly to decrease the squared difference between the estimated and observed categorical variables (facies) leading to decrease the degree of uncertainty.

  19. Stochastic analysis of the efficiency of coupled hydraulic-physical barriers to contain solute plumes in highly heterogeneous aquifers

    NASA Astrophysics Data System (ADS)

    Pedretti, Daniele; Masetti, Marco; Beretta, Giovanni Pietro

    2017-10-01

    The expected long-term efficiency of vertical cutoff walls coupled to pump-and-treat technologies to contain solute plumes in highly heterogeneous aquifers was analyzed. A well-characterized case study in Italy, with a hydrogeological database of 471 results from hydraulic tests performed on the aquifer and the surrounding 2-km-long cement-bentonite (CB) walls, was used to build a conceptual model and assess a representative remediation site adopting coupled technologies. In the studied area, the aquifer hydraulic conductivity Ka [m/d] is log-normally distributed with mean E (Ya) = 0.32 , variance σYa2 = 6.36 (Ya = lnKa) and spatial correlation well described by an exponential isotropic variogram with integral scale less than 1/12 the domain size. The hardened CB wall's hydraulic conductivity, Kw [m/d], displayed strong scaling effects and a lognormal distribution with mean E (Yw) = - 3.43 and σYw2 = 0.53 (Yw =log10Kw). No spatial correlation of Kw was detected. Using this information, conservative transport was simulated across a CB wall in spatially correlated 1-D random Ya fields within a numerical Monte Carlo framework. Multiple scenarios representing different Kw values were tested. A continuous solute source with known concentration and deterministic drains' discharge rates were assumed. The efficiency of the confining system was measured by the probability of exceedance of concentration over a threshold (C∗) at a control section 10 years after the initial solute release. It was found that the stronger the aquifer heterogeneity, the higher the expected efficiency of the confinement system and the lower the likelihood of aquifer pollution. This behavior can be explained because, for the analyzed aquifer conditions, a lower Ka generates more pronounced drawdown in the water table in the proximity of the drain and consequently a higher advective flux towards the confined area, which counteracts diffusive fluxes across the walls. Thus, a higher σYa2 results in a larger amount of low Ka values in the proximity of the drain, and a higher probability of not exceeding C∗ .

  20. On the use of log-transformation vs. nonlinear regression for analyzing biological power laws.

    PubMed

    Xiao, Xiao; White, Ethan P; Hooten, Mevin B; Durham, Susan L

    2011-10-01

    Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain.

  1. A log-normal distribution model for the molecular weight of aquatic fulvic acids

    USGS Publications Warehouse

    Cabaniss, S.E.; Zhou, Q.; Maurice, P.A.; Chin, Y.-P.; Aiken, G.R.

    2000-01-01

    The molecular weight of humic substances influences their proton and metal binding, organic pollutant partitioning, adsorption onto minerals and activated carbon, and behavior during water treatment. We propose a lognormal model for the molecular weight distribution in aquatic fulvic acids to provide a conceptual framework for studying these size effects. The normal curve mean and standard deviation are readily calculated from measured M(n) and M(w) and vary from 2.7 to 3 for the means and from 0.28 to 0.37 for the standard deviations for typical aquatic fulvic acids. The model is consistent with several types of molecular weight data, including the shapes of high- pressure size-exclusion chromatography (HP-SEC) peaks. Applications of the model to electrostatic interactions, pollutant solubilization, and adsorption are explored in illustrative calculations.The molecular weight of humic substances influences their proton and metal binding, organic pollutant partitioning, adsorption onto minerals and activated carbon, and behavior during water treatment. We propose a log-normal model for the molecular weight distribution in aquatic fulvic acids to provide a conceptual framework for studying these size effects. The normal curve mean and standard deviation are readily calculated from measured Mn and Mw and vary from 2.7 to 3 for the means and from 0.28 to 0.37 for the standard deviations for typical aquatic fulvic acids. The model is consistent with several type's of molecular weight data, including the shapes of high-pressure size-exclusion chromatography (HP-SEC) peaks. Applications of the model to electrostatic interactions, pollutant solubilization, and adsorption are explored in illustrative calculations.

  2. Type I error rates of rare single nucleotide variants are inflated in tests of association with non-normally distributed traits using simple linear regression methods.

    PubMed

    Schwantes-An, Tae-Hwi; Sung, Heejong; Sabourin, Jeremy A; Justice, Cristina M; Sorant, Alexa J M; Wilson, Alexander F

    2016-01-01

    In this study, the effects of (a) the minor allele frequency of the single nucleotide variant (SNV), (b) the degree of departure from normality of the trait, and (c) the position of the SNVs on type I error rates were investigated in the Genetic Analysis Workshop (GAW) 19 whole exome sequence data. To test the distribution of the type I error rate, 5 simulated traits were considered: standard normal and gamma distributed traits; 2 transformed versions of the gamma trait (log 10 and rank-based inverse normal transformations); and trait Q1 provided by GAW 19. Each trait was tested with 313,340 SNVs. Tests of association were performed with simple linear regression and average type I error rates were determined for minor allele frequency classes. Rare SNVs (minor allele frequency < 0.05) showed inflated type I error rates for non-normally distributed traits that increased as the minor allele frequency decreased. The inflation of average type I error rates increased as the significance threshold decreased. Normally distributed traits did not show inflated type I error rates with respect to the minor allele frequency for rare SNVs. There was no consistent effect of transformation on the uniformity of the distribution of the location of SNVs with a type I error.

  3. Mixed effects modelling for glass category estimation from glass refractive indices.

    PubMed

    Lucy, David; Zadora, Grzegorz

    2011-10-10

    520 Glass fragments were taken from 105 glass items. Each item was either a container, a window, or glass from an automobile. Each of these three classes of use are defined as glass categories. Refractive indexes were measured both before, and after a programme of re-annealing. Because the refractive index of each fragment could not in itself be observed before and after re-annealing, a model based approach was used to estimate the change in refractive index for each glass category. It was found that less complex estimation methods would be equivalent to the full model, and were subsequently used. The change in refractive index was then used to calculate a measure of the evidential value for each item belonging to each glass category. The distributions of refractive index change were considered for each glass category, and it was found that, possibly due to small samples, members of the normal family would not adequately model the refractive index changes within two of the use types considered here. Two alternative approaches to modelling the change in refractive index were used, one employed more established kernel density estimates, the other a newer approach called log-concave estimation. Either method when applied to the change in refractive index was found to give good estimates of glass category, however, on all performance metrics kernel density estimates were found to be slightly better than log-concave estimates, although the estimates from log-concave estimation prossessed properties which had some qualitative appeal not encapsulated in the selected measures of performance. These results and implications of these two methods of estimating probability densities for glass refractive indexes are discussed. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  4. Role of Demographic Dynamics and Conflict in the Population-Area Relationship for Human Languages

    PubMed Central

    Manrubia, Susanna C.; Axelsen, Jacob B.; Zanette, Damián H.

    2012-01-01

    Many patterns displayed by the distribution of human linguistic groups are similar to the ecological organization described for biological species. It remains a challenge to identify simple and meaningful processes that describe these patterns. The population size distribution of human linguistic groups, for example, is well fitted by a log-normal distribution that may arise from stochastic demographic processes. As we show in this contribution, the distribution of the area size of home ranges of those groups also agrees with a log-normal function. Further, size and area are significantly correlated: the number of speakers and the area spanned by linguistic groups follow the allometric relation , with an exponent varying accross different world regions. The empirical evidence presented leads to the hypothesis that the distributions of and , and their mutual dependence, rely on demographic dynamics and on the result of conflicts over territory due to group growth. To substantiate this point, we introduce a two-variable stochastic multiplicative model whose analytical solution recovers the empirical observations. Applied to different world regions, the model reveals that the retreat in home range is sublinear with respect to the decrease in population size, and that the population-area exponent grows with the typical strength of conflicts. While the shape of the population size and area distributions, and their allometric relation, seem unavoidable outcomes of demography and inter-group contact, the precise value of could give insight on the cultural organization of those human groups in the last thousand years. PMID:22815726

  5. On the distribution of scaling hydraulic parameters in a spatially anisotropic banana field

    NASA Astrophysics Data System (ADS)

    Regalado, Carlos M.

    2005-06-01

    When modeling soil hydraulic properties at field scale it is desirable to approximate the variability in a given area by means of some scaling transformations which relate spatially variable local hydraulic properties to global reference characteristics. Seventy soil cores were sampled within a drip irrigated banana plantation greenhouse on a 14×5 array of 2.5 m×5 m rectangles at 15 cm depth, to represent the field scale variability of flow related properties. Saturated hydraulic conductivity and water retention characteristics were measured in these 70 soil cores. van Genuchten water retention curves (WRC) with optimized m ( m≠1-1/ n) were fitted to the WR data and a general Mualem-van Genuchten model was used to predict hydraulic conductivity functions for each soil core. A scaling law, of the form ν=ανi*, was fitted to soil hydraulic data, such that the original hydraulic parameters νi were scaled down to a reference curve with parameters νi*. An analytical expression, in terms of Beta functions, for the average suction value, hc, necessary to apply the above scaling method, was obtained. A robust optimization procedure with fast convergence to the global minimum is used to find the optimum hc, such that dispersion is minimized in the scaled data set. Via the Box-Cox transformation P(τ)=(αiτ-1)/τ, Box-Cox normality plots showed that scaling factors for the suction ( αh) and hydraulic conductivity ( αk) were approximately log-normally distributed (i.e. τ=0), as it would be expected for such dynamic properties involving flow. By contrast static soil related properties as αθ were found closely Gaussian, although a power τ=3/4 was best for approaching normality. Application of four different normality tests (Anderson-Darling, Shapiro-Wilk, Kolmogorov-Smirnov and χ2 goodness-of-fit tests) rendered some contradictory results among them, thus suggesting that this widely extended practice is not recommended for providing a suitable probability density function for the scaling parameters, αi. Some indications for the origin of these disagreements, in terms of population size and test constraints, are pointed out. Visual inspection of normal probability plots can also lead to erroneous results. The scaling parameters αθ and αK show a sinusoidal spatial variation coincident with the underlying alignment of banana plants on the field. Such anisotropic distribution is explained in terms of porosity variations due to processes promoting soil degradation as surface desiccation and soil compaction, induced by tillage and localized irrigation of banana plants, and it is quantified by means of cross-correlograms.

  6. Fatigue Shifts and Scatters Heart Rate Variability in Elite Endurance Athletes

    PubMed Central

    Schmitt, Laurent; Regnard, Jacques; Desmarets, Maxime; Mauny, Fréderic; Mourot, Laurent; Fouillot, Jean-Pierre; Coulmy, Nicolas; Millet, Grégoire

    2013-01-01

    Purpose This longitudinal study aimed at comparing heart rate variability (HRV) in elite athletes identified either in ‘fatigue’ or in ‘no-fatigue’ state in ‘real life’ conditions. Methods 57 elite Nordic-skiers were surveyed over 4 years. R-R intervals were recorded supine (SU) and standing (ST). A fatigue state was quoted with a validated questionnaire. A multilevel linear regression model was used to analyze relationships between heart rate (HR) and HRV descriptors [total spectral power (TP), power in low (LF) and high frequency (HF) ranges expressed in ms2 and normalized units (nu)] and the status without and with fatigue. The variables not distributed normally were transformed by taking their common logarithm (log10). Results 172 trials were identified as in a ‘fatigue’ and 891 as in ‘no-fatigue’ state. All supine HR and HRV parameters (Beta±SE) were significantly different (P<0.0001) between ‘fatigue’ and ‘no-fatigue’: HRSU (+6.27±0.61 bpm), logTPSU (−0.36±0.04), logLFSU (−0.27±0.04), logHFSU (−0.46±0.05), logLF/HFSU (+0.19±0.03), HFSU(nu) (−9.55±1.33). Differences were also significant (P<0.0001) in standing: HRST (+8.83±0.89), logTPST (−0.28±0.03), logLFST (−0.29±0.03), logHFST (−0.32±0.04). Also, intra-individual variance of HRV parameters was larger (P<0.05) in the ‘fatigue’ state (logTPSU: 0.26 vs. 0.07, logLFSU: 0.28 vs. 0.11, logHFSU: 0.32 vs. 0.08, logTPST: 0.13 vs. 0.07, logLFST: 0.16 vs. 0.07, logHFST: 0.25 vs. 0.14). Conclusion HRV was significantly lower in 'fatigue' vs. 'no-fatigue' but accompanied with larger intra-individual variance of HRV parameters in 'fatigue'. The broader intra-individual variance of HRV parameters might encompass different changes from no-fatigue state, possibly reflecting different fatigue-induced alterations of HRV pattern. PMID:23951198

  7. Flowering phenology and its implications for management of big-leaf mahogany Swietenia macrophylla in Brazilian Amazonia.

    PubMed

    Grogan, James; Loveless, Marilyn D

    2013-11-01

    Flowering phenology is a crucial determinant of reproductive success and offspring genetic diversity in plants. We measure the flowering phenology of big-leaf mahogany (Swietenia macrophylla, Meliaceae), a widely distributed neotropical tree, and explore how disturbance from logging impacts its reproductive biology. We use a crown scoring system to estimate the timing and duration of population-level flowering at three forest sites in the Brazilian Amazon over a five-year period. We combine this information with data on population structure and spatial distribution to consider the implications of logging for population flowering patterns and reproductive success. Mahogany trees as small as 14 cm diam flowered, but only trees > 30 cm diam flowered annually or supra-annually. Mean observed flowering periods by focal trees ranged from 18-34 d, and trees flowered sequentially during 3-4 mo beginning in the dry season. Focal trees demonstrated significant interannual correlation in flowering order. Estimated population-level flowering schedules resembled that of the focal trees, with temporal isolation between early and late flowering trees. At the principal study site, conventional logging practices eliminated 87% of mahogany trees > 30 cm diam and an estimated 94% of annual pre-logging floral effort. Consistent interannual patterns of sequential flowering among trees create incompletely isolated subpopulations, constraining pollen flow. After harvests, surviving subcommercial trees will have fewer, more distant, and smaller potential partners, with probable consequences for post-logging regeneration. These results have important implications for the sustainability of harvesting systems for tropical timber species.

  8. A new stochastic algorithm for inversion of dust aerosol size distribution

    NASA Astrophysics Data System (ADS)

    Wang, Li; Li, Feng; Yang, Ma-ying

    2015-08-01

    Dust aerosol size distribution is an important source of information about atmospheric aerosols, and it can be determined from multiwavelength extinction measurements. This paper describes a stochastic inverse technique based on artificial bee colony (ABC) algorithm to invert the dust aerosol size distribution by light extinction method. The direct problems for the size distribution of water drop and dust particle, which are the main elements of atmospheric aerosols, are solved by the Mie theory and the Lambert-Beer Law in multispectral region. And then, the parameters of three widely used functions, i.e. the log normal distribution (L-N), the Junge distribution (J-J), and the normal distribution (N-N), which can provide the most useful representation of aerosol size distributions, are inversed by the ABC algorithm in the dependent model. Numerical results show that the ABC algorithm can be successfully applied to recover the aerosol size distribution with high feasibility and reliability even in the presence of random noise.

  9. Prediction of the filtrate particle size distribution from the pore size distribution in membrane filtration: Numerical correlations from computer simulations

    NASA Astrophysics Data System (ADS)

    Marrufo-Hernández, Norma Alejandra; Hernández-Guerrero, Maribel; Nápoles-Duarte, José Manuel; Palomares-Báez, Juan Pedro; Chávez-Rojo, Marco Antonio

    2018-03-01

    We present a computational model that describes the diffusion of a hard spheres colloidal fluid through a membrane. The membrane matrix is modeled as a series of flat parallel planes with circular pores of different sizes and random spatial distribution. This model was employed to determine how the size distribution of the colloidal filtrate depends on the size distributions of both, the particles in the feed and the pores of the membrane, as well as to describe the filtration kinetics. A Brownian dynamics simulation study considering normal distributions was developed in order to determine empirical correlations between the parameters that characterize these distributions. The model can also be extended to other distributions such as log-normal. This study could, therefore, facilitate the selection of membranes for industrial or scientific filtration processes once the size distribution of the feed is known and the expected characteristics in the filtrate have been defined.

  10. Characterizing uncertainty when evaluating risk management metrics: risk assessment modeling of Listeria monocytogenes contamination in ready-to-eat deli meats.

    PubMed

    Gallagher, Daniel; Ebel, Eric D; Gallagher, Owen; Labarre, David; Williams, Michael S; Golden, Neal J; Pouillot, Régis; Dearfield, Kerry L; Kause, Janell

    2013-04-01

    This report illustrates how the uncertainty about food safety metrics may influence the selection of a performance objective (PO). To accomplish this goal, we developed a model concerning Listeria monocytogenes in ready-to-eat (RTE) deli meats. This application used a second order Monte Carlo model that simulates L. monocytogenes concentrations through a series of steps: the food-processing establishment, transport, retail, the consumer's home and consumption. The model accounted for growth inhibitor use, retail cross contamination, and applied an FAO/WHO dose response model for evaluating the probability of illness. An appropriate level of protection (ALOP) risk metric was selected as the average risk of illness per serving across all consumed servings-per-annum and the model was used to solve for the corresponding performance objective (PO) risk metric as the maximum allowable L. monocytogenes concentration (cfu/g) at the processing establishment where regulatory monitoring would occur. Given uncertainty about model inputs, an uncertainty distribution of the PO was estimated. Additionally, we considered how RTE deli meats contaminated at levels above the PO would be handled by the industry using three alternative approaches. Points on the PO distribution represent the probability that - if the industry complies with a particular PO - the resulting risk-per-serving is less than or equal to the target ALOP. For example, assuming (1) a target ALOP of -6.41 log10 risk of illness per serving, (2) industry concentrations above the PO that are re-distributed throughout the remaining concentration distribution and (3) no dose response uncertainty, establishment PO's of -4.98 and -4.39 log10 cfu/g would be required for 90% and 75% confidence that the target ALOP is met, respectively. The PO concentrations from this example scenario are more stringent than the current typical monitoring level of an absence in 25 g (i.e., -1.40 log10 cfu/g) or a stricter criteria of absence in 125 g (i.e., -2.1 log10 cfu/g). This example, and others, demonstrates that a PO for L. monocytogenes would be far below any current monitoring capabilities. Furthermore, this work highlights the demands placed on risk managers and risk assessors when applying uncertain risk models to the current risk metric framework. Copyright © 2013 Elsevier B.V. All rights reserved.

  11. Exact probability distribution function for the volatility of cumulative production

    NASA Astrophysics Data System (ADS)

    Zadourian, Rubina; Klümper, Andreas

    2018-04-01

    In this paper we study the volatility and its probability distribution function for the cumulative production based on the experience curve hypothesis. This work presents a generalization of the study of volatility in Lafond et al. (2017), which addressed the effects of normally distributed noise in the production process. Due to its wide applicability in industrial and technological activities we present here the mathematical foundation for an arbitrary distribution function of the process, which we expect will pave the future research on forecasting of the production process.

  12. Statistical characteristics of mechanical heart valve cavitation in accelerated testing.

    PubMed

    Wu, Changfu; Hwang, Ned H C; Lin, Yu-Kweng M

    2004-07-01

    Cavitation damage has been observed on mechanical heart valves (MHVs) undergoing accelerated testing. Cavitation itself can be modeled as a stochastic process, as it varies from beat to beat of the testing machine. This in-vitro study was undertaken to investigate the statistical characteristics of MHV cavitation. A 25-mm St. Jude Medical bileaflet MHV (SJM 25) was tested in an accelerated tester at various pulse rates, ranging from 300 to 1,000 bpm, with stepwise increments of 100 bpm. A miniature pressure transducer was placed near a leaflet tip on the inflow side of the valve, to monitor regional transient pressure fluctuations at instants of valve closure. The pressure trace associated with each beat was passed through a 70 kHz high-pass digital filter to extract the high-frequency oscillation (HFO) components resulting from the collapse of cavitation bubbles. Three intensity-related measures were calculated for each HFO burst: its time span; its local root-mean-square (LRMS) value; and the area enveloped by the absolute value of the HFO pressure trace and the time axis, referred to as cavitation impulse. These were treated as stochastic processes, of which the first-order probability density functions (PDFs) were estimated for each test rate. Both the LRMS value and cavitation impulse were log-normal distributed, and the time span was normal distributed. These distribution laws were consistent at different test rates. The present investigation was directed at understanding MHV cavitation as a stochastic process. The results provide a basis for establishing further the statistical relationship between cavitation intensity and time-evolving cavitation damage on MHV surfaces. These data are required to assess and compare the performance of MHVs of different designs.

  13. Singing rate and detection probability: an example from the least Bell's Vireo (Vireo belli pusillus)

    Treesearch

    Thomas A. Scott; Pey-Yi Lee; Gregory C. Greene; D. Archibald McCallum

    2005-01-01

    We used 4-hr tape recordings to assess singing activity of a Least Bell's Vireo (Vireo belli pusillus) on 7 d distributed throughout the breeding season of 2001. We logged 9873 songs, for a mean singing rate of 5.84 min-1, but despite this level of activity, the bird was silent during 33 percent of the 1691 minutes of the...

  14. Distribution of leached radioactive material in the Legin Group Area, San Miguel County, Colorado

    USGS Publications Warehouse

    Rogers, Allen S.

    1950-01-01

    Radioactivity anomalies, which are small in magnitude, and probably are not caused by extensions of known uranium-vanadium ore bodies, were detected during the gamma-ray logging of diamond-drill holes in the Legin group of claims, southwest San Miguel County, Colo. The positions of these anomalies are at the top surfaces of mudstone strata within, and at the base of, the ore-bearing sandstone of the Salt Wash member of the Morrison formation. The distribution of these anomalies suggests that ground water has leached radioactive material from the ore bodies and has carried it down dip and laterally along the top surfaces of underlying impermeable mudstone strata for distance as great as 300 feet. The anomalies are probably caused by radon and its daughter elements. Preliminary tests indicate that radon in quantities up to 10-7 curies per liter may be present in ground water flowing along sandstone-mudstone contacts under carnotite ore bodies. In comparison, the radium content of the same water is less than 10-10 curies per liter. Further substantiation of the relationship between ore bodies, the movement of water, and the radon-caused anomalies may greatly increase the scope of gamma-ray logs of drill holes as an aid to prospecting.

  15. Investigation into the Use of Normal and Half-Normal Plots for Interpreting Results from Screening Experiments.

    DTIC Science & Technology

    1987-03-25

    by Lloyd (1952) using generalized least squares instead of ordinary least squares, and by Wilk, % 20 Gnanadesikan , and Freeny (1963) using a maximum...plot. The half-normal distribution is a special case of the gamma distribution proposed by Wilk, Gnanadesikan , and Huyett (1962). VARIATIONS ON THE... Gnanadesikan , R. Probability plotting methods for the analysis of data. Biometrika, 1968, 55, 1-17. This paper describes and discusses graphical techniques

  16. A Gaussian Model-Based Probabilistic Approach for Pulse Transit Time Estimation.

    PubMed

    Jang, Dae-Geun; Park, Seung-Hun; Hahn, Minsoo

    2016-01-01

    In this paper, we propose a new probabilistic approach to pulse transit time (PTT) estimation using a Gaussian distribution model. It is motivated basically by the hypothesis that PTTs normalized by RR intervals follow the Gaussian distribution. To verify the hypothesis, we demonstrate the effects of arterial compliance on the normalized PTTs using the Moens-Korteweg equation. Furthermore, we observe a Gaussian distribution of the normalized PTTs on real data. In order to estimate the PTT using the hypothesis, we first assumed that R-waves in the electrocardiogram (ECG) can be correctly identified. The R-waves limit searching ranges to detect pulse peaks in the photoplethysmogram (PPG) and to synchronize the results with cardiac beats--i.e., the peaks of the PPG are extracted within the corresponding RR interval of the ECG as pulse peak candidates. Their probabilities of being the actual pulse peak are then calculated using a Gaussian probability function. The parameters of the Gaussian function are automatically updated when a new pulse peak is identified. This update makes the probability function adaptive to variations of cardiac cycles. Finally, the pulse peak is identified as the candidate with the highest probability. The proposed approach is tested on a database where ECG and PPG waveforms are collected simultaneously during the submaximal bicycle ergometer exercise test. The results are promising, suggesting that the method provides a simple but more accurate PTT estimation in real applications.

  17. Chaotic advection at large Péclet number: Electromagnetically driven experiments, numerical simulations, and theoretical predictions

    NASA Astrophysics Data System (ADS)

    Figueroa, Aldo; Meunier, Patrice; Cuevas, Sergio; Villermaux, Emmanuel; Ramos, Eduardo

    2014-01-01

    We present a combination of experiment, theory, and modelling on laminar mixing at large Péclet number. The flow is produced by oscillating electromagnetic forces in a thin electrolytic fluid layer, leading to oscillating dipoles, quadrupoles, octopoles, and disordered flows. The numerical simulations are based on the Diffusive Strip Method (DSM) which was recently introduced (P. Meunier and E. Villermaux, "The diffusive strip method for scalar mixing in two-dimensions," J. Fluid Mech. 662, 134-172 (2010)) to solve the advection-diffusion problem by combining Lagrangian techniques and theoretical modelling of the diffusion. Numerical simulations obtained with the DSM are in reasonable agreement with quantitative dye visualization experiments of the scalar fields. A theoretical model based on log-normal Probability Density Functions (PDFs) of stretching factors, characteristic of homogeneous turbulence in the Batchelor regime, allows to predict the PDFs of scalar in agreement with numerical and experimental results. This model also indicates that the PDFs of scalar are asymptotically close to log-normal at late stages, except for the large concentration levels which correspond to low stretching factors.

  18. Inference with minimal Gibbs free energy in information field theory.

    PubMed

    Ensslin, Torsten A; Weig, Cornelius

    2010-11-01

    Non-linear and non-gaussian signal inference problems are difficult to tackle. Renormalization techniques permit us to construct good estimators for the posterior signal mean within information field theory (IFT), but the approximations and assumptions made are not very obvious. Here we introduce the simple concept of minimal Gibbs free energy to IFT, and show that previous renormalization results emerge naturally. They can be understood as being the gaussian approximation to the full posterior probability, which has maximal cross information with it. We derive optimized estimators for three applications, to illustrate the usage of the framework: (i) reconstruction of a log-normal signal from poissonian data with background counts and point spread function, as it is needed for gamma ray astronomy and for cosmography using photometric galaxy redshifts, (ii) inference of a gaussian signal with unknown spectrum, and (iii) inference of a poissonian log-normal signal with unknown spectrum, the combination of (i) and (ii). Finally we explain how gaussian knowledge states constructed by the minimal Gibbs free energy principle at different temperatures can be combined into a more accurate surrogate of the non-gaussian posterior.

  19. On the inequivalence of the CH and CHSH inequalities due to finite statistics

    NASA Astrophysics Data System (ADS)

    Renou, M. O.; Rosset, D.; Martin, A.; Gisin, N.

    2017-06-01

    Different variants of a Bell inequality, such as CHSH and CH, are known to be equivalent when evaluated on nonsignaling outcome probability distributions. However, in experimental setups, the outcome probability distributions are estimated using a finite number of samples. Therefore the nonsignaling conditions are only approximately satisfied and the robustness of the violation depends on the chosen inequality variant. We explain that phenomenon using the decomposition of the space of outcome probability distributions under the action of the symmetry group of the scenario, and propose a method to optimize the statistical robustness of a Bell inequality. In the process, we describe the finite group composed of relabeling of parties, measurement settings and outcomes, and identify correspondences between the irreducible representations of this group and properties of outcome probability distributions such as normalization, signaling or having uniform marginals.

  20. Power laws in citation distributions: evidence from Scopus.

    PubMed

    Brzezinski, Michal

    Modeling distributions of citations to scientific papers is crucial for understanding how science develops. However, there is a considerable empirical controversy on which statistical model fits the citation distributions best. This paper is concerned with rigorous empirical detection of power-law behaviour in the distribution of citations received by the most highly cited scientific papers. We have used a large, novel data set on citations to scientific papers published between 1998 and 2002 drawn from Scopus. The power-law model is compared with a number of alternative models using a likelihood ratio test. We have found that the power-law hypothesis is rejected for around half of the Scopus fields of science. For these fields of science, the Yule, power-law with exponential cut-off and log-normal distributions seem to fit the data better than the pure power-law model. On the other hand, when the power-law hypothesis is not rejected, it is usually empirically indistinguishable from most of the alternative models. The pure power-law model seems to be the best model only for the most highly cited papers in "Physics and Astronomy". Overall, our results seem to support theories implying that the most highly cited scientific papers follow the Yule, power-law with exponential cut-off or log-normal distribution. Our findings suggest also that power laws in citation distributions, when present, account only for a very small fraction of the published papers (less than 1 % for most of science fields) and that the power-law scaling parameter (exponent) is substantially higher (from around 3.2 to around 4.7) than found in the older literature.

  1. The Relationship Between Fusion, Suppression, and Diplopia in Normal and Amblyopic Vision.

    PubMed

    Spiegel, Daniel P; Baldwin, Alex S; Hess, Robert F

    2016-10-01

    Single vision occurs through a combination of fusion and suppression. When neither mechanism takes place, we experience diplopia. Under normal viewing conditions, the perceptual state depends on the spatial scale and interocular disparity. The purpose of this study was to examine the three perceptual states in human participants with normal and amblyopic vision. Participants viewed two dichoptically separated horizontal blurred edges with an opposite tilt (2.35°) and indicated their binocular percept: "one flat edge," "one tilted edge," or "two edges." The edges varied with scale (fine 4 min arc and coarse 32 min arc), disparity, and interocular contrast. We investigated how the binocular interactions vary in amblyopic (visual acuity [VA] > 0.2 logMAR, n = 4) and normal vision (VA ≤ 0 logMAR, n = 4) under interocular variations in stimulus contrast and luminance. In amblyopia, despite the established sensory dominance of the fellow eye, fusion prevails at the coarse scale and small disparities (75%). We also show that increasing the relative contrast to the amblyopic eye enhances the probability of fusion at the fine scale (from 18% to 38%), and leads to a reversal of the sensory dominance at coarse scale. In normal vision we found that interocular luminance imbalances disturbed binocular combination only at the fine scale in a way similar to that seen in amblyopia. Our results build upon the growing evidence that the amblyopic visual system is binocular and further show that the suppressive mechanisms rendering the amblyopic system functionally monocular are scale dependent.

  2. Fishnet statistics for probabilistic strength and scaling of nacreous imbricated lamellar materials

    NASA Astrophysics Data System (ADS)

    Luo, Wen; Bažant, Zdeněk P.

    2017-12-01

    Similar to nacre (or brick masonry), imbricated (or staggered) lamellar structures are widely found in nature and man-made materials, and are of interest for biomimetics. They can achieve high defect insensitivity and fracture toughness, as demonstrated in previous studies. But the probability distribution with a realistic far-left tail is apparently unknown. Here, strictly for statistical purposes, the microstructure of nacre is approximated by a diagonally pulled fishnet with quasibrittle links representing the shear bonds between parallel lamellae (or platelets). The probability distribution of fishnet strength is calculated as a sum of a rapidly convergent series of the failure probabilities after the rupture of one, two, three, etc., links. Each of them represents a combination of joint probabilities and of additive probabilities of disjoint events, modified near the zone of failed links by the stress redistributions caused by previously failed links. Based on previous nano- and multi-scale studies at Northwestern, the strength distribution of each link, characterizing the interlamellar shear bond, is assumed to be a Gauss-Weibull graft, but with a deeper Weibull tail than in Type 1 failure of non-imbricated quasibrittle materials. The autocorrelation length is considered equal to the link length. The size of the zone of failed links at maximum load increases with the coefficient of variation (CoV) of link strength, and also with fishnet size. With an increasing width-to-length aspect ratio, a rectangular fishnet gradually transits from the weakest-link chain to the fiber bundle, as the limit cases. The fishnet strength at failure probability 10-6 grows with the width-to-length ratio. For a square fishnet boundary, the strength at 10-6 failure probability is about 11% higher, while at fixed load the failure probability is about 25-times higher than it is for the non-imbricated case. This is a major safety advantage of the fishnet architecture over particulate or fiber reinforced materials. There is also a strong size effect, partly similar to that of Type 1 while the curves of log-strength versus log-size for different sizes could cross each other. The predicted behavior is verified by about a million Monte Carlo simulations for each of many fishnet geometries, sizes and CoVs of link strength. In addition to the weakest-link or fiber bundle, the fishnet becomes the third analytically tractable statistical model of structural strength, and has the former two as limit cases.

  3. Inclusion probability for DNA mixtures is a subjective one-sided match statistic unrelated to identification information

    PubMed Central

    Perlin, Mark William

    2015-01-01

    Background: DNA mixtures of two or more people are a common type of forensic crime scene evidence. A match statistic that connects the evidence to a criminal defendant is usually needed for court. Jurors rely on this strength of match to help decide guilt or innocence. However, the reliability of unsophisticated match statistics for DNA mixtures has been questioned. Materials and Methods: The most prevalent match statistic for DNA mixtures is the combined probability of inclusion (CPI), used by crime labs for over 15 years. When testing 13 short tandem repeat (STR) genetic loci, the CPI-1 value is typically around a million, regardless of DNA mixture composition. However, actual identification information, as measured by a likelihood ratio (LR), spans a much broader range. This study examined probability of inclusion (PI) mixture statistics for 517 locus experiments drawn from 16 reported cases and compared them with LR locus information calculated independently on the same data. The log(PI-1) values were examined and compared with corresponding log(LR) values. Results: The LR and CPI methods were compared in case examples of false inclusion, false exclusion, a homicide, and criminal justice outcomes. Statistical analysis of crime laboratory STR data shows that inclusion match statistics exhibit a truncated normal distribution having zero center, with little correlation to actual identification information. By the law of large numbers (LLN), CPI-1 increases with the number of tested genetic loci, regardless of DNA mixture composition or match information. These statistical findings explain why CPI is relatively constant, with implications for DNA policy, criminal justice, cost of crime, and crime prevention. Conclusions: Forensic crime laboratories have generated CPI statistics on hundreds of thousands of DNA mixture evidence items. However, this commonly used match statistic behaves like a random generator of inclusionary values, following the LLN rather than measuring identification information. A quantitative CPI number adds little meaningful information beyond the analyst's initial qualitative assessment that a person's DNA is included in a mixture. Statistical methods for reporting on DNA mixture evidence should be scientifically validated before they are relied upon by criminal justice. PMID:26605124

  4. Inclusion probability for DNA mixtures is a subjective one-sided match statistic unrelated to identification information.

    PubMed

    Perlin, Mark William

    2015-01-01

    DNA mixtures of two or more people are a common type of forensic crime scene evidence. A match statistic that connects the evidence to a criminal defendant is usually needed for court. Jurors rely on this strength of match to help decide guilt or innocence. However, the reliability of unsophisticated match statistics for DNA mixtures has been questioned. The most prevalent match statistic for DNA mixtures is the combined probability of inclusion (CPI), used by crime labs for over 15 years. When testing 13 short tandem repeat (STR) genetic loci, the CPI(-1) value is typically around a million, regardless of DNA mixture composition. However, actual identification information, as measured by a likelihood ratio (LR), spans a much broader range. This study examined probability of inclusion (PI) mixture statistics for 517 locus experiments drawn from 16 reported cases and compared them with LR locus information calculated independently on the same data. The log(PI(-1)) values were examined and compared with corresponding log(LR) values. The LR and CPI methods were compared in case examples of false inclusion, false exclusion, a homicide, and criminal justice outcomes. Statistical analysis of crime laboratory STR data shows that inclusion match statistics exhibit a truncated normal distribution having zero center, with little correlation to actual identification information. By the law of large numbers (LLN), CPI(-1) increases with the number of tested genetic loci, regardless of DNA mixture composition or match information. These statistical findings explain why CPI is relatively constant, with implications for DNA policy, criminal justice, cost of crime, and crime prevention. Forensic crime laboratories have generated CPI statistics on hundreds of thousands of DNA mixture evidence items. However, this commonly used match statistic behaves like a random generator of inclusionary values, following the LLN rather than measuring identification information. A quantitative CPI number adds little meaningful information beyond the analyst's initial qualitative assessment that a person's DNA is included in a mixture. Statistical methods for reporting on DNA mixture evidence should be scientifically validated before they are relied upon by criminal justice.

  5. Postfragmentation density function for bacterial aggregates in laminar flow

    PubMed Central

    Byrne, Erin; Dzul, Steve; Solomon, Michael; Younger, John

    2014-01-01

    The postfragmentation probability density of daughter flocs is one of the least well-understood aspects of modeling flocculation. We use three-dimensional positional data of Klebsiella pneumoniae bacterial flocs in suspension and the knowledge of hydrodynamic properties of a laminar flow field to construct a probability density function of floc volumes after a fragmentation event. We provide computational results which predict that the primary fragmentation mechanism for large flocs is erosion. The postfragmentation probability density function has a strong dependence on the size of the original floc and indicates that most fragmentation events result in clumps of one to three bacteria eroding from the original floc. We also provide numerical evidence that exhaustive fragmentation yields a limiting density inconsistent with the log-normal density predicted in the literature, most likely due to the heterogeneous nature of K. pneumoniae flocs. To support our conclusions, artificial flocs were generated and display similar postfragmentation density and exhaustive fragmentation. PMID:21599205

  6. A short note on the maximal point-biserial correlation under non-normality.

    PubMed

    Cheng, Ying; Liu, Haiyan

    2016-11-01

    The aim of this paper is to derive the maximal point-biserial correlation under non-normality. Several widely used non-normal distributions are considered, namely the uniform distribution, t-distribution, exponential distribution, and a mixture of two normal distributions. Results show that the maximal point-biserial correlation, depending on the non-normal continuous variable underlying the binary manifest variable, may not be a function of p (the probability that the dichotomous variable takes the value 1), can be symmetric or non-symmetric around p = .5, and may still lie in the range from -1.0 to 1.0. Therefore researchers should exercise caution when they interpret their sample point-biserial correlation coefficients based on popular beliefs that the maximal point-biserial correlation is always smaller than 1, and that the size of the correlation is always further restricted as p deviates from .5. © 2016 The British Psychological Society.

  7. Improved grading system for structural logs for log homes

    Treesearch

    D.W. Green; T.M. Gorman; J.W. Evans; J.F. Murphy

    2004-01-01

    Current grading standards for logs used in log home construction use visual criteria to sort logs into either “wall logs” or structural logs (round and sawn round timbers). The conservative nature of this grading system, and the grouping of stronger and weaker species for marketing purposes, probably results in the specification of logs with larger diameter than would...

  8. Grain coarsening in two-dimensional phase-field models with an orientation field

    NASA Astrophysics Data System (ADS)

    Korbuly, Bálint; Pusztai, Tamás; Henry, Hervé; Plapp, Mathis; Apel, Markus; Gránásy, László

    2017-05-01

    In the literature, contradictory results have been published regarding the form of the limiting (long-time) grain size distribution (LGSD) that characterizes the late stage grain coarsening in two-dimensional and quasi-two-dimensional polycrystalline systems. While experiments and the phase-field crystal (PFC) model (a simple dynamical density functional theory) indicate a log-normal distribution, other works including theoretical studies based on conventional phase-field simulations that rely on coarse grained fields, like the multi-phase-field (MPF) and orientation field (OF) models, yield significantly different distributions. In a recent work, we have shown that the coarse grained phase-field models (whether MPF or OF) yield very similar limiting size distributions that seem to differ from the theoretical predictions. Herein, we revisit this problem, and demonstrate in the case of OF models [R. Kobayashi, J. A. Warren, and W. C. Carter, Physica D 140, 141 (2000), 10.1016/S0167-2789(00)00023-3; H. Henry, J. Mellenthin, and M. Plapp, Phys. Rev. B 86, 054117 (2012), 10.1103/PhysRevB.86.054117] that an insufficient resolution of the small angle grain boundaries leads to a log-normal distribution close to those seen in the experiments and the molecular scale PFC simulations. Our paper indicates, furthermore, that the LGSD is critically sensitive to the details of the evaluation process, and raises the possibility that the differences among the LGSD results from different sources may originate from differences in the detection of small angle grain boundaries.

  9. A Maximum Likelihood Ensemble Data Assimilation Method Tailored to the Inner Radiation Belt

    NASA Astrophysics Data System (ADS)

    Guild, T. B.; O'Brien, T. P., III; Mazur, J. E.

    2014-12-01

    The Earth's radiation belts are composed of energetic protons and electrons whose fluxes span many orders of magnitude, whose distributions are log-normal, and where data-model differences can be large and also log-normal. This physical system thus challenges standard data assimilation methods relying on underlying assumptions of Gaussian distributions of measurements and data-model differences, where innovations to the model are small. We have therefore developed a data assimilation method tailored to these properties of the inner radiation belt, analogous to the ensemble Kalman filter but for the unique cases of non-Gaussian model and measurement errors, and non-linear model and measurement distributions. We apply this method to the inner radiation belt proton populations, using the SIZM inner belt model [Selesnick et al., 2007] and SAMPEX/PET and HEO proton observations to select the most likely ensemble members contributing to the state of the inner belt. We will describe the algorithm, the method of generating ensemble members, our choice of minimizing the difference between instrument counts not phase space densities, and demonstrate the method with our reanalysis of the inner radiation belt throughout solar cycle 23. We will report on progress to continue our assimilation into solar cycle 24 using the Van Allen Probes/RPS observations.

  10. LEVELS OF EXTREMELY LOW-FREQUENCY ELECTRIC AND MAGNETIC FIELDS FROM OVERHEAD POWER LINES IN THE OUTDOOR ENVIRONMENT OF RAMALLAH CITY-PALESTINE.

    PubMed

    Abuasbi, Falastine; Lahham, Adnan; Abdel-Raziq, Issam Rashid

    2018-05-01

    In this study, levels of extremely low-frequency electric and magnetic fields originated from overhead power lines were investigated in the outdoor environment in Ramallah city, Palestine. Spot measurements were applied to record fields intensities over 6-min period. The Spectrum Analyzer NF-5035 was used to perform measurements at 1 m above ground level and directly underneath 40 randomly selected power lines distributed fairly within the city. Levels of electric fields varied depending on the line's category (power line, transformer or distributor), a minimum mean electric field of 3.9 V/m was found under a distributor line, and a maximum of 769.4 V/m under a high-voltage power line (66 kV). However, results of electric fields showed a log-normal distribution with the geometric mean and the geometric standard deviation of 35.9 and 2.8 V/m, respectively. Magnetic fields measured at power lines, on contrast, were not log-normally distributed; the minimum and maximum mean magnetic fields under power lines were 0.89 and 3.5 μT, respectively. As a result, none of the measured fields exceeded the ICNIRP's guidelines recommended for general public exposures to extremely low-frequency fields.

  11. Evaluation of Mean and Variance Integrals without Integration

    ERIC Educational Resources Information Center

    Joarder, A. H.; Omar, M. H.

    2007-01-01

    The mean and variance of some continuous distributions, in particular the exponentially decreasing probability distribution and the normal distribution, are considered. Since they involve integration by parts, many students do not feel comfortable. In this note, a technique is demonstrated for deriving mean and variance through differential…

  12. Predicting durations of online collective actions based on Peaks' heights

    NASA Astrophysics Data System (ADS)

    Lu, Peng; Nie, Shizhao; Wang, Zheng; Jing, Ziwei; Yang, Jianwu; Qi, Zhongxiang; Pujia, Wangmo

    2018-02-01

    Capturing the whole process of collective actions, the peak model contains four stages, including Prepare, Outbreak, Peak, and Vanish. Based on the peak model, one of the key variables, factors and parameters are further investigated in this paper, which is the rate between peaks and spans. Although the durations or spans and peaks' heights are highly diversified, it seems that the ratio between them is quite stable. If the rate's regularity is discovered, we can predict how long the collective action lasts and when it ends based on the peak's height. In this work, we combined mathematical simulations and empirical big data of 148 cases to explore the regularity of ratio's distribution. It is indicated by results of simulations that the rate has some regularities of distribution, which is not normal distribution. The big data has been collected from the 148 online collective actions and the whole processes of participation are recorded. The outcomes of empirical big data indicate that the rate seems to be closer to being log-normally distributed. This rule holds true for both the total cases and subgroups of 148 online collective actions. The Q-Q plot is applied to check the normal distribution of the rate's logarithm, and the rate's logarithm does follow the normal distribution.

  13. Technology-Enhanced Interactive Teaching of Marginal, Joint and Conditional Probabilities: The Special Case of Bivariate Normal Distribution

    ERIC Educational Resources Information Center

    Dinov, Ivo D.; Kamino, Scott; Bhakhrani, Bilal; Christou, Nicolas

    2013-01-01

    Data analysis requires subtle probability reasoning to answer questions like "What is the chance of event A occurring, given that event B was observed?" This generic question arises in discussions of many intriguing scientific questions such as "What is the probability that an adolescent weighs between 120 and 140 pounds given that…

  14. Line-driven disc wind model for ultrafast outflows in active galactic nuclei - scaling with luminosity

    NASA Astrophysics Data System (ADS)

    Nomura, M.; Ohsuga, K.

    2017-03-01

    In order to reveal the origin of the ultrafast outflows (UFOs) that are frequently observed in active galactic nuclei (AGNs), we perform two-dimensional radiation hydrodynamics simulations of the line-driven disc winds, which are accelerated by the radiation force due to the spectral lines. The line-driven winds are successfully launched for the range of MBH = 106-9 M⊙ and ε = 0.1-0.5, and the resulting mass outflow rate (dot{M_w}), momentum flux (dot{p_w}), and kinetic luminosity (dot{E_w}) are in the region containing 90 per cent of the posterior probability distribution in the dot{M}_w-Lbol plane, dot{p}_w-Lbol plane, and dot{E}_w-Lbol plane shown in Gofford et al., where MBH is the black hole mass, ε is the Eddington ratio, and Lbol is the bolometric luminosity. The best-fitting relations in Gofford et al., d log dot{M_w}/d log {L_bol}˜ 0.9, d log dot{p_w}/d log {L_bol}˜ 1.2, and d log dot{E_w}/d log {L_bol}˜ 1.5, are roughly consistent with our results, d log dot{M_w}/d log {L_bol}˜ 9/8, d log dot{p_w}/d log {L_bol}˜ 10/8, and d log dot{E_w}/d log {L_bol}˜ 11/8. In addition, our model predicts that no UFO features are detected for the AGNs with ε ≲ 0.01, since the winds do not appear. Also, only AGNs with MBH ≲ 108 M⊙ exhibit the UFOs when ε ∼ 0.025. These predictions nicely agree with the X-ray observations. These results support that the line-driven disc wind is the origin of the UFOs.

  15. IMPROVED V II log(gf) VALUES, HYPERFINE STRUCTURE CONSTANTS, AND ABUNDANCE DETERMINATIONS IN THE PHOTOSPHERES OF THE SUN AND METAL-POOR STAR HD 84937

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wood, M. P.; Lawler, J. E.; Den Hartog, E. A.

    2014-10-01

    New experimental absolute atomic transition probabilities are reported for 203 lines of V II. Branching fractions are measured from spectra recorded using a Fourier transform spectrometer and an echelle spectrometer. The branching fractions are normalized with radiative lifetime measurements to determine the new transition probabilities. Generally good agreement is found between this work and previously reported V II transition probabilities. Two spectrometers, independent radiometric calibration methods, and independent data analysis routines enable a reduction in systematic uncertainties, in particular those due to optical depth errors. In addition, new hyperfine structure constants are measured for selected levels by least squares fittingmore » line profiles in the FTS spectra. The new V II data are applied to high resolution visible and UV spectra of the Sun and metal-poor star HD 84937 to determine new, more accurate V abundances. Lines covering a range of wavelength and excitation potential are used to search for non-LTE effects. Very good agreement is found between our new solar photospheric V abundance, log ε(V) = 3.95 from 15 V II lines, and the solar-system meteoritic value. In HD 84937, we derive [V/H] = –2.08 from 68 lines, leading to a value of [V/Fe] = 0.24.« less

  16. flexsurv: A Platform for Parametric Survival Modeling in R

    PubMed Central

    Jackson, Christopher H.

    2018-01-01

    flexsurv is an R package for fully-parametric modeling of survival data. Any parametric time-to-event distribution may be fitted if the user supplies a probability density or hazard function, and ideally also their cumulative versions. Standard survival distributions are built in, including the three and four-parameter generalized gamma and F distributions. Any parameter of any distribution can be modeled as a linear or log-linear function of covariates. The package also includes the spline model of Royston and Parmar (2002), in which both baseline survival and covariate effects can be arbitrarily flexible parametric functions of time. The main model-fitting function, flexsurvreg, uses the familiar syntax of survreg from the standard survival package (Therneau 2016). Censoring or left-truncation are specified in ‘Surv’ objects. The models are fitted by maximizing the full log-likelihood, and estimates and confidence intervals for any function of the model parameters can be printed or plotted. flexsurv also provides functions for fitting and predicting from fully-parametric multi-state models, and connects with the mstate package (de Wreede, Fiocco, and Putter 2011). This article explains the methods and design principles of the package, giving several worked examples of its use. PMID:29593450

  17. Analyses of flood-flow frequency for selected gaging stations in South Dakota

    USGS Publications Warehouse

    Benson, R.D.; Hoffman, E.B.; Wipf, V.J.

    1985-01-01

    Analyses of flood flow frequency were made for 111 continuous-record gaging stations in South Dakota with 10 or more years of record. The analyses were developed using the log-Pearson Type III procedure recommended by the U.S. Water Resources Council. The procedure characterizes flood occurrence at a single site as a sequence of annual peak flows. The magnitudes of the annual peak flows are assumed to be independent random variables following a log-Pearson Type III probability distribution, which defines the probability that any single annual peak flow will exceed a specified discharge. By considering only annual peak flows, the flood-frequency analysis becomes the estimation of the log-Pearson annual-probability curve using the record of annual peak flows at the site. The recorded data are divided into two classes: systematic and historic. The systematic record includes all annual peak flows determined in the process of conducting a systematic gaging program at a site. In this program, the annual peak flow is determined for each and every year of the program. The systematic record is intended to constitute an unbiased and representative sample of the population of all possible annual peak flows at the site. In contrast to the systematic record, the historic record consists of annual peak flows that would not have been determined except for evidence indicating their unusual magnitude. Flood information acquired from historical sources almost invariably refers to floods of noteworthy, and hence extraordinary, size. Although historic records form a biased and unrepresentative sample, they can be used to supplement the systematic record. (Author 's abstract)

  18. Reliability of vapor-grown planar In/sub 0. 53/Ga/sub 0. 47/As/InP p-i-n photodiodes with very high failure activation energy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Forrest, S.R.; Ban, V.S.; Gasparian, G.

    1988-05-01

    The authors measured the mean time to failure (MTTF) for a statistically significant population of planar In/sub 0.53/Ga/sub 0.47/As/InP heterostructure p-i-n photodetectors at several elevated temperatures. The probability for failure is fit to a log-normal distribution, with the result that the width of the failure distribution is sigma = 0.55 +- 0.2, and is roughly independent of temperature. From the temperature dependence of the MTFF data, they find that the failure mechanism is thermally activated, with an activation energy of 1.5 +- 0.2 eV measured in the temperature range of 170 - 250/sup 0/C. This extrapolates to a MTTF ofmore » less than 0.1 failure in 10/sup 9/ h (or < 0.1 FIT) at 70/sup 0/C, indicating that such devices are useful for systems requiring extremely high reliable components, even if operated at elevated temperatures for significant time periods. To the authors' knowledge, this activation energy is the highest value reported for In/sub 0.53/Ga/sub 0.47/As/InP photodetectors, and is significantly higher than the energies of -- 0.85 eV often suspected to these devices.« less

  19. Measurement of the distribution of ventilation-perfusion ratios in the human lung with proton MRI: comparison with the multiple inert-gas elimination technique.

    PubMed

    Sá, Rui Carlos; Henderson, A Cortney; Simonson, Tatum; Arai, Tatsuya J; Wagner, Harrieth; Theilmann, Rebecca J; Wagner, Peter D; Prisk, G Kim; Hopkins, Susan R

    2017-07-01

    We have developed a novel functional proton magnetic resonance imaging (MRI) technique to measure regional ventilation-perfusion (V̇ A /Q̇) ratio in the lung. We conducted a comparison study of this technique in healthy subjects ( n = 7, age = 42 ± 16 yr, Forced expiratory volume in 1 s = 94% predicted), by comparing data measured using MRI to that obtained from the multiple inert gas elimination technique (MIGET). Regional ventilation measured in a sagittal lung slice using Specific Ventilation Imaging was combined with proton density measured using a fast gradient-echo sequence to calculate regional alveolar ventilation, registered with perfusion images acquired using arterial spin labeling, and divided on a voxel-by-voxel basis to obtain regional V̇ A /Q̇ ratio. LogSDV̇ and LogSDQ̇, measures of heterogeneity derived from the standard deviation (log scale) of the ventilation and perfusion vs. V̇ A /Q̇ ratio histograms respectively, were calculated. On a separate day, subjects underwent study with MIGET and LogSDV̇ and LogSDQ̇ were calculated from MIGET data using the 50-compartment model. MIGET LogSDV̇ and LogSDQ̇ were normal in all subjects. LogSDQ̇ was highly correlated between MRI and MIGET (R = 0.89, P = 0.007); the intercept was not significantly different from zero (-0.062, P = 0.65) and the slope did not significantly differ from identity (1.29, P = 0.34). MIGET and MRI measures of LogSDV̇ were well correlated (R = 0.83, P = 0.02); the intercept differed from zero (0.20, P = 0.04) and the slope deviated from the line of identity (0.52, P = 0.01). We conclude that in normal subjects, there is a reasonable agreement between MIGET measures of heterogeneity and those from proton MRI measured in a single slice of lung. NEW & NOTEWORTHY We report a comparison of a new proton MRI technique to measure regional V̇ A /Q̇ ratio against the multiple inert gas elimination technique (MIGET). The study reports good relationships between measures of heterogeneity derived from MIGET and those derived from MRI. Although currently limited to a single slice acquisition, these data suggest that single sagittal slice measures of V̇ A /Q̇ ratio provide an adequate means to assess heterogeneity in the normal lung. Copyright © 2017 the American Physiological Society.

  20. Reconstructing missing information on precipitation datasets: impact of tails on adopted statistical distributions.

    NASA Astrophysics Data System (ADS)

    Pedretti, Daniele; Beckie, Roger Daniel

    2014-05-01

    Missing data in hydrological time-series databases are ubiquitous in practical applications, yet it is of fundamental importance to make educated decisions in problems involving exhaustive time-series knowledge. This includes precipitation datasets, since recording or human failures can produce gaps in these time series. For some applications, directly involving the ratio between precipitation and some other quantity, lack of complete information can result in poor understanding of basic physical and chemical dynamics involving precipitated water. For instance, the ratio between precipitation (recharge) and outflow rates at a discharge point of an aquifer (e.g. rivers, pumping wells, lysimeters) can be used to obtain aquifer parameters and thus to constrain model-based predictions. We tested a suite of methodologies to reconstruct missing information in rainfall datasets. The goal was to obtain a suitable and versatile method to reduce the errors given by the lack of data in specific time windows. Our analyses included both a classical chronologically-pairing approach between rainfall stations and a probability-based approached, which accounted for the probability of exceedence of rain depths measured at two or multiple stations. Our analyses proved that it is not clear a priori which method delivers the best methodology. Rather, this selection should be based considering the specific statistical properties of the rainfall dataset. In this presentation, our emphasis is to discuss the effects of a few typical parametric distributions used to model the behavior of rainfall. Specifically, we analyzed the role of distributional "tails", which have an important control on the occurrence of extreme rainfall events. The latter strongly affect several hydrological applications, including recharge-discharge relationships. The heavy-tailed distributions we considered were parametric Log-Normal, Generalized Pareto, Generalized Extreme and Gamma distributions. The methods were first tested on synthetic examples, to have a complete control of the impact of several variables such as minimum amount of data required to obtain reliable statistical distributions from the selected parametric functions. Then, we applied the methodology to precipitation datasets collected in the Vancouver area and on a mining site in Peru.

  1. A model for the spatial distribution of snow water equivalent parameterized from the spatial variability of precipitation

    NASA Astrophysics Data System (ADS)

    Skaugen, Thomas; Weltzien, Ingunn H.

    2016-09-01

    Snow is an important and complicated element in hydrological modelling. The traditional catchment hydrological model with its many free calibration parameters, also in snow sub-models, is not a well-suited tool for predicting conditions for which it has not been calibrated. Such conditions include prediction in ungauged basins and assessing hydrological effects of climate change. In this study, a new model for the spatial distribution of snow water equivalent (SWE), parameterized solely from observed spatial variability of precipitation, is compared with the current snow distribution model used in the operational flood forecasting models in Norway. The former model uses a dynamic gamma distribution and is called Snow Distribution_Gamma, (SD_G), whereas the latter model has a fixed, calibrated coefficient of variation, which parameterizes a log-normal model for snow distribution and is called Snow Distribution_Log-Normal (SD_LN). The two models are implemented in the parameter parsimonious rainfall-runoff model Distance Distribution Dynamics (DDD), and their capability for predicting runoff, SWE and snow-covered area (SCA) is tested and compared for 71 Norwegian catchments. The calibration period is 1985-2000 and validation period is 2000-2014. Results show that SDG better simulates SCA when compared with MODIS satellite-derived snow cover. In addition, SWE is simulated more realistically in that seasonal snow is melted out and the building up of "snow towers" and giving spurious positive trends in SWE, typical for SD_LN, is prevented. The precision of runoff simulations using SDG is slightly inferior, with a reduction in Nash-Sutcliffe and Kling-Gupta efficiency criterion of 0.01, but it is shown that the high precision in runoff prediction using SD_LN is accompanied with erroneous simulations of SWE.

  2. On the development of a semi-nonparametric generalized multinomial logit model for travel-related choices

    PubMed Central

    Ye, Xin; Pendyala, Ram M.; Zou, Yajie

    2017-01-01

    A semi-nonparametric generalized multinomial logit model, formulated using orthonormal Legendre polynomials to extend the standard Gumbel distribution, is presented in this paper. The resulting semi-nonparametric function can represent a probability density function for a large family of multimodal distributions. The model has a closed-form log-likelihood function that facilitates model estimation. The proposed method is applied to model commute mode choice among four alternatives (auto, transit, bicycle and walk) using travel behavior data from Argau, Switzerland. Comparisons between the multinomial logit model and the proposed semi-nonparametric model show that violations of the standard Gumbel distribution assumption lead to considerable inconsistency in parameter estimates and model inferences. PMID:29073152

  3. On the development of a semi-nonparametric generalized multinomial logit model for travel-related choices.

    PubMed

    Wang, Ke; Ye, Xin; Pendyala, Ram M; Zou, Yajie

    2017-01-01

    A semi-nonparametric generalized multinomial logit model, formulated using orthonormal Legendre polynomials to extend the standard Gumbel distribution, is presented in this paper. The resulting semi-nonparametric function can represent a probability density function for a large family of multimodal distributions. The model has a closed-form log-likelihood function that facilitates model estimation. The proposed method is applied to model commute mode choice among four alternatives (auto, transit, bicycle and walk) using travel behavior data from Argau, Switzerland. Comparisons between the multinomial logit model and the proposed semi-nonparametric model show that violations of the standard Gumbel distribution assumption lead to considerable inconsistency in parameter estimates and model inferences.

  4. Evaluation of geophysical logs and video surveys in boreholes adjacent to the Berkley Products Superfund Site, West Cocalico Township, Lancaster County, Pennsylvania

    USGS Publications Warehouse

    Low, Dennis J.; Conger, Randall W.

    1998-01-01

    Between February 1998 and April 1998, geophysical logs were collected in nine boreholes adjacent to the Berkley Products Superfund Site, West Cocalico Township, Lancaster County, Pa. Video surveys were conducted on four of the nine boreholes. The boreholes range in depth from 320 to 508 feet below land surface, are completed open holes, have ambient vertical flow of water, and penetrate a series of interbedded siltstone, sandstone, and conglomerate units. The purpose of collecting geophysical-log data was to help determine horizontal and vertical distribution of contaminated ground water migrating from known or suspected sources and to aid in the placement of permanent borehole packers. The primary contaminants were derived from paint waste that included pigment sludges and wash solvents. The chlorinated volatile organic compounds probably originated from the wash solvents.Caliper logs and video surveys were used to locate fractures; inflections on fluid-resistivity and fluid-temperature logs were used to locate possible water-bearing fractures. Heatpulse-flowmeter measurements were used to verify the locations of water-producing or water-receiving zones and to measure rates of flow between water-bearing fractures. Single-point-resistance and natural-gamma logs provided information on stratigraphy. After interpretation of geophysical logs, video surveys, and driller's logs, permanent multiple-packer systems were installed in each borehole to obtain depth specific water samples from one or more water-bearing fractures in each borehole.

  5. On the Use of the Log-Normal Particle Size Distribution to Characterize Global Rain

    NASA Technical Reports Server (NTRS)

    Meneghini, Robert; Rincon, Rafael; Liao, Liang

    2003-01-01

    Although most parameterizations of the drop size distributions (DSD) use the gamma function, there are several advantages to the log-normal form, particularly if we want to characterize the large scale space-time variability of the DSD and rain rate. The advantages of the distribution are twofold: the logarithm of any moment can be expressed as a linear combination of the individual parameters of the distribution; the parameters of the distribution are approximately normally distributed. Since all radar and rainfall-related parameters can be written approximately as a moment of the DSD, the first property allows us to express the logarithm of any radar/rainfall variable as a linear combination of the individual DSD parameters. Another consequence is that any power law relationship between rain rate, reflectivity factor, specific attenuation or water content can be expressed in terms of the covariance matrix of the DSD parameters. The joint-normal property of the DSD parameters has applications to the description of the space-time variation of rainfall in the sense that any radar-rainfall quantity can be specified by the covariance matrix associated with the DSD parameters at two arbitrary space-time points. As such, the parameterization provides a means by which we can use the spaceborne radar-derived DSD parameters to specify in part the covariance matrices globally. However, since satellite observations have coarse temporal sampling, the specification of the temporal covariance must be derived from ancillary measurements and models. Work is presently underway to determine whether the use of instantaneous rain rate data from the TRMM Precipitation Radar can provide good estimates of the spatial correlation in rain rate from data collected in 5(sup 0)x 5(sup 0) x 1 month space-time boxes. To characterize the temporal characteristics of the DSD parameters, disdrometer data are being used from the Wallops Flight Facility site where as many as 4 disdrometers have been used to acquire data over a 2 km path. These data should help quantify the temporal form of the covariance matrix at this site.

  6. Integrating models that depend on variable data

    NASA Astrophysics Data System (ADS)

    Banks, A. T.; Hill, M. C.

    2016-12-01

    Models of human-Earth systems are often developed with the goal of predicting the behavior of one or more dependent variables from multiple independent variables, processes, and parameters. Often dependent variable values range over many orders of magnitude, which complicates evaluation of the fit of the dependent variable values to observations. Many metrics and optimization methods have been proposed to address dependent variable variability, with little consensus being achieved. In this work, we evaluate two such methods: log transformation (based on the dependent variable being log-normally distributed with a constant variance) and error-based weighting (based on a multi-normal distribution with variances that tend to increase as the dependent variable value increases). Error-based weighting has the advantage of encouraging model users to carefully consider data errors, such as measurement and epistemic errors, while log-transformations can be a black box for typical users. Placing the log-transformation into the statistical perspective of error-based weighting has not formerly been considered, to the best of our knowledge. To make the evaluation as clear and reproducible as possible, we use multiple linear regression (MLR). Simulations are conducted with MatLab. The example represents stream transport of nitrogen with up to eight independent variables. The single dependent variable in our example has values that range over 4 orders of magnitude. Results are applicable to any problem for which individual or multiple data types produce a large range of dependent variable values. For this problem, the log transformation produced good model fit, while some formulations of error-based weighting worked poorly. Results support previous suggestions fthat error-based weighting derived from a constant coefficient of variation overemphasizes low values and degrades model fit to high values. Applying larger weights to the high values is inconsistent with the log-transformation. Greater consistency is obtained by imposing smaller (by up to a factor of 1/35) weights on the smaller dependent-variable values. From an error-based perspective, the small weights are consistent with large standard deviations. This work considers the consequences of these two common ways of addressing variable data.

  7. Responses of crayfish photoreceptor cells following intense light adaptation.

    PubMed

    Cummins, D R; Goldsmith, T H

    1986-01-01

    After intense orange adapting exposures that convert 80% of the rhodopsin in the eye to metarhodopsin, rhabdoms become covered with accessory pigment and appear to lose some microvillar order. Only after a delay of hours or even days is the metarhodopsin replaced by rhodopsin (Cronin and Goldsmith 1984). After 24 h of dark adaptation, when there has been little recovery of visual pigment, the photoreceptor cells have normal resting potentials and input resistances, and the reversal potential of the light response is 10-15 mV (inside positive), unchanged from controls. The log V vs log I curve is shifted about 0.6 log units to the right on the energy axis, quantitatively consistent with the decrease in the probability of quantum catch expected from the lowered concentration of rhodopsin in the rhabdoms. Furthermore, at 24 h the photoreceptors exhibit a broader spectral sensitivity than controls, which is also expected from accumulations of metarhodopsin in the rhabdoms. In three other respects, however, the transduction process appears to be light adapted: The voltage responses are more phasic than those of control photoreceptors. The relatively larger effect (compared to controls) of low extracellular Ca++ (1 mmol/l EGTA) in potentiating the photoresponses suggests that the photoreceptors may have elevated levels of free cytoplasmic Ca++. The saturating depolarization is only about 30% as large as the maximal receptor potentials of contralateral, dark controls, and by that measure the log V-log I curve is shifted downward by 0.54 log units.(ABSTRACT TRUNCATED AT 250 WORDS)

  8. On the use of log-transformation vs. nonlinear regression for analyzing biological power laws

    USGS Publications Warehouse

    Xiao, X.; White, E.P.; Hooten, M.B.; Durham, S.L.

    2011-01-01

    Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain. ?? 2011 by the Ecological Society of America.

  9. Modelling of wildlife migrations and its economic impacts

    NASA Astrophysics Data System (ADS)

    Myšková, Kateřina; Žák, Jaroslav

    2013-10-01

    Natural wildlife migrations are often disrupted by the construction of line structures, especially freeways. Various overpasses and underpasses (migration objects) are being designed to enable species to pass through line structures. The newly developed original procedure for the quantification of the utility of migration objects (migration potentials) and sections of line structures retrieves the deficiencies of the previous ones. The procedure has been developed under bulk information obtained by monitoring migrations using camera system and spy games. The log-normal distribution functions are used. The procedure for the evaluation of the probability of the permeability of the line structures sectors is also presented. The above mentioned procedures and algorithms can be used while planning, preparing, constructing or verifying the measures to assure the permeability of line structures for selected feral species. Using the procedures can significantly reduce financial costs and improve the permeability. When the migration potentials are correctly determined and the whole sector is taken into account for the migrations, some originally designed objects may be found to be redundant and not building them will bring financial savings.

  10. q-Gaussian distributions of leverage returns, first stopping times, and default risk valuations

    NASA Astrophysics Data System (ADS)

    Katz, Yuri A.; Tian, Li

    2013-10-01

    We study the probability distributions of daily leverage returns of 520 North American industrial companies that survive de-listing during the financial crisis, 2006-2012. We provide evidence that distributions of unbiased leverage returns of all individual firms belong to the class of q-Gaussian distributions with the Tsallis entropic parameter within the interval 1

  11. Parsimonious nonstationary flood frequency analysis

    NASA Astrophysics Data System (ADS)

    Serago, Jake M.; Vogel, Richard M.

    2018-02-01

    There is now widespread awareness of the impact of anthropogenic influences on extreme floods (and droughts) and thus an increasing need for methods to account for such influences when estimating a frequency distribution. We introduce a parsimonious approach to nonstationary flood frequency analysis (NFFA) based on a bivariate regression equation which describes the relationship between annual maximum floods, x, and an exogenous variable which may explain the nonstationary behavior of x. The conditional mean, variance and skewness of both x and y = ln (x) are derived, and combined with numerous common probability distributions including the lognormal, generalized extreme value and log Pearson type III models, resulting in a very simple and general approach to NFFA. Our approach offers several advantages over existing approaches including: parsimony, ease of use, graphical display, prediction intervals, and opportunities for uncertainty analysis. We introduce nonstationary probability plots and document how such plots can be used to assess the improved goodness of fit associated with a NFFA.

  12. Surveillance system and method having an adaptive sequential probability fault detection test

    NASA Technical Reports Server (NTRS)

    Herzog, James P. (Inventor); Bickford, Randall L. (Inventor)

    2005-01-01

    System and method providing surveillance of an asset such as a process and/or apparatus by providing training and surveillance procedures that numerically fit a probability density function to an observed residual error signal distribution that is correlative to normal asset operation and then utilizes the fitted probability density function in a dynamic statistical hypothesis test for providing improved asset surveillance.

  13. Surveillance system and method having an adaptive sequential probability fault detection test

    NASA Technical Reports Server (NTRS)

    Bickford, Randall L. (Inventor); Herzog, James P. (Inventor)

    2006-01-01

    System and method providing surveillance of an asset such as a process and/or apparatus by providing training and surveillance procedures that numerically fit a probability density function to an observed residual error signal distribution that is correlative to normal asset operation and then utilizes the fitted probability density function in a dynamic statistical hypothesis test for providing improved asset surveillance.

  14. Surveillance System and Method having an Adaptive Sequential Probability Fault Detection Test

    NASA Technical Reports Server (NTRS)

    Bickford, Randall L. (Inventor); Herzog, James P. (Inventor)

    2008-01-01

    System and method providing surveillance of an asset such as a process and/or apparatus by providing training and surveillance procedures that numerically fit a probability density function to an observed residual error signal distribution that is correlative to normal asset operation and then utilizes the fitted probability density function in a dynamic statistical hypothesis test for providing improved asset surveillance.

  15. Bladder cancer mapping in Libya based on standardized morbidity ratio and log-normal model

    NASA Astrophysics Data System (ADS)

    Alhdiri, Maryam Ahmed; Samat, Nor Azah; Mohamed, Zulkifley

    2017-05-01

    Disease mapping contains a set of statistical techniques that detail maps of rates based on estimated mortality, morbidity, and prevalence. A traditional approach to measure the relative risk of the disease is called Standardized Morbidity Ratio (SMR). It is the ratio of an observed and expected number of accounts in an area, which has the greatest uncertainty if the disease is rare or if geographical area is small. Therefore, Bayesian models or statistical smoothing based on Log-normal model are introduced which might solve SMR problem. This study estimates the relative risk for bladder cancer incidence in Libya from 2006 to 2007 based on the SMR and log-normal model, which were fitted to data using WinBUGS software. This study starts with a brief review of these models, starting with the SMR method and followed by the log-normal model, which is then applied to bladder cancer incidence in Libya. All results are compared using maps and tables. The study concludes that the log-normal model gives better relative risk estimates compared to the classical method. The log-normal model has can overcome the SMR problem when there is no observed bladder cancer in an area.

  16. Application of continuous normal-lognormal bivariate density functions in a sensitivity analysis of municipal solid waste landfill.

    PubMed

    Petrovic, Igor; Hip, Ivan; Fredlund, Murray D

    2016-09-01

    The variability of untreated municipal solid waste (MSW) shear strength parameters, namely cohesion and shear friction angle, with respect to waste stability problems, is of primary concern due to the strong heterogeneity of MSW. A large number of municipal solid waste (MSW) shear strength parameters (friction angle and cohesion) were collected from published literature and analyzed. The basic statistical analysis has shown that the central tendency of both shear strength parameters fits reasonably well within the ranges of recommended values proposed by different authors. In addition, it was established that the correlation between shear friction angle and cohesion is not strong but it still remained significant. Through use of a distribution fitting method it was found that the shear friction angle could be adjusted to a normal probability density function while cohesion follows the log-normal density function. The continuous normal-lognormal bivariate density function was therefore selected as an adequate model to ascertain rational boundary values ("confidence interval") for MSW shear strength parameters. It was concluded that a curve with a 70% confidence level generates a "confidence interval" within the reasonable limits. With respect to the decomposition stage of the waste material, three different ranges of appropriate shear strength parameters were indicated. Defined parameters were then used as input parameters for an Alternative Point Estimated Method (APEM) stability analysis on a real case scenario of the Jakusevec landfill. The Jakusevec landfill is the disposal site of the capital of Croatia - Zagreb. The analysis shows that in the case of a dry landfill the most significant factor influencing the safety factor was the shear friction angle of old, decomposed waste material, while in the case of a landfill with significant leachate level the most significant factor influencing the safety factor was the cohesion of old, decomposed waste material. The analysis also showed that a satisfactory level of performance with a small probability of failure was produced for the standard practice design of waste landfills as well as an analysis scenario immediately after the landfill closure. Copyright © 2015 Elsevier Ltd. All rights reserved.

  17. RESIDENTIAL EXPOSURE TO EXTREMELY LOW FREQUENCY ELECTRIC AND MAGNETIC FIELDS IN THE CITY OF RAMALLAH-PALESTINE.

    PubMed

    Abuasbi, Falastine; Lahham, Adnan; Abdel-Raziq, Issam Rashid

    2018-04-01

    This study was focused on the measurement of residential exposure to power frequency (50-Hz) electric and magnetic fields in the city of Ramallah-Palestine. A group of 32 semi-randomly selected residences distributed amongst the city were under investigations of fields variations. Measurements were performed with the Spectrum Analyzer NF-5035 and were carried out at one meter above ground level in the residence's bedroom or living room under both zero and normal-power conditions. Fields' variations were recorded over 6-min and some times over few hours. Electric fields under normal-power use were relatively low; ~59% of residences experienced mean electric fields <10 V/m. The highest mean electric field of 66.9 V/m was found at residence R27. However, electric field values were log-normally distributed with geometric mean and geometric standard deviation of 9.6 and 3.5 V/m, respectively. Background electric fields measured under zero-power use, were very low; ~80% of residences experienced background electric fields <1 V/m. Under normal-power use, the highest mean magnetic field (0.45 μT) was found at residence R26 where an indoor power substation exists. However, ~81% of residences experienced mean magnetic fields <0.1 μT. Magnetic fields measured inside the 32 residences showed also a log-normal distribution with geometric mean and geometric standard deviation of 0.04 and 3.14 μT, respectively. Under zero-power conditions, ~7% of residences experienced average background magnetic field >0.1 μT. Fields from appliances showed a maximum mean electric field of 67.4 V/m from hair dryer, and maximum mean magnetic field of 13.7 μT from microwave oven. However, no single result surpassed the ICNIRP limits for general public exposures to ELF fields, but still, the interval 0.3-0.4 μT for possible non-thermal health impacts of exposure to ELF magnetic fields, was experienced in 13% of the residences.

  18. Transformation techniques for cross-sectional and longitudinal endocrine data: application to salivary cortisol concentrations.

    PubMed

    Miller, Robert; Plessow, Franziska

    2013-06-01

    Endocrine time series often lack normality and homoscedasticity most likely due to the non-linear dynamics of their natural determinants and the immanent characteristics of the biochemical analysis tools, respectively. As a consequence, data transformation (e.g., log-transformation) is frequently applied to enable general linear model-based analyses. However, to date, data transformation techniques substantially vary across studies and the question of which is the optimum power transformation remains to be addressed. The present report aims to provide a common solution for the analysis of endocrine time series by systematically comparing different power transformations with regard to their impact on data normality and homoscedasticity. For this, a variety of power transformations of the Box-Cox family were applied to salivary cortisol data of 309 healthy participants sampled in temporal proximity to a psychosocial stressor (the Trier Social Stress Test). Whereas our analyses show that un- as well as log-transformed data are inferior in terms of meeting normality and homoscedasticity, they also provide optimum transformations for both, cross-sectional cortisol samples reflecting the distributional concentration equilibrium and longitudinal cortisol time series comprising systematically altered hormone distributions that result from simultaneously elicited pulsatile change and continuous elimination processes. Considering these dynamics of endocrine oscillations, data transformation prior to testing GLMs seems mandatory to minimize biased results. Copyright © 2012 Elsevier Ltd. All rights reserved.

  19. Distribution of Total Depressive Symptoms Scores and Each Depressive Symptom Item in a Sample of Japanese Employees.

    PubMed

    Tomitaka, Shinichiro; Kawasaki, Yohei; Ide, Kazuki; Yamada, Hiroshi; Miyake, Hirotsugu; Furukawa, Toshiaki A; Furukaw, Toshiaki A

    2016-01-01

    In a previous study, we reported that the distribution of total depressive symptoms scores according to the Center for Epidemiologic Studies Depression Scale (CES-D) in a general population is stable throughout middle adulthood and follows an exponential pattern except for at the lowest end of the symptom score. Furthermore, the individual distributions of 16 negative symptom items of the CES-D exhibit a common mathematical pattern. To confirm the reproducibility of these findings, we investigated the distribution of total depressive symptoms scores and 16 negative symptom items in a sample of Japanese employees. We analyzed 7624 employees aged 20-59 years who had participated in the Northern Japan Occupational Health Promotion Centers Collaboration Study for Mental Health. Depressive symptoms were assessed using the CES-D. The CES-D contains 20 items, each of which is scored in four grades: "rarely," "some," "much," and "most of the time." The descriptive statistics and frequency curves of the distributions were then compared according to age group. The distribution of total depressive symptoms scores appeared to be stable from 30-59 years. The right tail of the distribution for ages 30-59 years exhibited a linear pattern with a log-normal scale. The distributions of the 16 individual negative symptom items of the CES-D exhibited a common mathematical pattern which displayed different distributions with a boundary at "some." The distributions of the 16 negative symptom items from "some" to "most" followed a linear pattern with a log-normal scale. The distributions of the total depressive symptoms scores and individual negative symptom items in a Japanese occupational setting show the same patterns as those observed in a general population. These results show that the specific mathematical patterns of the distributions of total depressive symptoms scores and individual negative symptom items can be reproduced in an occupational population.

  20. Characterization of airborne particles in an open pit mining region.

    PubMed

    Huertas, José I; Huertas, María E; Solís, Dora A

    2012-04-15

    We characterized airborne particle samples collected from 15 stations in operation since 2007 in one of the world's largest opencast coal mining regions. Using gravimetric, scanning electron microscopy (SEM-EDS), and X-ray photoelectron spectroscopy (XPS) analysis the samples were characterized in terms of concentration, morphology, particle size distribution (PSD), and elemental composition. All of the total suspended particulate (TSP) samples exhibited a log-normal PSD with a mean of d=5.46 ± 0.32 μm and σ(ln d)=0.61 ± 0.03. Similarly, all particles with an equivalent aerodynamic diameter less than 10 μm (PM(10)) exhibited a log-normal type distribution with a mean of d=3.6 ± 0.38 μm and σ(ln d)=0.55 ± 0.03. XPS analysis indicated that the main elements present in the particles were carbon, oxygen, potassium, and silicon with average mass concentrations of 41.5%, 34.7%, 11.6%, and 5.7% respectively. In SEM micrographs the particles appeared smooth-surfaced and irregular in shape, and tended to agglomerate. The particles were typically clay minerals, including limestone, calcite, quartz, and potassium feldspar. Copyright © 2012 Elsevier B.V. All rights reserved.

  1. Exponential blocking-temperature distribution in ferritin extracted from magnetization measurements

    NASA Astrophysics Data System (ADS)

    Lee, T. H.; Choi, K.-Y.; Kim, G.-H.; Suh, B. J.; Jang, Z. H.

    2014-11-01

    We developed a direct method to extract the zero-field zero-temperature anisotropy energy barrier distribution of magnetic particles in the form of a blocking-temperature distribution. The key idea is to modify measurement procedures slightly to make nonequilibrium magnetization calculations (including the time evolution of magnetization) easier. We applied this method to the biomagnetic molecule ferritin and successfully reproduced field-cool magnetization by using the extracted distribution. We find that the resulting distribution is more like an exponential type and that the distribution cannot be correlated simply to the widely known log-normal particle-size distribution. The method also allows us to determine the values of the zero-temperature coercivity and Bloch coefficient, which are in good agreement with those determined from other techniques.

  2. 47 CFR 76.1706 - Signal leakage logs and repair records.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... the probable cause of the leakage. The log shall be kept on file for a period of two years and shall... 47 Telecommunication 4 2010-10-01 2010-10-01 false Signal leakage logs and repair records. 76.1706... leakage logs and repair records. Cable operators shall maintain a log showing the date and location of...

  3. 47 CFR 76.1706 - Signal leakage logs and repair records.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... the probable cause of the leakage. The log shall be kept on file for a period of two years and shall... 47 Telecommunication 4 2011-10-01 2011-10-01 false Signal leakage logs and repair records. 76.1706... leakage logs and repair records. Cable operators shall maintain a log showing the date and location of...

  4. Evaluation and validity of a LORETA normative EEG database.

    PubMed

    Thatcher, R W; North, D; Biver, C

    2005-04-01

    To evaluate the reliability and validity of a Z-score normative EEG database for Low Resolution Electromagnetic Tomography (LORETA), EEG digital samples (2 second intervals sampled 128 Hz, 1 to 2 minutes eyes closed) were acquired from 106 normal subjects, and the cross-spectrum was computed and multiplied by the Key Institute's LORETA 2,394 gray matter pixel T Matrix. After a log10 transform or a Box-Cox transform the mean and standard deviation of the *.lor files were computed for each of the 2394 gray matter pixels, from 1 to 30 Hz, for each of the subjects. Tests of Gaussianity were computed in order to best approximate a normal distribution for each frequency and gray matter pixel. The relative sensitivity of a Z-score database was computed by measuring the approximation to a Gaussian distribution. The validity of the LORETA normative database was evaluated by the degree to which confirmed brain pathologies were localized using the LORETA normative database. Log10 and Box-Cox transforms approximated Gaussian distribution in the range of 95.64% to 99.75% accuracy. The percentage of normative Z-score values at 2 standard deviations ranged from 1.21% to 3.54%, and the percentage of Z-scores at 3 standard deviations ranged from 0% to 0.83%. Left temporal lobe epilepsy, right sensory motor hematoma and a right hemisphere stroke exhibited maximum Z-score deviations in the same locations as the pathologies. We conclude: (1) Adequate approximation to a Gaussian distribution can be achieved using LORETA by using a log10 transform or a Box-Cox transform and parametric statistics, (2) a Z-Score normative database is valid with adequate sensitivity when using LORETA, and (3) the Z-score LORETA normative database also consistently localized known pathologies to the expected Brodmann areas as an hypothesis test based on the surface EEG before computing LORETA.

  5. Combining counts and incidence data: an efficient approach for estimating the log-normal species abundance distribution and diversity indices.

    PubMed

    Bellier, Edwige; Grøtan, Vidar; Engen, Steinar; Schartau, Ann Kristin; Diserud, Ola H; Finstad, Anders G

    2012-10-01

    Obtaining accurate estimates of diversity indices is difficult because the number of species encountered in a sample increases with sampling intensity. We introduce a novel method that requires that the presence of species in a sample to be assessed while the counts of the number of individuals per species are only required for just a small part of the sample. To account for species included as incidence data in the species abundance distribution, we modify the likelihood function of the classical Poisson log-normal distribution. Using simulated community assemblages, we contrast diversity estimates based on a community sample, a subsample randomly extracted from the community sample, and a mixture sample where incidence data are added to a subsample. We show that the mixture sampling approach provides more accurate estimates than the subsample and at little extra cost. Diversity indices estimated from a freshwater zooplankton community sampled using the mixture approach show the same pattern of results as the simulation study. Our method efficiently increases the accuracy of diversity estimates and comprehension of the left tail of the species abundance distribution. We show how to choose the scale of sample size needed for a compromise between information gained, accuracy of the estimates and cost expended when assessing biological diversity. The sample size estimates are obtained from key community characteristics, such as the expected number of species in the community, the expected number of individuals in a sample and the evenness of the community.

  6. Management of Listeria monocytogenes in fermented sausages using the Food Safety Objective concept underpinned by stochastic modeling and meta-analysis.

    PubMed

    Mataragas, M; Alessandria, V; Rantsiou, K; Cocolin, L

    2015-08-01

    In the present work, a demonstration is made on how the risk from the presence of Listeria monocytogenes in fermented sausages can be managed using the concept of Food Safety Objective (FSO) aided by stochastic modeling (Bayesian analysis and Monte Carlo simulation) and meta-analysis. For this purpose, the ICMSF equation was used, which combines the initial level (H0) of the hazard and its subsequent reduction (ΣR) and/or increase (ΣI) along the production chain. Each element of the equation was described by a distribution to investigate the effect not only of the level of the hazard, but also the effect of the accompanying variability. The distribution of each element was determined by Bayesian modeling (H0) and meta-analysis (ΣR and ΣI). The output was a normal distribution N(-5.36, 2.56) (log cfu/g) from which the percentage of the non-conforming products, i.e. the fraction above the FSO of 2 log cfu/g, was estimated at 0.202%. Different control measures were examined such as lowering initial L. monocytogenes level and inclusion of an additional killing step along the process resulting in reduction of the non-conforming products from 0.195% to 0.003% based on the mean and/or square-root change of the normal distribution, and 0.001%, respectively. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. Evidence for tectonic, lithologic, and thermal controls on fracture system geometries in an andesitic high-temperature geothermal field

    NASA Astrophysics Data System (ADS)

    Massiot, Cécile; Nicol, Andrew; McNamara, David D.; Townend, John

    2017-08-01

    Analysis of fracture orientation, spacing, and thickness from acoustic borehole televiewer (BHTV) logs and cores in the andesite-hosted Rotokawa geothermal reservoir (New Zealand) highlights potential controls on the geometry of the fracture system. Cluster analysis of fracture orientations indicates four fracture sets. Probability distributions of fracture spacing and thickness measured on BHTV logs are estimated for each fracture set, using maximum likelihood estimations applied to truncated size distributions to account for sampling bias. Fracture spacing is dominantly lognormal, though two subordinate fracture sets have a power law spacing. This difference in spacing distributions may reflect the influence of the andesitic sequence stratification (lognormal) and tectonic faults (power law). Fracture thicknesses of 9-30 mm observed in BHTV logs, and 1-3 mm in cores, are interpreted to follow a power law. Fractures in thin sections (˜5 μm thick) do not fit this power law distribution, which, together with their orientation, reflect a change of controls on fracture thickness from uniform (such as thermal) controls at thin section scale to anisotropic (tectonic) at core and BHTV scales of observation. However, the ˜5% volumetric percentage of fractures within the rock at all three scales suggests a self-similar behavior in 3-D. Power law thickness distributions potentially associated with power law fluid flow rates, and increased connectivity where fracture sets intersect, may cause the large permeability variations that occur at hundred meter scales in the reservoir. The described fracture geometries can be incorporated into fracture and flow models to explore the roles of fracture connectivity, stress, and mineral precipitation/dissolution on permeability in such andesite-hosted geothermal systems.

  8. Comparative analysis of background EEG activity in childhood absence epilepsy during valproate treatment: a standardized, low-resolution, brain electromagnetic tomography (sLORETA) study.

    PubMed

    Shin, Jung-Hyun; Eom, Tae-Hoon; Kim, Young-Hoon; Chung, Seung-Yun; Lee, In-Goo; Kim, Jung-Min

    2017-07-01

    Valproate (VPA) is an antiepileptic drug (AED) used for initial monotherapy in treating childhood absence epilepsy (CAE). EEG might be an alternative approach to explore the effects of AEDs on the central nervous system. We performed a comparative analysis of background EEG activity during VPA treatment by using standardized, low-resolution, brain electromagnetic tomography (sLORETA) to explore the effect of VPA in patients with CAE. In 17 children with CAE, non-parametric statistical analyses using sLORETA were performed to compare the current density distribution of four frequency bands (delta, theta, alpha, and beta) between the untreated and treated condition. Maximum differences in current density were found in the left inferior frontal gyrus for the delta frequency band (log-F-ratio = -1.390, P > 0.05), the left medial frontal gyrus for the theta frequency band (log-F-ratio = -0.940, P > 0.05), the left inferior frontal gyrus for the alpha frequency band (log-F-ratio = -0.590, P > 0.05), and the left anterior cingulate for the beta frequency band (log-F-ratio = -1.318, P > 0.05). However, none of these differences were significant (threshold log-F-ratio = ±1.888, P < 0.01; threshold log-F-ratio = ±1.722, P < 0.05). Because EEG background is accepted as normal in CAE, VPA would not be expected to significantly change abnormal thalamocortical oscillations on a normal EEG background. Therefore, our results agree with currently accepted concepts but are not consistent with findings in some previous studies.

  9. Bayesian methods for uncertainty factor application for derivation of reference values.

    PubMed

    Simon, Ted W; Zhu, Yiliang; Dourson, Michael L; Beck, Nancy B

    2016-10-01

    In 2014, the National Research Council (NRC) published Review of EPA's Integrated Risk Information System (IRIS) Process that considers methods EPA uses for developing toxicity criteria for non-carcinogens. These criteria are the Reference Dose (RfD) for oral exposure and Reference Concentration (RfC) for inhalation exposure. The NRC Review suggested using Bayesian methods for application of uncertainty factors (UFs) to adjust the point of departure dose or concentration to a level considered to be without adverse effects for the human population. The NRC foresaw Bayesian methods would be potentially useful for combining toxicity data from disparate sources-high throughput assays, animal testing, and observational epidemiology. UFs represent five distinct areas for which both adjustment and consideration of uncertainty may be needed. NRC suggested UFs could be represented as Bayesian prior distributions, illustrated the use of a log-normal distribution to represent the composite UF, and combined this distribution with a log-normal distribution representing uncertainty in the point of departure (POD) to reflect the overall uncertainty. Here, we explore these suggestions and present a refinement of the methodology suggested by NRC that considers each individual UF as a distribution. From an examination of 24 evaluations from EPA's IRIS program, when individual UFs were represented using this approach, the geometric mean fold change in the value of the RfD or RfC increased from 3 to over 30, depending on the number of individual UFs used and the sophistication of the assessment. We present example calculations and recommendations for implementing the refined NRC methodology. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  10. Two Universality Properties Associated with the Monkey Model of Zipf's Law

    NASA Astrophysics Data System (ADS)

    Perline, Richard; Perline, Ron

    2016-03-01

    The distribution of word probabilities in the monkey model of Zipf's law is associated with two universality properties: (1) the power law exponent converges strongly to $-1$ as the alphabet size increases and the letter probabilities are specified as the spacings from a random division of the unit interval for any distribution with a bounded density function on $[0,1]$; and (2), on a logarithmic scale the version of the model with a finite word length cutoff and unequal letter probabilities is approximately normally distributed in the part of the distribution away from the tails. The first property is proved using a remarkably general limit theorem for the logarithm of sample spacings from Shao and Hahn, and the second property follows from Anscombe's central limit theorem for a random number of i.i.d. random variables. The finite word length model leads to a hybrid Zipf-lognormal mixture distribution closely related to work in other areas.

  11. Statistical hypothesis tests of some micrometeorological observations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    SethuRaman, S.; Tichler, J.

    Chi-square goodness-of-fit is used to test the hypothesis that the medium scale of turbulence in the atmospheric surface layer is normally distributed. Coefficients of skewness and excess are computed from the data. If the data are not normal, these coefficients are used in Edgeworth's asymptotic expansion of Gram-Charlier series to determine an altrnate probability density function. The observed data are then compared with the modified probability densities and the new chi-square values computed.Seventy percent of the data analyzed was either normal or approximatley normal. The coefficient of skewness g/sub 1/ has a good correlation with the chi-square values. Events withmore » vertical-barg/sub 1/vertical-bar<0.21 were normal to begin with and those with 0.21« less

  12. Carotid artery intima-media thickness measurement in children with normal and increased body mass index: a comparison of three techniques.

    PubMed

    El Jalbout, Ramy; Cloutier, Guy; Cardinal, Marie-Hélène Roy; Henderson, Mélanie; Lapierre, Chantale; Soulez, Gilles; Dubois, Josée

    2018-05-09

    Common carotid artery intima-media thickness is a marker of subclinical atherosclerosis. In children, increased intima-media thickness is associated with obesity and the risk of cardiovascular events in adulthood. To compare intima-media thickness measurements using B-mode ultrasound, radiofrequency (RF) echo tracking, and RF speckle probability distribution in children with normal and increased body mass index (BMI). We prospectively measured intima-media thickness in 120 children randomly selected from two groups of a longitudinal cohort: normal BMI and increased BMI, defined by BMI ≥85th percentile for age and gender. We followed Mannheim recommendations. We used M'Ath-Std for automated B-mode imaging, M-line processing of RF signal amplitude for RF echo tracking, and RF signal segmentation and averaging using probability distributions defining image speckle. Statistical analysis included Wilcoxon and Mann-Whitney tests, and Pearson correlation coefficient and intra-class correlation coefficient (ICC). Children were 10-13 years old (mean: 11.7 years); 61% were boys. The mean age was 11.4 years (range: 10.0-13.1 years) for the normal BMI group and 12.0 years (range: 10.1-13.5 years) for the increased BMI group. The normal BMI group included 58% boys and the increased BMI group 63% boys. RF echo tracking method was successful in 79 children as opposed to 114 for the B-mode method and all 120 for the probability distribution method. Techniques were weakly correlated: ICC=0.34 (95% confidence interval [CI]: 0.27-0.39). Intima-media thickness was significantly higher in the increased BMI than normal BMI group using the RF techniques and borderline for the B-mode technique. Mean differences between weight groups were: B-mode, 0.02 mm (95% CI: 0.00 to 0.04), P=0.05; RF echo tracking, 0.03 mm (95% CI: 0.01 to 0.05), P=0.01; and RF speckle probability distribution, 0.03 mm (95% CI: 0.01 to 0.05), P=0.002. Though techniques are not interchangeable, all showed increased intima-media thickness in children with increased BMI. RF echo tracking method had the lowest success rate at calculating intima-media thickness. For patient follow-up and cohort comparisons, the same technique should be used throughout.

  13. Chaotic advection at large Péclet number: Electromagnetically driven experiments, numerical simulations, and theoretical predictions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Figueroa, Aldo; Meunier, Patrice; Villermaux, Emmanuel

    2014-01-15

    We present a combination of experiment, theory, and modelling on laminar mixing at large Péclet number. The flow is produced by oscillating electromagnetic forces in a thin electrolytic fluid layer, leading to oscillating dipoles, quadrupoles, octopoles, and disordered flows. The numerical simulations are based on the Diffusive Strip Method (DSM) which was recently introduced (P. Meunier and E. Villermaux, “The diffusive strip method for scalar mixing in two-dimensions,” J. Fluid Mech. 662, 134–172 (2010)) to solve the advection-diffusion problem by combining Lagrangian techniques and theoretical modelling of the diffusion. Numerical simulations obtained with the DSM are in reasonable agreement withmore » quantitative dye visualization experiments of the scalar fields. A theoretical model based on log-normal Probability Density Functions (PDFs) of stretching factors, characteristic of homogeneous turbulence in the Batchelor regime, allows to predict the PDFs of scalar in agreement with numerical and experimental results. This model also indicates that the PDFs of scalar are asymptotically close to log-normal at late stages, except for the large concentration levels which correspond to low stretching factors.« less

  14. A Random Variable Transformation Process.

    ERIC Educational Resources Information Center

    Scheuermann, Larry

    1989-01-01

    Provides a short BASIC program, RANVAR, which generates random variates for various theoretical probability distributions. The seven variates include: uniform, exponential, normal, binomial, Poisson, Pascal, and triangular. (MVL)

  15. Joint Stochastic Inversion of Pre-Stack 3D Seismic Data and Well Logs for High Resolution Hydrocarbon Reservoir Characterization

    NASA Astrophysics Data System (ADS)

    Torres-Verdin, C.

    2007-05-01

    This paper describes the successful implementation of a new 3D AVA stochastic inversion algorithm to quantitatively integrate pre-stack seismic amplitude data and well logs. The stochastic inversion algorithm is used to characterize flow units of a deepwater reservoir located in the central Gulf of Mexico. Conventional fluid/lithology sensitivity analysis indicates that the shale/sand interface represented by the top of the hydrocarbon-bearing turbidite deposits generates typical Class III AVA responses. On the other hand, layer- dependent Biot-Gassmann analysis shows significant sensitivity of the P-wave velocity and density to fluid substitution. Accordingly, AVA stochastic inversion, which combines the advantages of AVA analysis with those of geostatistical inversion, provided quantitative information about the lateral continuity of the turbidite reservoirs based on the interpretation of inverted acoustic properties (P-velocity, S-velocity, density), and lithotype (sand- shale) distributions. The quantitative use of rock/fluid information through AVA seismic amplitude data, coupled with the implementation of co-simulation via lithotype-dependent multidimensional joint probability distributions of acoustic/petrophysical properties, yields accurate 3D models of petrophysical properties such as porosity and permeability. Finally, by fully integrating pre-stack seismic amplitude data and well logs, the vertical resolution of inverted products is higher than that of deterministic inversions methods.

  16. Thorium normalization as a hydrocarbon accumulation indicator for Lower Miocene rocks in Ras Ghara area, Gulf of Suez, Egypt

    NASA Astrophysics Data System (ADS)

    El-Khadragy, A. A.; Shazly, T. F.; AlAlfy, I. M.; Ramadan, M.; El-Sawy, M. Z.

    2018-06-01

    An exploration method has been developed using surface and aerial gamma-ray spectral measurements in prospecting petroleum in stratigraphic and structural traps. The Gulf of Suez is an important region for studying hydrocarbon potentiality in Egypt. Thorium normalization technique was applied on the sandstone reservoirs in the region to determine the hydrocarbon potentialities zones using the three spectrometric radioactive gamma ray-logs (eU, eTh and K% logs). This method was applied on the recorded gamma-ray spectrometric logs for Rudeis and Kareem Formations in Ras Ghara oil Field, Gulf of Suez, Egypt. The conventional well logs (gamma-ray, resistivity, neutron, density and sonic logs) were analyzed to determine the net pay zones in the study area. The agreement ratios between the thorium normalization technique and the results of the well log analyses are high, so the application of thorium normalization technique can be used as a guide for hydrocarbon accumulation in the study reservoir rocks.

  17. Probability density cloud as a geometrical tool to describe statistics of scattered light.

    PubMed

    Yaitskova, Natalia

    2017-04-01

    First-order statistics of scattered light is described using the representation of the probability density cloud, which visualizes a two-dimensional distribution for complex amplitude. The geometric parameters of the cloud are studied in detail and are connected to the statistical properties of phase. The moment-generating function for intensity is obtained in a closed form through these parameters. An example of exponentially modified normal distribution is provided to illustrate the functioning of this geometrical approach.

  18. A probabilistic approach to photovoltaic generator performance prediction

    NASA Astrophysics Data System (ADS)

    Khallat, M. A.; Rahman, S.

    1986-09-01

    A method for predicting the performance of a photovoltaic (PV) generator based on long term climatological data and expected cell performance is described. The equations for cell model formulation are provided. Use of the statistical model for characterizing the insolation level is discussed. The insolation data is fitted to appropriate probability distribution functions (Weibull, beta, normal). The probability distribution functions are utilized to evaluate the capacity factors of PV panels or arrays. An example is presented revealing the applicability of the procedure.

  19. Faster search by lackadaisical quantum walk

    NASA Astrophysics Data System (ADS)

    Wong, Thomas G.

    2018-03-01

    In the typical model, a discrete-time coined quantum walk searching the 2D grid for a marked vertex achieves a success probability of O(1/log N) in O(√{N log N}) steps, which with amplitude amplification yields an overall runtime of O(√{N} log N). We show that making the quantum walk lackadaisical or lazy by adding a self-loop of weight 4 / N to each vertex speeds up the search, causing the success probability to reach a constant near 1 in O(√{N log N}) steps, thus yielding an O(√{log N}) improvement over the typical, loopless algorithm. This improved runtime matches the best known quantum algorithms for this search problem. Our results are based on numerical simulations since the algorithm is not an instance of the abstract search algorithm.

  20. Kullback-Leibler information function and the sequential selection of experiments to discriminate among several linear models

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.

    1972-01-01

    The error variance of the process prior multivariate normal distributions of the parameters of the models are assumed to be specified, prior probabilities of the models being correct. A rule for termination of sampling is proposed. Upon termination, the model with the largest posterior probability is chosen as correct. If sampling is not terminated, posterior probabilities of the models and posterior distributions of the parameters are computed. An experiment was chosen to maximize the expected Kullback-Leibler information function. Monte Carlo simulation experiments were performed to investigate large and small sample behavior of the sequential adaptive procedure.

  1. Transformation of arbitrary distributions to the normal distribution with application to EEG test-retest reliability.

    PubMed

    van Albada, S J; Robinson, P A

    2007-04-15

    Many variables in the social, physical, and biosciences, including neuroscience, are non-normally distributed. To improve the statistical properties of such data, or to allow parametric testing, logarithmic or logit transformations are often used. Box-Cox transformations or ad hoc methods are sometimes used for parameters for which no transformation is known to approximate normality. However, these methods do not always give good agreement with the Gaussian. A transformation is discussed that maps probability distributions as closely as possible to the normal distribution, with exact agreement for continuous distributions. To illustrate, the transformation is applied to a theoretical distribution, and to quantitative electroencephalographic (qEEG) measures from repeat recordings of 32 subjects which are highly non-normal. Agreement with the Gaussian was better than using logarithmic, logit, or Box-Cox transformations. Since normal data have previously been shown to have better test-retest reliability than non-normal data under fairly general circumstances, the implications of our transformation for the test-retest reliability of parameters were investigated. Reliability was shown to improve with the transformation, where the improvement was comparable to that using Box-Cox. An advantage of the general transformation is that it does not require laborious optimization over a range of parameters or a case-specific choice of form.

  2. Wavefront-Guided Scleral Lens Correction in Keratoconus

    PubMed Central

    Marsack, Jason D.; Ravikumar, Ayeswarya; Nguyen, Chi; Ticak, Anita; Koenig, Darren E.; Elswick, James D.; Applegate, Raymond A.

    2014-01-01

    Purpose To examine the performance of state-of-the-art wavefront-guided scleral contact lenses (wfgSCLs) on a sample of keratoconic eyes, with emphasis on performance quantified with visual quality metrics; and to provide a detailed discussion of the process used to design, manufacture and evaluate wfgSCLs. Methods Fourteen eyes of 7 subjects with keratoconus were enrolled and a wfgSCL was designed for each eye. High-contrast visual acuity and visual quality metrics were used to assess the on-eye performance of the lenses. Results The wfgSCL provided statistically lower levels of both lower-order RMS (p < 0.001) and higher-order RMS (p < 0.02) than an intermediate spherical equivalent scleral contact lens. The wfgSCL provided lower levels of lower-order RMS than a normal group of well-corrected observers (p < < 0.001). However, the wfgSCL does not provide less higher-order RMS than the normal group (p = 0.41). Of the 14 eyes studied, 10 successfully reached the exit criteria, achieving residual higher-order root mean square wavefront error (HORMS) less than or within 1 SD of the levels experienced by normal, age-matched subjects. In addition, measures of visual image quality (logVSX, logNS and logLIB) for the 10 eyes were well distributed within the range of values seen in normal eyes. However, visual performance as measured by high contrast acuity did not reach normal, age-matched levels, which is in agreement with prior results associated with the acute application of wavefront correction to KC eyes. Conclusions Wavefront-guided scleral contact lenses are capable of optically compensating for the deleterious effects of higher-order aberration concomitant with the disease, and can provide visual image quality equivalent to that seen in normal eyes. Longer duration studies are needed to assess whether the visual system of the highly aberrated eye wearing a wfgSCL is capable of producing visual performance levels typical of the normal population. PMID:24830371

  3. Vacuum quantum stress tensor fluctuations: A diagonalization approach

    NASA Astrophysics Data System (ADS)

    Schiappacasse, Enrico D.; Fewster, Christopher J.; Ford, L. H.

    2018-01-01

    Large vacuum fluctuations of a quantum stress tensor can be described by the asymptotic behavior of its probability distribution. Here we focus on stress tensor operators which have been averaged with a sampling function in time. The Minkowski vacuum state is not an eigenstate of the time-averaged operator, but can be expanded in terms of its eigenstates. We calculate the probability distribution and the cumulative probability distribution for obtaining a given value in a measurement of the time-averaged operator taken in the vacuum state. In these calculations, we study a specific operator that contributes to the stress-energy tensor of a massless scalar field in Minkowski spacetime, namely, the normal ordered square of the time derivative of the field. We analyze the rate of decrease of the tail of the probability distribution for different temporal sampling functions, such as compactly supported functions and the Lorentzian function. We find that the tails decrease relatively slowly, as exponentials of fractional powers, in agreement with previous work using the moments of the distribution. Our results lend additional support to the conclusion that large vacuum stress tensor fluctuations are more probable than large thermal fluctuations, and may have observable effects.

  4. Quantitative evaluation of Alzheimer's disease

    NASA Astrophysics Data System (ADS)

    Duchesne, S.; Frisoni, G. B.

    2009-02-01

    We propose a single, quantitative metric called the disease evaluation factor (DEF) and assess its efficiency at estimating disease burden in normal, control subjects (CTRL) and probable Alzheimer's disease (AD) patients. The study group consisted in 75 patients with a diagnosis of probable AD and 75 age-matched normal CTRL without neurological or neuropsychological deficit. We calculated a reference eigenspace of MRI appearance from reference data, in which our CTRL and probable AD subjects were projected. We then calculated the multi-dimensional hyperplane separating the CTRL and probable AD groups. The DEF was estimated via a multidimensional weighted distance of eigencoordinates for a given subject and the CTRL group mean, along salient principal components forming the separating hyperplane. We used quantile plots, Kolmogorov-Smirnov and χ2 tests to compare the DEF values and test that their distribution was normal. We used a linear discriminant test to separate CTRL from probable AD based on the DEF factor, and reached an accuracy of 87%. A quantitative biomarker in AD would act as an important surrogate marker of disease status and progression.

  5. Postfragmentation density function for bacterial aggregates in laminar flow.

    PubMed

    Byrne, Erin; Dzul, Steve; Solomon, Michael; Younger, John; Bortz, David M

    2011-04-01

    The postfragmentation probability density of daughter flocs is one of the least well-understood aspects of modeling flocculation. We use three-dimensional positional data of Klebsiella pneumoniae bacterial flocs in suspension and the knowledge of hydrodynamic properties of a laminar flow field to construct a probability density function of floc volumes after a fragmentation event. We provide computational results which predict that the primary fragmentation mechanism for large flocs is erosion. The postfragmentation probability density function has a strong dependence on the size of the original floc and indicates that most fragmentation events result in clumps of one to three bacteria eroding from the original floc. We also provide numerical evidence that exhaustive fragmentation yields a limiting density inconsistent with the log-normal density predicted in the literature, most likely due to the heterogeneous nature of K. pneumoniae flocs. To support our conclusions, artificial flocs were generated and display similar postfragmentation density and exhaustive fragmentation. ©2011 American Physical Society

  6. New S control chart using skewness correction method for monitoring process dispersion of skewed distributions

    NASA Astrophysics Data System (ADS)

    Atta, Abdu; Yahaya, Sharipah; Zain, Zakiyah; Ahmed, Zalikha

    2017-11-01

    Control chart is established as one of the most powerful tools in Statistical Process Control (SPC) and is widely used in industries. The conventional control charts rely on normality assumption, which is not always the case for industrial data. This paper proposes a new S control chart for monitoring process dispersion using skewness correction method for skewed distributions, named as SC-S control chart. Its performance in terms of false alarm rate is compared with various existing control charts for monitoring process dispersion, such as scaled weighted variance S chart (SWV-S); skewness correction R chart (SC-R); weighted variance R chart (WV-R); weighted variance S chart (WV-S); and standard S chart (STD-S). Comparison with exact S control chart with regards to the probability of out-of-control detections is also accomplished. The Weibull and gamma distributions adopted in this study are assessed along with the normal distribution. Simulation study shows that the proposed SC-S control chart provides good performance of in-control probabilities (Type I error) in almost all the skewness levels and sample sizes, n. In the case of probability of detection shift the proposed SC-S chart is closer to the exact S control chart than the existing charts for skewed distributions, except for the SC-R control chart. In general, the performance of the proposed SC-S control chart is better than all the existing control charts for monitoring process dispersion in the cases of Type I error and probability of detection shift.

  7. SU-E-J-182: Reproducibility of Tumor Motion Probability Distribution Function in Stereotactic Body Radiation Therapy of Lung Using Real-Time Tumor-Tracking Radiotherapy System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shiinoki, T; Hanazawa, H; Park, S

    2015-06-15

    Purpose: We aim to achieve new four-dimensional radiotherapy (4DRT) using the next generation real-time tumor-tracking (RTRT) system and flattening-filter-free techniques. To achieve new 4DRT, it is necessary to understand the respiratory motion of tumor. The purposes of this study were: 1.To develop the respiratory motion analysis tool using log files. 2.To evaluate the reproducibility of tumor motion probability distribution function (PDF) during stereotactic body RT (SBRT) of lung tumor. Methods: Seven patients having fiducial markers closely implanted to the lung tumor were enrolled in this study. The positions of fiducial markers were measured using the RTRT system (Mitsubishi Electronics Co.,more » JP) and recorded as two types of log files during the course of SBRT. For each patients, tumor motion range and tumor motion PDFs in left-right (LR), anterior-posterior (AP) and superior-inferior (SI) directions were calculated using log files of all beams per fraction (PDFn). Fractional PDF reproducibility (Rn) was calculated as Kullback-Leibler (KL) divergence between PDF1 and PDFn of tumor motion. The mean of Rn (Rm) was calculated for each patient and correlated to the patient’s mean tumor motion range (Am). The change of Rm during the course of SBRT was also evluated. These analyses were performed using in-house developed software. Results: The Rm were 0.19 (0.07–0.30), 0.14 (0.07–0.32) and 0.16 (0.09–0.28) in LR, AP and SI directions, respectively. The Am were 5.11 mm (2.58–9.99 mm), 7.81 mm (2.87–15.57 mm) and 11.26 mm (3.80–21.27 mm) in LR, AP and SI directions, respectively. The PDF reproducibility decreased as the tumor motion range increased in AP and SI direction. That decreased slightly through the course of RT in SI direction. Conclusion: We developed the respiratory motion analysis tool for 4DRT using log files and quantified the range and reproducibility of respiratory motion for lung tumors.« less

  8. Bayesian analysis of stochastic volatility-in-mean model with leverage and asymmetrically heavy-tailed error using generalized hyperbolic skew Student’s t-distribution*

    PubMed Central

    Leão, William L.; Chen, Ming-Hui

    2017-01-01

    A stochastic volatility-in-mean model with correlated errors using the generalized hyperbolic skew Student-t (GHST) distribution provides a robust alternative to the parameter estimation for daily stock returns in the absence of normality. An efficient Markov chain Monte Carlo (MCMC) sampling algorithm is developed for parameter estimation. The deviance information, the Bayesian predictive information and the log-predictive score criterion are used to assess the fit of the proposed model. The proposed method is applied to an analysis of the daily stock return data from the Standard & Poor’s 500 index (S&P 500). The empirical results reveal that the stochastic volatility-in-mean model with correlated errors and GH-ST distribution leads to a significant improvement in the goodness-of-fit for the S&P 500 index returns dataset over the usual normal model. PMID:29333210

  9. Bayesian analysis of stochastic volatility-in-mean model with leverage and asymmetrically heavy-tailed error using generalized hyperbolic skew Student's t-distribution.

    PubMed

    Leão, William L; Abanto-Valle, Carlos A; Chen, Ming-Hui

    2017-01-01

    A stochastic volatility-in-mean model with correlated errors using the generalized hyperbolic skew Student-t (GHST) distribution provides a robust alternative to the parameter estimation for daily stock returns in the absence of normality. An efficient Markov chain Monte Carlo (MCMC) sampling algorithm is developed for parameter estimation. The deviance information, the Bayesian predictive information and the log-predictive score criterion are used to assess the fit of the proposed model. The proposed method is applied to an analysis of the daily stock return data from the Standard & Poor's 500 index (S&P 500). The empirical results reveal that the stochastic volatility-in-mean model with correlated errors and GH-ST distribution leads to a significant improvement in the goodness-of-fit for the S&P 500 index returns dataset over the usual normal model.

  10. Modeling pore corrosion in normally open gold- plated copper connectors.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Battaile, Corbett Chandler; Moffat, Harry K.; Sun, Amy Cha-Tien

    2008-09-01

    The goal of this study is to model the electrical response of gold plated copper electrical contacts exposed to a mixed flowing gas stream consisting of air containing 10 ppb H{sub 2}S at 30 C and a relative humidity of 70%. This environment accelerates the attack normally observed in a light industrial environment (essentially a simplified version of the Battelle Class 2 environment). Corrosion rates were quantified by measuring the corrosion site density, size distribution, and the macroscopic electrical resistance of the aged surface as a function of exposure time. A pore corrosion numerical model was used to predict bothmore » the growth of copper sulfide corrosion product which blooms through defects in the gold layer and the resulting electrical contact resistance of the aged surface. Assumptions about the distribution of defects in the noble metal plating and the mechanism for how corrosion blooms affect electrical contact resistance were needed to complete the numerical model. Comparisons are made to the experimentally observed number density of corrosion sites, the size distribution of corrosion product blooms, and the cumulative probability distribution of the electrical contact resistance. Experimentally, the bloom site density increases as a function of time, whereas the bloom size distribution remains relatively independent of time. These two effects are included in the numerical model by adding a corrosion initiation probability proportional to the surface area along with a probability for bloom-growth extinction proportional to the corrosion product bloom volume. The cumulative probability distribution of electrical resistance becomes skewed as exposure time increases. While the electrical contact resistance increases as a function of time for a fraction of the bloom population, the median value remains relatively unchanged. In order to model this behavior, the resistance calculated for large blooms has been weighted more heavily.« less

  11. Evaluation of bacterial run and tumble motility parameters through trajectory analysis

    NASA Astrophysics Data System (ADS)

    Liang, Xiaomeng; Lu, Nanxi; Chang, Lin-Ching; Nguyen, Thanh H.; Massoudieh, Arash

    2018-04-01

    In this paper, a method for extraction of the behavior parameters of bacterial migration based on the run and tumble conceptual model is described. The methodology is applied to the microscopic images representing the motile movement of flagellated Azotobacter vinelandii. The bacterial cells are considered to change direction during both runs and tumbles as is evident from the movement trajectories. An unsupervised cluster analysis was performed to fractionate each bacterial trajectory into run and tumble segments, and then the distribution of parameters for each mode were extracted by fitting mathematical distributions best representing the data. A Gaussian copula was used to model the autocorrelation in swimming velocity. For both run and tumble modes, Gamma distribution was found to fit the marginal velocity best, and Logistic distribution was found to represent better the deviation angle than other distributions considered. For the transition rate distribution, log-logistic distribution and log-normal distribution, respectively, was found to do a better job than the traditionally agreed exponential distribution. A model was then developed to mimic the motility behavior of bacteria at the presence of flow. The model was applied to evaluate its ability to describe observed patterns of bacterial deposition on surfaces in a micro-model experiment with an approach velocity of 200 μm/s. It was found that the model can qualitatively reproduce the attachment results of the micro-model setting.

  12. Normal probabilities for Vandenberg AFB wind components - monthly reference periods for all flight azimuths, 0- to 70-km altitudes

    NASA Technical Reports Server (NTRS)

    Falls, L. W.

    1975-01-01

    Vandenberg Air Force Base (AFB), California, wind component statistics are presented to be used for aerospace engineering applications that require component wind probabilities for various flight azimuths and selected altitudes. The normal (Gaussian) distribution is presented as a statistical model to represent component winds at Vandenberg AFB. Head tail, and crosswind components are tabulated for all flight azimuths for altitudes from 0 to 70 km by monthly reference periods. Wind components are given for 11 selected percentiles ranging from 0.135 percent to 99.865 percent for each month. The results of statistical goodness-of-fit tests are presented to verify the use of the Gaussian distribution as an adequate model to represent component winds at Vandenberg AFB.

  13. Normal probabilities for Cape Kennedy wind components: Monthly reference periods for all flight azimuths. Altitudes 0 to 70 kilometers

    NASA Technical Reports Server (NTRS)

    Falls, L. W.

    1973-01-01

    This document replaces Cape Kennedy empirical wind component statistics which are presently being used for aerospace engineering applications that require component wind probabilities for various flight azimuths and selected altitudes. The normal (Gaussian) distribution is presented as an adequate statistical model to represent component winds at Cape Kennedy. Head-, tail-, and crosswind components are tabulated for all flight azimuths for altitudes from 0 to 70 km by monthly reference periods. Wind components are given for 11 selected percentiles ranging from 0.135 percent to 99,865 percent for each month. Results of statistical goodness-of-fit tests are presented to verify the use of the Gaussian distribution as an adequate model to represent component winds at Cape Kennedy, Florida.

  14. Aggregate and Individual Replication Probability within an Explicit Model of the Research Process

    ERIC Educational Resources Information Center

    Miller, Jeff; Schwarz, Wolf

    2011-01-01

    We study a model of the research process in which the true effect size, the replication jitter due to changes in experimental procedure, and the statistical error of effect size measurement are all normally distributed random variables. Within this model, we analyze the probability of successfully replicating an initial experimental result by…

  15. New spatial upscaling methods for multi-point measurements: From normal to p-normal

    NASA Astrophysics Data System (ADS)

    Liu, Feng; Li, Xin

    2017-12-01

    Careful attention must be given to determining whether the geophysical variables of interest are normally distributed, since the assumption of a normal distribution may not accurately reflect the probability distribution of some variables. As a generalization of the normal distribution, the p-normal distribution and its corresponding maximum likelihood estimation (the least power estimation, LPE) were introduced in upscaling methods for multi-point measurements. Six methods, including three normal-based methods, i.e., arithmetic average, least square estimation, block kriging, and three p-normal-based methods, i.e., LPE, geostatistics LPE and inverse distance weighted LPE are compared in two types of experiments: a synthetic experiment to evaluate the performance of the upscaling methods in terms of accuracy, stability and robustness, and a real-world experiment to produce real-world upscaling estimates using soil moisture data obtained from multi-scale observations. The results show that the p-normal-based methods produced lower mean absolute errors and outperformed the other techniques due to their universality and robustness. We conclude that introducing appropriate statistical parameters into an upscaling strategy can substantially improve the estimation, especially if the raw measurements are disorganized; however, further investigation is required to determine which parameter is the most effective among variance, spatial correlation information and parameter p.

  16. Probability theory for 3-layer remote sensing radiative transfer model: univariate case.

    PubMed

    Ben-David, Avishai; Davidson, Charles E

    2012-04-23

    A probability model for a 3-layer radiative transfer model (foreground layer, cloud layer, background layer, and an external source at the end of line of sight) has been developed. The 3-layer model is fundamentally important as the primary physical model in passive infrared remote sensing. The probability model is described by the Johnson family of distributions that are used as a fit for theoretically computed moments of the radiative transfer model. From the Johnson family we use the SU distribution that can address a wide range of skewness and kurtosis values (in addition to addressing the first two moments, mean and variance). In the limit, SU can also describe lognormal and normal distributions. With the probability model one can evaluate the potential for detecting a target (vapor cloud layer), the probability of observing thermal contrast, and evaluate performance (receiver operating characteristics curves) in clutter-noise limited scenarios. This is (to our knowledge) the first probability model for the 3-layer remote sensing geometry that treats all parameters as random variables and includes higher-order statistics. © 2012 Optical Society of America

  17. Extreme Mean and Its Applications

    NASA Technical Reports Server (NTRS)

    Swaroop, R.; Brownlow, J. D.

    1979-01-01

    Extreme value statistics obtained from normally distributed data are considered. An extreme mean is defined as the mean of p-th probability truncated normal distribution. An unbiased estimate of this extreme mean and its large sample distribution are derived. The distribution of this estimate even for very large samples is found to be nonnormal. Further, as the sample size increases, the variance of the unbiased estimate converges to the Cramer-Rao lower bound. The computer program used to obtain the density and distribution functions of the standardized unbiased estimate, and the confidence intervals of the extreme mean for any data are included for ready application. An example is included to demonstrate the usefulness of extreme mean application.

  18. Probability distributions of the electroencephalogram envelope of preterm infants.

    PubMed

    Saji, Ryoya; Hirasawa, Kyoko; Ito, Masako; Kusuda, Satoshi; Konishi, Yukuo; Taga, Gentaro

    2015-06-01

    To determine the stationary characteristics of electroencephalogram (EEG) envelopes for prematurely born (preterm) infants and investigate the intrinsic characteristics of early brain development in preterm infants. Twenty neurologically normal sets of EEGs recorded in infants with a post-conceptional age (PCA) range of 26-44 weeks (mean 37.5 ± 5.0 weeks) were analyzed. Hilbert transform was applied to extract the envelope. We determined the suitable probability distribution of the envelope and performed a statistical analysis. It was found that (i) the probability distributions for preterm EEG envelopes were best fitted by lognormal distributions at 38 weeks PCA or less, and by gamma distributions at 44 weeks PCA; (ii) the scale parameter of the lognormal distribution had positive correlations with PCA as well as a strong negative correlation with the percentage of low-voltage activity; (iii) the shape parameter of the lognormal distribution had significant positive correlations with PCA; (iv) the statistics of mode showed significant linear relationships with PCA, and, therefore, it was considered a useful index in PCA prediction. These statistics, including the scale parameter of the lognormal distribution and the skewness and mode derived from a suitable probability distribution, may be good indexes for estimating stationary nature in developing brain activity in preterm infants. The stationary characteristics, such as discontinuity, asymmetry, and unimodality, of preterm EEGs are well indicated by the statistics estimated from the probability distribution of the preterm EEG envelopes. Copyright © 2014 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

  19. How to Introduce Historically the Normal Distribution in Engineering Education: A Classroom Experiment

    ERIC Educational Resources Information Center

    Blanco, Monica; Ginovart, Marta

    2010-01-01

    Little has been explored with regard to introducing historical aspects in the undergraduate statistics classroom in engineering studies. This article focuses on the design, implementation and assessment of a specific activity concerning the introduction of the normal probability curve and related aspects from a historical dimension. Following a…

  20. Correlation between size distribution and luminescence properties of spool-shaped InAs quantum dots

    NASA Astrophysics Data System (ADS)

    Xie, H.; Prioli, R.; Torelly, G.; Liu, H.; Fischer, A. M.; Jakomin, R.; Mourão, R.; Kawabata, R.; Pires, M. P.; Souza, P. L.; Ponce, F. A.

    2017-05-01

    InAs QDs embedded in an AlGaAs matrix have been produced by MOVPE with a partial capping and annealing technique to achieve controllable QD energy levels that could be useful for solar cell applications. The resulted spool-shaped QDs are around 5 nm in height and have a log-normal diameter distribution, which is observed by TEM to range from 5 to 15 nm. Two photoluminescence peaks associated with QD emission are attributed to the ground and the first excited states transitions. The luminescence peak width is correlated with the distribution of QD diameters through the diameter dependent QD energy levels.

  1. Bubble-chip analysis of human origin distributions demonstrates on a genomic scale significant clustering into zones and significant association with transcription

    PubMed Central

    Mesner, Larry D.; Valsakumar, Veena; Karnani, Neerja; Dutta, Anindya; Hamlin, Joyce L.; Bekiranov, Stefan

    2011-01-01

    We have used a novel bubble-trapping procedure to construct nearly pure and comprehensive human origin libraries from early S- and log-phase HeLa cells, and from log-phase GM06990, a karyotypically normal lymphoblastoid cell line. When hybridized to ENCODE tiling arrays, these libraries illuminated 15.3%, 16.4%, and 21.8% of the genome in the ENCODE regions, respectively. Approximately half of the origin fragments cluster into zones, and their signals are generally higher than those of isolated fragments. Interestingly, initiation events are distributed about equally between genic and intergenic template sequences. While only 13.2% and 14.0% of genes within the ENCODE regions are actually transcribed in HeLa and GM06990 cells, 54.5% and 25.6% of zonal origin fragments overlap transcribed genes, most with activating chromatin marks in their promoters. Our data suggest that cell synchronization activates a significant number of inchoate origins. In addition, HeLa and GM06990 cells activate remarkably different origin populations. Finally, there is only moderate concordance between the log-phase HeLa bubble map and published maps of small nascent strands for this cell line. PMID:21173031

  2. Log-gamma linear-mixed effects models for multiple outcomes with application to a longitudinal glaucoma study

    PubMed Central

    Zhang, Peng; Luo, Dandan; Li, Pengfei; Sharpsten, Lucie; Medeiros, Felipe A.

    2015-01-01

    Glaucoma is a progressive disease due to damage in the optic nerve with associated functional losses. Although the relationship between structural and functional progression in glaucoma is well established, there is disagreement on how this association evolves over time. In addressing this issue, we propose a new class of non-Gaussian linear-mixed models to estimate the correlations among subject-specific effects in multivariate longitudinal studies with a skewed distribution of random effects, to be used in a study of glaucoma. This class provides an efficient estimation of subject-specific effects by modeling the skewed random effects through the log-gamma distribution. It also provides more reliable estimates of the correlations between the random effects. To validate the log-gamma assumption against the usual normality assumption of the random effects, we propose a lack-of-fit test using the profile likelihood function of the shape parameter. We apply this method to data from a prospective observation study, the Diagnostic Innovations in Glaucoma Study, to present a statistically significant association between structural and functional change rates that leads to a better understanding of the progression of glaucoma over time. PMID:26075565

  3. Development of a voltage-dependent current noise algorithm for conductance-based stochastic modelling of auditory nerve fibres.

    PubMed

    Badenhorst, Werner; Hanekom, Tania; Hanekom, Johan J

    2016-12-01

    This study presents the development of an alternative noise current term and novel voltage-dependent current noise algorithm for conductance-based stochastic auditory nerve fibre (ANF) models. ANFs are known to have significant variance in threshold stimulus which affects temporal characteristics such as latency. This variance is primarily caused by the stochastic behaviour or microscopic fluctuations of the node of Ranvier's voltage-dependent sodium channels of which the intensity is a function of membrane voltage. Though easy to implement and low in computational cost, existing current noise models have two deficiencies: it is independent of membrane voltage, and it is unable to inherently determine the noise intensity required to produce in vivo measured discharge probability functions. The proposed algorithm overcomes these deficiencies while maintaining its low computational cost and ease of implementation compared to other conductance and Markovian-based stochastic models. The algorithm is applied to a Hodgkin-Huxley-based compartmental cat ANF model and validated via comparison of the threshold probability and latency distributions to measured cat ANF data. Simulation results show the algorithm's adherence to in vivo stochastic fibre characteristics such as an exponential relationship between the membrane noise and transmembrane voltage, a negative linear relationship between the log of the relative spread of the discharge probability and the log of the fibre diameter and a decrease in latency with an increase in stimulus intensity.

  4. Applications of multiscale change point detections to monthly stream flow and rainfall in Xijiang River in southern China, part I: correlation and variance

    NASA Astrophysics Data System (ADS)

    Zhu, Yuxiang; Jiang, Jianmin; Huang, Changxing; Chen, Yongqin David; Zhang, Qiang

    2018-04-01

    This article, as part I, introduces three algorithms and applies them to both series of the monthly stream flow and rainfall in Xijiang River, southern China. The three algorithms include (1) normalization of probability distribution, (2) scanning U test for change points in correlation between two time series, and (3) scanning F-test for change points in variances. The normalization algorithm adopts the quantile method to normalize data from a non-normal into the normal probability distribution. The scanning U test and F-test have three common features: grafting the classical statistics onto the wavelet algorithm, adding corrections for independence into each statistic criteria at given confidence respectively, and being almost objective and automatic detection on multiscale time scales. In addition, the coherency analyses between two series are also carried out for changes in variance. The application results show that the changes of the monthly discharge are still controlled by natural precipitation variations in Xijiang's fluvial system. Human activities disturbed the ecological balance perhaps in certain content and in shorter spells but did not violate the natural relationships of correlation and variance changes so far.

  5. Neural network prediction of carbonate lithofacies from well logs, Big Bow and Sand Arroyo Creek fields, Southwest Kansas

    USGS Publications Warehouse

    Qi, L.; Carr, T.R.

    2006-01-01

    In the Hugoton Embayment of southwestern Kansas, St. Louis Limestone reservoirs have relatively low recovery efficiencies, attributed to the heterogeneous nature of the oolitic deposits. This study establishes quantitative relationships between digital well logs and core description data, and applies these relationships in a probabilistic sense to predict lithofacies in 90 uncored wells across the Big Bow and Sand Arroyo Creek fields. In 10 wells, a single hidden-layer neural network based on digital well logs and core described lithofacies of the limestone depositional texture was used to train and establish a non-linear relationship between lithofacies assignments from detailed core descriptions and selected log curves. Neural network models were optimized by selecting six predictor variables and automated cross-validation with neural network parameters and then used to predict lithofacies on the whole data set of the 2023 half-foot intervals from the 10 cored wells with the selected network size of 35 and a damping parameter of 0.01. Predicted lithofacies results compared to actual lithofacies displays absolute accuracies of 70.37-90.82%. Incorporating adjoining lithofacies, within-one lithofacies improves accuracy slightly (93.72%). Digital logs from uncored wells were batch processed to predict lithofacies and probabilities related to each lithofacies at half-foot resolution corresponding to log units. The results were used to construct interpolated cross-sections and useful depositional patterns of St. Louis lithofacies were illustrated, e.g., the concentration of oolitic deposits (including lithofacies 5 and 6) along local highs and the relative dominance of quartz-rich carbonate grainstone (lithofacies 1) in the zones A and B of the St. Louis Limestone. Neural network techniques are applicable to other complex reservoirs, in which facies geometry and distribution are the key factors controlling heterogeneity and distribution of rock properties. Future work involves extension of the neural network to predict reservoir properties, and construction of three-dimensional geo-models. ?? 2005 Elsevier Ltd. All rights reserved.

  6. Power law versus exponential state transition dynamics: application to sleep-wake architecture.

    PubMed

    Chu-Shore, Jesse; Westover, M Brandon; Bianchi, Matt T

    2010-12-02

    Despite the common experience that interrupted sleep has a negative impact on waking function, the features of human sleep-wake architecture that best distinguish sleep continuity versus fragmentation remain elusive. In this regard, there is growing interest in characterizing sleep architecture using models of the temporal dynamics of sleep-wake stage transitions. In humans and other mammals, the state transitions defining sleep and wake bout durations have been described with exponential and power law models, respectively. However, sleep-wake stage distributions are often complex, and distinguishing between exponential and power law processes is not always straightforward. Although mono-exponential distributions are distinct from power law distributions, multi-exponential distributions may in fact resemble power laws by appearing linear on a log-log plot. To characterize the parameters that may allow these distributions to mimic one another, we systematically fitted multi-exponential-generated distributions with a power law model, and power law-generated distributions with multi-exponential models. We used the Kolmogorov-Smirnov method to investigate goodness of fit for the "incorrect" model over a range of parameters. The "zone of mimicry" of parameters that increased the risk of mistakenly accepting power law fitting resembled empiric time constants obtained in human sleep and wake bout distributions. Recognizing this uncertainty in model distinction impacts interpretation of transition dynamics (self-organizing versus probabilistic), and the generation of predictive models for clinical classification of normal and pathological sleep architecture.

  7. A Statistical Method for Estimating Missing GHG Emissions in Bottom-Up Inventories: The Case of Fossil Fuel Combustion in Industry in the Bogota Region, Colombia

    NASA Astrophysics Data System (ADS)

    Jimenez-Pizarro, R.; Rojas, A. M.; Pulido-Guio, A. D.

    2012-12-01

    The development of environmentally, socially and financially suitable greenhouse gas (GHG) mitigation portfolios requires detailed disaggregation of emissions by activity sector, preferably at the regional level. Bottom-up (BU) emission inventories are intrinsically disaggregated, but although detailed, they are frequently incomplete. Missing and erroneous activity data are rather common in emission inventories of GHG, criteria and toxic pollutants, even in developed countries. The fraction of missing and erroneous data can be rather large in developing country inventories. In addition, the cost and time for obtaining or correcting this information can be prohibitive or can delay the inventory development. This is particularly true for regional BU inventories in the developing world. Moreover, a rather common practice is to disregard or to arbitrarily impute low default activity or emission values to missing data, which typically leads to significant underestimation of the total emissions. Our investigation focuses on GHG emissions by fossil fuel combustion in industry in the Bogota Region, composed by Bogota and its adjacent, semi-rural area of influence, the Province of Cundinamarca. We found that the BU inventories for this sub-category substantially underestimate emissions when compared to top-down (TD) estimations based on sub-sector specific national fuel consumption data and regional energy intensities. Although both BU inventories have a substantial number of missing and evidently erroneous entries, i.e. information on fuel consumption per combustion unit per company, the validated energy use and emission data display clear and smooth frequency distributions, which can be adequately fitted to bimodal log-normal distributions. This is not unexpected as industrial plant sizes are typically log-normally distributed. Moreover, our statistical tests suggest that industrial sub-sectors, as classified by the International Standard Industrial Classification (ISIC), are also well represented by log-normal distributions. Using the validated data, we tested several missing data estimation procedures, including Montecarlo sampling of the real and fitted distributions, and a per ISIC estimation based on bootstrap-calculated mean values. These results will be presented and discussed in detail. Our results suggest that the accuracy of sub-sector BU emission inventories, particularly in developing regions, could be significantly improved if they are designed and carried out to be representative sub-samples (surveys) of the actual universe of emitters. A large fraction the missing data could be subsequently estimated by robust statistical procedures provided that most of the emitters were accounted by number and ISIC.

  8. Three ancient Montana fluvial systems: Pennsylvanian Tyler, Lower Cretaceous Muddy, and Upper Cretaceous Eagle - their reservoir and source rock distribution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shepard, B.

    The importance of using Holocene geology as a model in mapping reservoir and source rock distribution is demonstrated in three Montana river-related systems: alluvial valley, barrier bar, and distributary channel-prodelta. The Pennsylvanian Tyler Formation was deposited by a westward-flowing meandering-stream system controlled by an east-west-trending rift valley, and surrounded by backswamp deposits. It is underlain by its probable hydrocarbon source, the marine Mississippian Heath shale and limestone, and overlain locally by the lagoonal Pennsylvanian Bear Gulch Limestone. To date, about 90 million bbl of recoverable oil have been found in Tyler sands. The oil-producing Lower Cretaceous Muddy sandstones in themore » northern Powder River basin are considered to be barrier bars, encased in organic-rich shales, which are most probably the source rock. The Upper Cretaceous Eagle Sandstone in north-central Montana is a distributary channel system, similar to that of the modern Mississippi, which dumped highly carbonaceous materials into an organic-rich delta system. The Eagle now contains possibly enormous amounts of biogenic methane. By using Galveston Island and the modern Mississippi delta as models, in conjunction with employing electric log shapes and porosity logs, it is possible to map ancient fluvial patterns in the study areas. One can then predict the location of possible hydrocarbon accumulations in porous and permeable sand bodies, along with their encasing hydrocarbon source rocks.« less

  9. Nonpoint Source Solute Transport Normal to Aquifer Bedding in Heterogeneous, Markov Chain Random Fields

    NASA Astrophysics Data System (ADS)

    Zhang, H.; Harter, T.; Sivakumar, B.

    2005-12-01

    Facies-based geostatistical models have become important tools for the stochastic analysis of flow and transport processes in heterogeneous aquifers. However, little is known about the dependency of these processes on the parameters of facies- based geostatistical models. This study examines the nonpoint source solute transport normal to the major bedding plane in the presence of interconnected high conductivity (coarse- textured) facies in the aquifer medium and the dependence of the transport behavior upon the parameters of the constitutive facies model. A facies-based Markov chain geostatistical model is used to quantify the spatial variability of the aquifer system hydrostratigraphy. It is integrated with a groundwater flow model and a random walk particle transport model to estimate the solute travel time probability distribution functions (pdfs) for solute flux from the water table to the bottom boundary (production horizon) of the aquifer. The cases examined include, two-, three-, and four-facies models with horizontal to vertical facies mean length anisotropy ratios, ek, from 25:1 to 300:1, and with a wide range of facies volume proportions (e.g, from 5% to 95% coarse textured facies). Predictions of travel time pdfs are found to be significantly affected by the number of hydrostratigraphic facies identified in the aquifer, the proportions of coarse-textured sediments, the mean length of the facies (particularly the ratio of length to thickness of coarse materials), and - to a lesser degree - the juxtapositional preference among the hydrostratigraphic facies. In transport normal to the sedimentary bedding plane, travel time pdfs are not log- normally distributed as is often assumed. Also, macrodispersive behavior (variance of the travel time pdf) was found to not be a unique function of the conductivity variance. The skewness of the travel time pdf varied from negatively skewed to strongly positively skewed within the parameter range examined. We also show that the Markov chain approach may give significantly different travel time pdfs when compared to the more commonly used Gaussian random field approach even though the first and second order moments in the geostatistical distribution of the lnK field are identical. The choice of the appropriate geostatistical model is therefore critical in the assessment of nonpoint source transport.

  10. Simple display system of mechanical properties of cells and their dispersion.

    PubMed

    Shimizu, Yuji; Kihara, Takanori; Haghparast, Seyed Mohammad Ali; Yuba, Shunsuke; Miyake, Jun

    2012-01-01

    The mechanical properties of cells are unique indicators of their states and functions. Though, it is difficult to recognize the degrees of mechanical properties, due to small size of the cell and broad distribution of the mechanical properties. Here, we developed a simple virtual reality system for presenting the mechanical properties of cells and their dispersion using a haptic device and a PC. This system simulates atomic force microscopy (AFM) nanoindentation experiments for floating cells in virtual environments. An operator can virtually position the AFM spherical probe over a round cell with the haptic handle on the PC monitor and feel the force interaction. The Young's modulus of mesenchymal stem cells and HEK293 cells in the floating state was measured by AFM. The distribution of the Young's modulus of these cells was broad, and the distribution complied with a log-normal pattern. To represent the mechanical properties together with the cell variance, we used log-normal distribution-dependent random number determined by the mode and variance values of the Young's modulus of these cells. The represented Young's modulus was determined for each touching event of the probe surface and the cell object, and the haptic device-generating force was calculated using a Hertz model corresponding to the indentation depth and the fixed Young's modulus value. Using this system, we can feel the mechanical properties and their dispersion in each cell type in real time. This system will help us not only recognize the degrees of mechanical properties of diverse cells but also share them with others.

  11. Simple Display System of Mechanical Properties of Cells and Their Dispersion

    PubMed Central

    Shimizu, Yuji; Kihara, Takanori; Haghparast, Seyed Mohammad Ali; Yuba, Shunsuke; Miyake, Jun

    2012-01-01

    The mechanical properties of cells are unique indicators of their states and functions. Though, it is difficult to recognize the degrees of mechanical properties, due to small size of the cell and broad distribution of the mechanical properties. Here, we developed a simple virtual reality system for presenting the mechanical properties of cells and their dispersion using a haptic device and a PC. This system simulates atomic force microscopy (AFM) nanoindentation experiments for floating cells in virtual environments. An operator can virtually position the AFM spherical probe over a round cell with the haptic handle on the PC monitor and feel the force interaction. The Young's modulus of mesenchymal stem cells and HEK293 cells in the floating state was measured by AFM. The distribution of the Young's modulus of these cells was broad, and the distribution complied with a log-normal pattern. To represent the mechanical properties together with the cell variance, we used log-normal distribution-dependent random number determined by the mode and variance values of the Young's modulus of these cells. The represented Young's modulus was determined for each touching event of the probe surface and the cell object, and the haptic device-generating force was calculated using a Hertz model corresponding to the indentation depth and the fixed Young's modulus value. Using this system, we can feel the mechanical properties and their dispersion in each cell type in real time. This system will help us not only recognize the degrees of mechanical properties of diverse cells but also share them with others. PMID:22479595

  12. Potential source identification for aerosol concentrations over a site in Northwestern India

    NASA Astrophysics Data System (ADS)

    Payra, Swagata; Kumar, Pramod; Verma, Sunita; Prakash, Divya; Soni, Manish

    2016-03-01

    The collocated measurements of aerosols size distribution (ASD) and aerosol optical thickness (AOT) are analyzed simultaneously using Grimm aerosol spectrometer and MICROTOP II Sunphotometer over Jaipur, capital of Rajasthan in India. The contrast temperature characteristics during winter and summer seasons of year 2011 are investigated in the present study. The total aerosol number concentration (TANC, 0.3-20 μm) during winter season was observed higher than in summer time and it was dominated by fine aerosol number concentration (FANC < 2 μm). Particles smaller than 0.8 μm (at aerodynamic size) constitute ~ 99% of all particles in winter and ~ 90% of particles in summer season. However, particles greater than 2 μm contribute ~ 3% and ~ 0.2% in summer and winter seasons respectively. The aerosols optical thickness shows nearly similar AOT values during summer and winter but corresponding low Angstrom Exponent (AE) values during summer than winter, respectively. In this work, Potential Source Contribution Function (PSCF) analysis is applied to identify locations of sources that influenced concentrations of aerosols over study area in two different seasons. PSCF analysis shows that the dust particles from Thar Desert contribute significantly to the coarse aerosol number concentration (CANC). Higher values of the PSCF in north from Jaipur showed the industrial areas in northern India to be the likely sources of fine particles. The variation in size distribution of aerosols during two seasons is clearly reflected in the log normal size distribution curves. The log normal size distribution curves reveals that the particle size less than 0.8 μm is the key contributor in winter for higher ANC.

  13. Probabilistic approaches to accounting for data variability in the practical application of bioavailability in predicting aquatic risks from metals.

    PubMed

    Ciffroy, Philippe; Charlatchka, Rayna; Ferreira, Daniel; Marang, Laura

    2013-07-01

    The biotic ligand model (BLM) theoretically enables the derivation of environmental quality standards that are based on true bioavailable fractions of metals. Several physicochemical variables (especially pH, major cations, dissolved organic carbon, and dissolved metal concentrations) must, however, be assigned to run the BLM, but they are highly variable in time and space in natural systems. This article describes probabilistic approaches for integrating such variability during the derivation of risk indexes. To describe each variable using a probability density function (PDF), several methods were combined to 1) treat censored data (i.e., data below the limit of detection), 2) incorporate the uncertainty of the solid-to-liquid partitioning of metals, and 3) detect outliers. From a probabilistic perspective, 2 alternative approaches that are based on log-normal and Γ distributions were tested to estimate the probability of the predicted environmental concentration (PEC) exceeding the predicted non-effect concentration (PNEC), i.e., p(PEC/PNEC>1). The probabilistic approach was tested on 4 real-case studies based on Cu-related data collected from stations on the Loire and Moselle rivers. The approach described in this article is based on BLM tools that are freely available for end-users (i.e., the Bio-Met software) and on accessible statistical data treatments. This approach could be used by stakeholders who are involved in risk assessments of metals for improving site-specific studies. Copyright © 2013 SETAC.

  14. Small-Scale Spatial Fluctuations in the Soft X-Ray Background. Degree awarded by Maryland Univ., 2000

    NASA Technical Reports Server (NTRS)

    Kuntz, K. D.; White, Nicolas E. (Technical Monitor)

    2001-01-01

    In order to isolate the diffuse extragalactic component of the soft X-ray background, we have used a combination of ROSAT All-Sky Survey and IRAS 100 micron data to separate the soft X-ray background into five components. We find a Local Hot Bubble similar to that described by Snowden et al (1998). We make a first calculation of the contribution by unresolved Galactic stars to the diffuse background. We constrain the normalization of the Extragalactic Power Law (the contribution of the unresolved extragalactic point sources such as AGN, QSO'S, and normal galaxies) to 9.5 +/- 0.9 keV/(sq cm s sr keV), assuming a power-law index of 1.46. We show that the remaining emission, which is some combination of Galactic halo emission and the putative diffuse extragalactic emission, must be composed of at least two components which we have characterized by thermal spectra. The softer component has log T - 6.08 and a patchy distribution; thus it is most probably part of the Galactic halo. The harder component has log T - 6.46 and is nearly isotropic; some portion may be due to the Galactic halo and some portion may be due to the diffuse extragalactic emission. The maximum upper limit to the strength of the emission by the diffuse extragalactic component is the total of the hard component, approx. 7.4 +/- 1.0 keV/(sq cm s sr keV) in the 3/4 keV band. We have made the first direct measure of the fluctuations due to the diffuse extragalactic emission in the 3/4 keV band. Physical arguments suggest that small angular scale (approx. 10') fluctuations in the Local Hot Bubble or the Galactic halo will have very short dissipation times (about 10(exp 5) years). Therefore, the fluctuation spectrum of the soft X-ray background should measure the distribution of the diffuse extragalactic emission. Using mosaics of deep, overlapping PSPC pointings, we find an autocorrelation function value of approx. 0.0025 for 10' < theta < 20', and a value consistent with zero on larger scales. Measurement of the fluctuations with a delta I/I method produces consistent results.

  15. Posterior propriety for hierarchical models with log-likelihoods that have norm bounds

    DOE PAGES

    Michalak, Sarah E.; Morris, Carl N.

    2015-07-17

    Statisticians often use improper priors to express ignorance or to provide good frequency properties, requiring that posterior propriety be verified. Our paper addresses generalized linear mixed models, GLMMs, when Level I parameters have Normal distributions, with many commonly-used hyperpriors. It provides easy-to-verify sufficient posterior propriety conditions based on dimensions, matrix ranks, and exponentiated norm bounds, ENBs, for the Level I likelihood. Since many familiar likelihoods have ENBs, which is often verifiable via log-concavity and MLE finiteness, our novel use of ENBs permits unification of posterior propriety results and posterior MGF/moment results for many useful Level I distributions, including those commonlymore » used with multilevel generalized linear models, e.g., GLMMs and hierarchical generalized linear models, HGLMs. Furthermore, those who need to verify existence of posterior distributions or of posterior MGFs/moments for a multilevel generalized linear model given a proper or improper multivariate F prior as in Section 1 should find the required results in Sections 1 and 2 and Theorem 3 (GLMMs), Theorem 4 (HGLMs), or Theorem 5 (posterior MGFs/moments).« less

  16. Predicting Grade 3 Acute Diarrhea During Radiation Therapy for Rectal Cancer Using a Cutoff-Dose Logistic Regression Normal Tissue Complication Probability Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Robertson, John M., E-mail: jrobertson@beaumont.ed; Soehn, Matthias; Yan Di

    Purpose: Understanding the dose-volume relationship of small bowel irradiation and severe acute diarrhea may help reduce the incidence of this side effect during adjuvant treatment for rectal cancer. Methods and Materials: Consecutive patients treated curatively for rectal cancer were reviewed, and the maximum grade of acute diarrhea was determined. The small bowel was outlined on the treatment planning CT scan, and a dose-volume histogram was calculated for the initial pelvic treatment (45 Gy). Logistic regression models were fitted for varying cutoff-dose levels from 5 to 45 Gy in 5-Gy increments. The model with the highest LogLikelihood was used to developmore » a cutoff-dose normal tissue complication probability (NTCP) model. Results: There were a total of 152 patients (48% preoperative, 47% postoperative, 5% other), predominantly treated prone (95%) with a three-field technique (94%) and a protracted venous infusion of 5-fluorouracil (78%). Acute Grade 3 diarrhea occurred in 21%. The largest LogLikelihood was found for the cutoff-dose logistic regression model with 15 Gy as the cutoff-dose, although the models for 20 Gy and 25 Gy had similar significance. According to this model, highly significant correlations (p <0.001) between small bowel volumes receiving at least 15 Gy and toxicity exist in the considered patient population. Similar findings applied to both the preoperatively (p = 0.001) and postoperatively irradiated groups (p = 0.001). Conclusion: The incidence of Grade 3 diarrhea was significantly correlated with the volume of small bowel receiving at least 15 Gy using a cutoff-dose NTCP model.« less

  17. On the probability distribution function of the mass surface density of molecular clouds. II.

    NASA Astrophysics Data System (ADS)

    Fischera, Jörg

    2014-11-01

    The probability distribution function (PDF) of the mass surface density of molecular clouds provides essential information about the structure of molecular cloud gas and condensed structures out of which stars may form. In general, the PDF shows two basic components: a broad distribution around the maximum with resemblance to a log-normal function, and a tail at high mass surface densities attributed to turbulence and self-gravity. In a previous paper, the PDF of condensed structures has been analyzed and an analytical formula presented based on a truncated radial density profile, ρ(r) = ρc/ (1 + (r/r0)2)n/ 2 with central density ρc and inner radius r0, widely used in astrophysics as a generalization of physical density profiles. In this paper, the results are applied to analyze the PDF of self-gravitating, isothermal, pressurized, spherical (Bonnor-Ebert spheres) and cylindrical condensed structures with emphasis on the dependence of the PDF on the external pressure pext and on the overpressure q-1 = pc/pext, where pc is the central pressure. Apart from individual clouds, we also consider ensembles of spheres or cylinders, where effects caused by a variation of pressure ratio, a distribution of condensed cores within a turbulent gas, and (in case of cylinders) a distribution of inclination angles on the mean PDF are analyzed. The probability distribution of pressure ratios q-1 is assumed to be given by P(q-1) ∝ q-k1/ (1 + (q0/q)γ)(k1 + k2) /γ, where k1, γ, k2, and q0 are fixed parameters. The PDF of individual spheres with overpressures below ~100 is well represented by the PDF of a sphere with an analytical density profile with n = 3. At higher pressure ratios, the PDF at mass surface densities Σ ≪ Σ(0), where Σ(0) is the central mass surface density, asymptotically approaches the PDF of a sphere with n = 2. Consequently, the power-law asymptote at mass surface densities above the peak steepens from Psph(Σ) ∝ Σ-2 to Psph(Σ) ∝ Σ-3. The corresponding asymptote of the PDF of cylinders for the large q-1 is approximately given by Pcyl(Σ) ∝ Σ-4/3(1 - (Σ/Σ(0))2/3)-1/2. The distribution of overpressures q-1 produces a power-law asymptote at high mass surface densities given by ∝ Σ-2k2 - 1 (spheres) or ∝ Σ-2k2 (cylinders). Appendices are available in electronic form at http://www.aanda.org

  18. Weibull mixture regression for marginal inference in zero-heavy continuous outcomes.

    PubMed

    Gebregziabher, Mulugeta; Voronca, Delia; Teklehaimanot, Abeba; Santa Ana, Elizabeth J

    2017-06-01

    Continuous outcomes with preponderance of zero values are ubiquitous in data that arise from biomedical studies, for example studies of addictive disorders. This is known to lead to violation of standard assumptions in parametric inference and enhances the risk of misleading conclusions unless managed properly. Two-part models are commonly used to deal with this problem. However, standard two-part models have limitations with respect to obtaining parameter estimates that have marginal interpretation of covariate effects which are important in many biomedical applications. Recently marginalized two-part models are proposed but their development is limited to log-normal and log-skew-normal distributions. Thus, in this paper, we propose a finite mixture approach, with Weibull mixture regression as a special case, to deal with the problem. We use extensive simulation study to assess the performance of the proposed model in finite samples and to make comparisons with other family of models via statistical information and mean squared error criteria. We demonstrate its application on real data from a randomized controlled trial of addictive disorders. Our results show that a two-component Weibull mixture model is preferred for modeling zero-heavy continuous data when the non-zero part are simulated from Weibull or similar distributions such as Gamma or truncated Gauss.

  19. Rescaled earthquake recurrence time statistics: application to microrepeaters

    NASA Astrophysics Data System (ADS)

    Goltz, Christian; Turcotte, Donald L.; Abaimov, Sergey G.; Nadeau, Robert M.; Uchida, Naoki; Matsuzawa, Toru

    2009-01-01

    Slip on major faults primarily occurs during `characteristic' earthquakes. The recurrence statistics of characteristic earthquakes play an important role in seismic hazard assessment. A major problem in determining applicable statistics is the short sequences of characteristic earthquakes that are available worldwide. In this paper, we introduce a rescaling technique in which sequences can be superimposed to establish larger numbers of data points. We consider the Weibull and log-normal distributions, in both cases we rescale the data using means and standard deviations. We test our approach utilizing sequences of microrepeaters, micro-earthquakes which recur in the same location on a fault. It seems plausible to regard these earthquakes as a miniature version of the classic characteristic earthquakes. Microrepeaters are much more frequent than major earthquakes, leading to longer sequences for analysis. In this paper, we present results for the analysis of recurrence times for several microrepeater sequences from Parkfield, CA as well as NE Japan. We find that, once the respective sequence can be considered to be of sufficient stationarity, the statistics can be well fitted by either a Weibull or a log-normal distribution. We clearly demonstrate this fact by our technique of rescaled combination. We conclude that the recurrence statistics of the microrepeater sequences we consider are similar to the recurrence statistics of characteristic earthquakes on major faults.

  20. Study of sea-surface slope distribution and its effect on radar backscatter based on Global Precipitation Measurement Ku-band precipitation radar measurements

    NASA Astrophysics Data System (ADS)

    Yan, Qiushuang; Zhang, Jie; Fan, Chenqing; Wang, Jing; Meng, Junmin

    2018-01-01

    The collocated normalized radar backscattering cross-section measurements from the Global Precipitation Measurement (GPM) Ku-band precipitation radar (KuPR) and the winds from the moored buoys are used to study the effect of different sea-surface slope probability density functions (PDFs), including the Gaussian PDF, the Gram-Charlier PDF, and the Liu PDF, on the geometrical optics (GO) model predictions of the radar backscatter at low incidence angles (0 deg to 18 deg) at different sea states. First, the peakedness coefficient in the Liu distribution is determined using the collocations at the normal incidence angle, and the results indicate that the peakedness coefficient is a nonlinear function of the wind speed. Then, the performance of the modified Liu distribution, i.e., Liu distribution using the obtained peakedness coefficient estimate; the Gaussian distribution; and the Gram-Charlier distribution is analyzed. The results show that the GO model predictions with the modified Liu distribution agree best with the KuPR measurements, followed by the predictions with the Gaussian distribution, while the predictions with the Gram-Charlier distribution have larger differences as the total or the slick filtered, not the radar filtered, probability density is included in the distribution. The best-performing distribution changes with incidence angle and changes with wind speed.

  1. Estimating the carbon in coarse woody debris with perpendicular distance sampling. Chapter 6

    Treesearch

    Harry T. Valentine; Jeffrey H. Gove; Mark J. Ducey; Timothy G. Gregoire; Michael S. Williams

    2008-01-01

    Perpendicular distance sampling (PDS) is a design for sampling the population of pieces of coarse woody debris (logs) in a forested tract. In application, logs are selected at sample points with probability proportional to volume. Consequently, aggregate log volume per unit land area can be estimated from tallies of logs at sample points. In this chapter we provide...

  2. Mixed and Mixture Regression Models for Continuous Bounded Responses Using the Beta Distribution

    ERIC Educational Resources Information Center

    Verkuilen, Jay; Smithson, Michael

    2012-01-01

    Doubly bounded continuous data are common in the social and behavioral sciences. Examples include judged probabilities, confidence ratings, derived proportions such as percent time on task, and bounded scale scores. Dependent variables of this kind are often difficult to analyze using normal theory models because their distributions may be quite…

  3. Probability and the changing shape of response distributions for orientation.

    PubMed

    Anderson, Britt

    2014-11-18

    Spatial attention and feature-based attention are regarded as two independent mechanisms for biasing the processing of sensory stimuli. Feature attention is held to be a spatially invariant mechanism that advantages a single feature per sensory dimension. In contrast to the prediction of location independence, I found that participants were able to report the orientation of a briefly presented visual grating better for targets defined by high probability conjunctions of features and locations even when orientations and locations were individually uniform. The advantage for high-probability conjunctions was accompanied by changes in the shape of the response distributions. High-probability conjunctions had error distributions that were not normally distributed but demonstrated increased kurtosis. The increase in kurtosis could be explained as a change in the variances of the component tuning functions that comprise a population mixture. By changing the mixture distribution of orientation-tuned neurons, it is possible to change the shape of the discrimination function. This prompts the suggestion that attention may not "increase" the quality of perceptual processing in an absolute sense but rather prioritizes some stimuli over others. This results in an increased number of highly accurate responses to probable targets and, simultaneously, an increase in the number of very inaccurate responses. © 2014 ARVO.

  4. Multilevel mixed effects parametric survival models using adaptive Gauss-Hermite quadrature with application to recurrent events and individual participant data meta-analysis.

    PubMed

    Crowther, Michael J; Look, Maxime P; Riley, Richard D

    2014-09-28

    Multilevel mixed effects survival models are used in the analysis of clustered survival data, such as repeated events, multicenter clinical trials, and individual participant data (IPD) meta-analyses, to investigate heterogeneity in baseline risk and covariate effects. In this paper, we extend parametric frailty models including the exponential, Weibull and Gompertz proportional hazards (PH) models and the log logistic, log normal, and generalized gamma accelerated failure time models to allow any number of normally distributed random effects. Furthermore, we extend the flexible parametric survival model of Royston and Parmar, modeled on the log-cumulative hazard scale using restricted cubic splines, to include random effects while also allowing for non-PH (time-dependent effects). Maximum likelihood is used to estimate the models utilizing adaptive or nonadaptive Gauss-Hermite quadrature. The methods are evaluated through simulation studies representing clinically plausible scenarios of a multicenter trial and IPD meta-analysis, showing good performance of the estimation method. The flexible parametric mixed effects model is illustrated using a dataset of patients with kidney disease and repeated times to infection and an IPD meta-analysis of prognostic factor studies in patients with breast cancer. User-friendly Stata software is provided to implement the methods. Copyright © 2014 John Wiley & Sons, Ltd.

  5. Probability distributions for multimeric systems.

    PubMed

    Albert, Jaroslav; Rooman, Marianne

    2016-01-01

    We propose a fast and accurate method of obtaining the equilibrium mono-modal joint probability distributions for multimeric systems. The method necessitates only two assumptions: the copy number of all species of molecule may be treated as continuous; and, the probability density functions (pdf) are well-approximated by multivariate skew normal distributions (MSND). Starting from the master equation, we convert the problem into a set of equations for the statistical moments which are then expressed in terms of the parameters intrinsic to the MSND. Using an optimization package on Mathematica, we minimize a Euclidian distance function comprising of a sum of the squared difference between the left and the right hand sides of these equations. Comparison of results obtained via our method with those rendered by the Gillespie algorithm demonstrates our method to be highly accurate as well as efficient.

  6. VLBI survey of compact broad absorption line quasars with balnicity index BI = 0

    NASA Astrophysics Data System (ADS)

    Cegłowski, M.; Kunert-Bajraszewska, M.; Roskowiński, C.

    2015-06-01

    We present high-resolution observations, using both the European VLBI Network (EVN) at 1.7 GHz and the Very Long Baseline Array (VLBA) at 5 and 8.4 GHz, to image radio structures of 14 compact sources classified as broad absorption line (BAL) quasars based on the absorption index (AI). All sources but one were resolved, with the majority showing core-jet morphology typical for radio-loud quasars. We discuss in detail the most interesting cases. The high radio luminosities and small linear sizes of the observed objects indicate they are strong young active galactic nuclei. Nevertheless, the distribution of the radio-loudness parameter, log RI, of a larger sample of AI quasars shows that the objects observed by us constitute the most luminous, small subgroup of the AI population. Additionally, we report that for the radio-loudness parameter, the distribution of AI quasars and that for those selected using the traditional balnicity index differ significantly. Strong absorption is connected with lower log RI and thus probably larger viewing angles. Since the AI quasars have on average larger log RI, the orientation can mean that we see them less absorbed. However, we suggest that the orientation is not the only parameter that affects the detected absorption. That the strong absorption is associated with the weak radio emission is equally important and worth exploring.

  7. Studies of Isolated and Non-isolated Photospheric Bright Points in an Active Region Observed by the New Vacuum Solar Telescope

    NASA Astrophysics Data System (ADS)

    Liu, Yanxiao; Xiang, Yongyuan; Erdélyi, Robertus; Liu, Zhong; Li, Dong; Ning, Zongjun; Bi, Yi; Wu, Ning; Lin, Jun

    2018-03-01

    Properties of photospheric bright points (BPs) near an active region have been studied in TiO λ 7058 Å images observed by the New Vacuum Solar Telescope of the Yunnan Observatories. We developed a novel recognition method that was used to identify and track 2010 BPs. The observed evolving BPs are classified into isolated (individual) and non-isolated (where multiple BPs are observed to display splitting and merging behaviors) sets. About 35.1% of BPs are non-isolated. For both isolated and non-isolated BPs, the brightness varies from 0.8 to 1.3 times the average background intensity and follows a Gaussian distribution. The lifetimes of BPs follow a log-normal distribution, with characteristic lifetimes of (267 ± 140) s and (421 ± 255) s, respectively. Their size also follows log-normal distribution, with an average size of about (2.15 ± 0.74) × 104 km2 and (3.00 ± 1.31) × 104 km2 for area, and (163 ± 27) km and (191 ± 40) km for diameter, respectively. Our results indicate that regions with strong background magnetic field have higher BP number density and higher BP area coverage than regions with weak background field. Apparently, the brightness/size of BPs does not depend on the background field. Lifetimes in regions with strong background magnetic field are shorter than those in regions with weak background field, on average.

  8. Statistical Analysis of Hubble /WFC3 Transit Spectroscopy of Extrasolar Planets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fu, Guangwei; Deming, Drake; Knutson, Heather

    2017-10-01

    Transmission spectroscopy provides a window to study exoplanetary atmospheres, but that window is fogged by clouds and hazes. Clouds and haze introduce a degeneracy between the strength of gaseous absorption features and planetary physical parameters such as abundances. One way to break that degeneracy is via statistical studies. We collect all published HST /WFC3 transit spectra for 1.1–1.65 μ m water vapor absorption and perform a statistical study on potential correlations between the water absorption feature and planetary parameters. We fit the observed spectra with a template calculated for each planet using the Exo-transmit code. We express the magnitude ofmore » the water absorption in scale heights, thereby removing the known dependence on temperature, surface gravity, and mean molecular weight. We find that the absorption in scale heights has a positive baseline correlation with planetary equilibrium temperature; our hypothesis is that decreasing cloud condensation with increasing temperature is responsible for this baseline slope. However, the observed sample is also intrinsically degenerate in the sense that equilibrium temperature correlates with planetary mass. We compile the distribution of absorption in scale heights, and we find that this distribution is closer to log-normal than Gaussian. However, we also find that the distribution of equilibrium temperatures for the observed planets is similarly log-normal. This indicates that the absorption values are affected by observational bias, whereby observers have not yet targeted a sufficient sample of the hottest planets.« less

  9. Statistical Analysis of Hubble/WFC3 Transit Spectroscopy of Extrasolar Planets

    NASA Astrophysics Data System (ADS)

    Fu, Guangwei; Deming, Drake; Knutson, Heather; Madhusudhan, Nikku; Mandell, Avi; Fraine, Jonathan

    2018-01-01

    Transmission spectroscopy provides a window to study exoplanetary atmospheres, but that window is fogged by clouds and hazes. Clouds and haze introduce a degeneracy between the strength of gaseous absorption features and planetary physical parameters such as abundances. One way to break that degeneracy is via statistical studies. We collect all published HST/WFC3 transit spectra for 1.1-1.65 micron water vapor absorption, and perform a statistical study on potential correlations between the water absorption feature and planetary parameters. We fit the observed spectra with a template calculated for each planet using the Exo-Transmit code. We express the magnitude of the water absorption in scale heights, thereby removing the known dependence on temperature, surface gravity, and mean molecular weight. We find that the absorption in scale heights has a positive baseline correlation with planetary equilibrium temperature; our hypothesis is that decreasing cloud condensation with increasing temperature is responsible for this baseline slope. However, the observed sample is also intrinsically degenerate in the sense that equilibrium temperature correlates with planetary mass. We compile the distribution of absorption in scale heights, and we find that this distribution is closer to log-normal than Gaussian. However, we also find that the distribution of equilibrium temperatures for the observed planets is similarly log-normal. This indicates that the absorption values are affected by observational bias, whereby observers have not yet targeted a sufficient sample of the hottest planets.

  10. Statistical Analysis of Hubble/WFC3 Transit Spectroscopy of Extrasolar Planets

    NASA Astrophysics Data System (ADS)

    Fu, Guangwei; Deming, Drake; Knutson, Heather; Madhusudhan, Nikku; Mandell, Avi; Fraine, Jonathan

    2017-10-01

    Transmission spectroscopy provides a window to study exoplanetary atmospheres, but that window is fogged by clouds and hazes. Clouds and haze introduce a degeneracy between the strength of gaseous absorption features and planetary physical parameters such as abundances. One way to break that degeneracy is via statistical studies. We collect all published HST/WFC3 transit spectra for 1.1-1.65 μm water vapor absorption and perform a statistical study on potential correlations between the water absorption feature and planetary parameters. We fit the observed spectra with a template calculated for each planet using the Exo-transmit code. We express the magnitude of the water absorption in scale heights, thereby removing the known dependence on temperature, surface gravity, and mean molecular weight. We find that the absorption in scale heights has a positive baseline correlation with planetary equilibrium temperature; our hypothesis is that decreasing cloud condensation with increasing temperature is responsible for this baseline slope. However, the observed sample is also intrinsically degenerate in the sense that equilibrium temperature correlates with planetary mass. We compile the distribution of absorption in scale heights, and we find that this distribution is closer to log-normal than Gaussian. However, we also find that the distribution of equilibrium temperatures for the observed planets is similarly log-normal. This indicates that the absorption values are affected by observational bias, whereby observers have not yet targeted a sufficient sample of the hottest planets.

  11. On the intrinsic shape of the gamma-ray spectrum for Fermi blazars

    NASA Astrophysics Data System (ADS)

    Kang, Shi-Ju; Wu, Qingwen; Zheng, Yong-Gang; Yin, Yue; Song, Jia-Li; Zou, Hang; Feng, Jian-Chao; Dong, Ai-Jun; Wu, Zhong-Zu; Zhang, Zhi-Bin; Wu, Lin-Hui

    2018-05-01

    The curvature of the γ-ray spectrumin blazarsmay reflect the intrinsic distribution of emitting electrons, which will further give some information on the possible acceleration and cooling processes in the emitting region. The γ-ray spectra of Fermi blazars are normally fitted either by a single power-law (PL) or a log-normal (call Logarithmic Parabola, LP) form. The possible reason for this difference is not clear. We statistically explore this issue based on the different observational properties of 1419 Fermi blazars in the 3LAC Clean Sample.We find that the γ-ray flux (100MeV–100GeV) and variability index follow bimodal distributions for PL and LP blazars, where the γ-ray flux and variability index show a positive correlation. However, the distributions of γ-ray luminosity and redshift follow a unimodal distribution. Our results suggest that the bimodal distribution of γ-ray fluxes for LP and PL blazars may not be intrinsic and all blazars may have an intrinsically curved γ-ray spectrum, and the PL spectrum is just caused by the fitting effect due to less photons.

  12. Role of noise and agents’ convictions on opinion spreading in a three-state voter-like model

    NASA Astrophysics Data System (ADS)

    Crokidakis, Nuno

    2013-07-01

    In this work we study opinion formation in a voter-like model defined on a square lattice of linear size L. The agents may be in three different states, representing any public debate with three choices (yes, no, undecided). We consider heterogeneous agents that have different convictions about their opinions. These convictions limit the capacity of persuasion of the individuals during the interactions. Moreover, there is a noise p that represents the probability of an individual spontaneously changing his opinion to the undecided state. Our simulations suggest that the system reaches stationary states for all values of p, with consensus states occurring only for the noiseless case p = 0. In this case, the relaxation times are distributed according to a log-normal function, with the average value τ growing with the lattice size as τ ∼ Lα, where α ≈ 0.9. We found a threshold value p* ≈ 0.9 above which the stationary fraction of undecided agents is greater than the fraction of decided ones. We also study the consequences of the presence of external effects in the system, which models the influence of mass media on opinion formation.

  13. The shapes of column density PDFs. The importance of the last closed contour

    NASA Astrophysics Data System (ADS)

    Alves, João; Lombardi, Marco; Lada, Charles J.

    2017-10-01

    The probability distribution function of column density (PDF) has become the tool of choice for cloud structure analysis and star formation studies. Its simplicity is attractive, and the PDF could offer access to cloud physical parameters otherwise difficult to measure, but there has been some confusion in the literature on the definition of its completeness limit and shape at the low column density end. In this letter we use the natural definition of the completeness limit of a column density PDF, the last closed column density contour inside a surveyed region, and apply it to a set of large-scale maps of nearby molecular clouds. We conclude that there is no observational evidence for log-normal PDFs in these objects. We find that all studied molecular clouds have PDFs well described by power laws, including the diffuse cloud Polaris. Our results call for a new physical interpretation of the shape of the column density PDFs. We find that the slope of a cloud PDF is invariant to distance but not to the spatial arrangement of cloud material, and as such it is still a useful tool for investigating cloud structure.

  14. Rotations of large inertial cubes, cuboids, cones, and cylinders in turbulence

    NASA Astrophysics Data System (ADS)

    Pujara, Nimish; Oehmke, Theresa B.; Bordoloi, Ankur D.; Variano, Evan A.

    2018-05-01

    We conduct experiments to investigate the rotations of freely moving particles in a homogeneous isotropic turbulent flow. The particles are nearly neutrally buoyant and the particle size exceeds the Kolmogorov scale so that they are too large to be considered passive tracers. Particles of several different shapes are considered including those that break axisymmetry and fore-aft symmetry. We find that regardless of shape the mean-square particle angular velocity scales as deq -4 /3, where de q is the equivalent diameter of a volume-matched sphere. This scaling behavior is consistent with the notion that velocity differences across a length de q in the flow are responsible for particle rotation. We also find that the probability density functions (PDFs) of particle angular velocity collapse for particles of different shapes and similar de q. The significance of these results is that the rotations of an inertial, nonspherical particle are only functions of its volume and not its shape. The magnitude of particle angular velocity appears log-normally distributed and individual Cartesian components show long tails. With increasing de q, the tails of the PDF become less pronounced, meaning that extreme events of angular velocity become less common for larger particles.

  15. Influence of soil pH on the sorption of ionizable chemicals: modeling advances.

    PubMed

    Franco, Antonio; Fu, Wenjing; Trapp, Stefan

    2009-03-01

    The soil-water distribution coefficient of ionizable chemicals (K(d)) depends on the soil acidity, mainly because the pH governs speciation. Using pH-specific K(d) values normalized to organic carbon (K(OC)) from the literature, a method was developed to estimate the K(OC) of monovalent organic acids and bases. The regression considers pH-dependent speciation and species-specific partition coefficients, calculated from the dissociation constant (pK(a)) and the octanol-water partition coefficient of the neutral molecule (log P(n)). Probably because of the lower pH near the organic colloid-water interface, the optimal pH to model dissociation was lower than the bulk soil pH. The knowledge of the soil pH allows calculation of the fractions of neutral and ionic molecules in the system, thus improving the existing regression for acids. The same approach was not successful with bases, for which the impact of pH on the total sorption is contrasting. In fact, the shortcomings of the model assumptions affect the predictive power for acids and for bases differently. We evaluated accuracy and limitations of the regressions for their use in the environmental fate assessment of ionizable chemicals.

  16. Energy loss and inelastic diffraction of fast atoms at grazing incidence

    NASA Astrophysics Data System (ADS)

    Roncin, Philippe; Debiossac, Maxime; Oueslati, Hanene; Raouafi, Fayçal

    2018-07-01

    The diffraction of fast atoms at grazing incidence on crystal surfaces (GIFAD) was first interpreted only in terms of elastic diffraction from a perfectly periodic rigid surface with atoms fixed at equilibrium positions. Recently, a new approach has been proposed, referred here as the quantum binary collision model (QBCM). The QBCM takes into account both the elastic and inelastic momentum transfers via the Lamb-Dicke probability. It suggests that the shape of the inelastic diffraction profiles are log-normal distributions with a variance proportional to the nuclear energy loss deposited on the surface. For keV Neon atoms impinging on a LiF(0 0 1) surface under an incidence angle θ , the predictions of the QBCM in its analytic version are compared with numerical trajectory simulations. Some of the assumptions such as the planar continuous form, the possibility to neglect the role of lithium atoms and the influence of temperature are investigated. A specific energy loss dependence ΔE ∝θ7 is identified in the quasi-elastic regime merging progressively to the classical onset ΔE ∝θ3 . The ratio of these two predictions highlights the role of quantum effects in the energy loss.

  17. A mechanism producing power law etc. distributions

    NASA Astrophysics Data System (ADS)

    Li, Heling; Shen, Hongjun; Yang, Bin

    2017-07-01

    Power law distribution is playing an increasingly important role in the complex system study. Based on the insolvability of complex systems, the idea of incomplete statistics is utilized and expanded, three different exponential factors are introduced in equations about the normalization condition, statistical average and Shannon entropy, with probability distribution function deduced about exponential function, power function and the product form between power function and exponential function derived from Shannon entropy and maximal entropy principle. So it is shown that maximum entropy principle can totally replace equal probability hypothesis. Owing to the fact that power and probability distribution in the product form between power function and exponential function, which cannot be derived via equal probability hypothesis, can be derived by the aid of maximal entropy principle, it also can be concluded that maximal entropy principle is a basic principle which embodies concepts more extensively and reveals basic principles on motion laws of objects more fundamentally. At the same time, this principle also reveals the intrinsic link between Nature and different objects in human society and principles complied by all.

  18. Scaling laws and properties of compositional data

    NASA Astrophysics Data System (ADS)

    Buccianti, Antonella; Albanese, Stefano; Lima, AnnaMaria; Minolfi, Giulia; De Vivo, Benedetto

    2016-04-01

    Many random processes occur in geochemistry. Accurate predictions of the manner in which elements or chemical species interact each other are needed to construct models able to treat presence of random components. Geochemical variables actually observed are the consequence of several events, some of which may be poorly defined or imperfectly understood. Variables tend to change with time/space but, despite their complexity, may share specific common traits and it is possible to model them stochastically. Description of the frequency distribution of the geochemical abundances has been an important target of research, attracting attention for at least 100 years, starting with CLARKE (1889) and continued by GOLDSCHMIDT (1933) and WEDEPOHL (1955). However, it was AHRENS (1954a,b) who focussed on the effect of skewness distributions, for example the log-normal distribution, regarded by him as a fundamental law of geochemistry. Although modeling of frequency distributions with some probabilistic models (for example Gaussian, log-normal, Pareto) has been well discussed in several fields of application, little attention has been devoted to the features of compositional data. When compositional nature of data is taken into account, the most typical distribution models for compositions are the Dirichlet and the additive logistic normal (or normal on the simplex) (AITCHISON et al. 2003; MATEU-FIGUERAS et al. 2005; MATEU-FIGUERAS and PAWLOWSKY-GLAHN 2008; MATEU-FIGUERAS et al. 2013). As an alternative, because compositional data have to be transformed from simplex space to real space, coordinates obtained by the ilr transformation or by application of the concept of balance can be analyzed by classical methods (EGOZCUE et al. 2003). In this contribution an approach coherent with the properties of compositional information is proposed and used to investigate the shape of the frequency distribution of compositional data. The purpose is to understand data-generation processes from the perspective of compositional theory. The approach is based on the use of the isometric log-ratio transformation, characterized by theoretical and practical advantages, but requiring a more complex geochemical interpretation compared with the investigation of single variables. The proposed methodology directs attention to model the frequency distributions of more complex indices, linking all the terms of the composition to better represent the dynamics of geochemical processes. An example of its application is presented and discussed by considering topsoil geochemistry of Campania Region (southern Italy). The investigated multi-element data archive contains, among others, Al, As, B, Ba, Ca, Co, Cr, Cu, Fe, K, La, Mg, Mn, Mo, Na, Ni, P, Pb, Sr, Th, Ti, V and Zn (mg/kg) contents determined in 3535 new topsoils as well as information on coordinates, geology, land cover. (BUCCIANTI et al., 2015). AHRENS, L. ,1954a. Geochim. Cosm. Acta 6, 121-131. AHRENS, L., 1954b. Geochim. Cosm. Acta 5, 49-73. AITCHISON, J., et al., 2003. Math Geol 35(6), 667-680. BUCCIANTI et al., 2015. Jour. Geoch. Explor., 159, 302-316. CLARKE, F., 1889. Phil. Society of Washington Bull. 11, 131-142. EGOZCUE, J.J. et al., 2003. Math Geol 35(3), 279-300. MATEU-FIGUERAS, G. et al, (2005), Stoch. Environ. Res. Risk Ass. 19(3), 205-214.

  19. Does Litter Size Variation Affect Models of Terrestrial Carnivore Extinction Risk and Management?

    PubMed Central

    Devenish-Nelson, Eleanor S.; Stephens, Philip A.; Harris, Stephen; Soulsbury, Carl; Richards, Shane A.

    2013-01-01

    Background Individual variation in both survival and reproduction has the potential to influence extinction risk. Especially for rare or threatened species, reliable population models should adequately incorporate demographic uncertainty. Here, we focus on an important form of demographic stochasticity: variation in litter sizes. We use terrestrial carnivores as an example taxon, as they are frequently threatened or of economic importance. Since data on intraspecific litter size variation are often sparse, it is unclear what probability distribution should be used to describe the pattern of litter size variation for multiparous carnivores. Methodology/Principal Findings We used litter size data on 32 terrestrial carnivore species to test the fit of 12 probability distributions. The influence of these distributions on quasi-extinction probabilities and the probability of successful disease control was then examined for three canid species – the island fox Urocyon littoralis, the red fox Vulpes vulpes, and the African wild dog Lycaon pictus. Best fitting probability distributions differed among the carnivores examined. However, the discretised normal distribution provided the best fit for the majority of species, because variation among litter-sizes was often small. Importantly, however, the outcomes of demographic models were generally robust to the distribution used. Conclusion/Significance These results provide reassurance for those using demographic modelling for the management of less studied carnivores in which litter size variation is estimated using data from species with similar reproductive attributes. PMID:23469140

  20. Does litter size variation affect models of terrestrial carnivore extinction risk and management?

    PubMed

    Devenish-Nelson, Eleanor S; Stephens, Philip A; Harris, Stephen; Soulsbury, Carl; Richards, Shane A

    2013-01-01

    Individual variation in both survival and reproduction has the potential to influence extinction risk. Especially for rare or threatened species, reliable population models should adequately incorporate demographic uncertainty. Here, we focus on an important form of demographic stochasticity: variation in litter sizes. We use terrestrial carnivores as an example taxon, as they are frequently threatened or of economic importance. Since data on intraspecific litter size variation are often sparse, it is unclear what probability distribution should be used to describe the pattern of litter size variation for multiparous carnivores. We used litter size data on 32 terrestrial carnivore species to test the fit of 12 probability distributions. The influence of these distributions on quasi-extinction probabilities and the probability of successful disease control was then examined for three canid species - the island fox Urocyon littoralis, the red fox Vulpes vulpes, and the African wild dog Lycaon pictus. Best fitting probability distributions differed among the carnivores examined. However, the discretised normal distribution provided the best fit for the majority of species, because variation among litter-sizes was often small. Importantly, however, the outcomes of demographic models were generally robust to the distribution used. These results provide reassurance for those using demographic modelling for the management of less studied carnivores in which litter size variation is estimated using data from species with similar reproductive attributes.

  1. TENSOR DECOMPOSITIONS AND SPARSE LOG-LINEAR MODELS

    PubMed Central

    Johndrow, James E.; Bhattacharya, Anirban; Dunson, David B.

    2017-01-01

    Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. We derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions. PMID:29332971

  2. The Statistical Value of Raw Fluorescence Signal in Luminex xMAP Based Multiplex Immunoassays

    PubMed Central

    Breen, Edmond J.; Tan, Woei; Khan, Alamgir

    2016-01-01

    Tissue samples (plasma, saliva, serum or urine) from 169 patients classified as either normal or having one of seven possible diseases are analysed across three 96-well plates for the presences of 37 analytes using cytokine inflammation multiplexed immunoassay panels. Censoring for concentration data caused problems for analysis of the low abundant analytes. Using fluorescence analysis over concentration based analysis allowed analysis of these low abundant analytes. Mixed-effects analysis on the resulting fluorescence and concentration responses reveals a combination of censoring and mapping the fluorescence responses to concentration values, through a 5PL curve, changed observed analyte concentrations. Simulation verifies this, by showing a dependence on the mean florescence response and its distribution on the observed analyte concentration levels. Differences from normality, in the fluorescence responses, can lead to differences in concentration estimates and unreliable probabilities for treatment effects. It is seen that when fluorescence responses are normally distributed, probabilities of treatment effects for fluorescence based t-tests has greater statistical power than the same probabilities from concentration based t-tests. We add evidence that the fluorescence response, unlike concentration values, doesn’t require censoring and we show with respect to differential analysis on the fluorescence responses that background correction is not required. PMID:27243383

  3. Measuring firm size distribution with semi-nonparametric densities

    NASA Astrophysics Data System (ADS)

    Cortés, Lina M.; Mora-Valencia, Andrés; Perote, Javier

    2017-11-01

    In this article, we propose a new methodology based on a (log) semi-nonparametric (log-SNP) distribution that nests the lognormal and enables better fits in the upper tail of the distribution through the introduction of new parameters. We test the performance of the lognormal and log-SNP distributions capturing firm size, measured through a sample of US firms in 2004-2015. Taking different levels of aggregation by type of economic activity, our study shows that the log-SNP provides a better fit of the firm size distribution. We also formally introduce the multivariate log-SNP distribution, which encompasses the multivariate lognormal, to analyze the estimation of the joint distribution of the value of the firm's assets and sales. The results suggest that sales are a better firm size measure, as indicated by other studies in the literature.

  4. Bayesian soft X-ray tomography using non-stationary Gaussian Processes

    NASA Astrophysics Data System (ADS)

    Li, Dong; Svensson, J.; Thomsen, H.; Medina, F.; Werner, A.; Wolf, R.

    2013-08-01

    In this study, a Bayesian based non-stationary Gaussian Process (GP) method for the inference of soft X-ray emissivity distribution along with its associated uncertainties has been developed. For the investigation of equilibrium condition and fast magnetohydrodynamic behaviors in nuclear fusion plasmas, it is of importance to infer, especially in the plasma center, spatially resolved soft X-ray profiles from a limited number of noisy line integral measurements. For this ill-posed inversion problem, Bayesian probability theory can provide a posterior probability distribution over all possible solutions under given model assumptions. Specifically, the use of a non-stationary GP to model the emission allows the model to adapt to the varying length scales of the underlying diffusion process. In contrast to other conventional methods, the prior regularization is realized in a probability form which enhances the capability of uncertainty analysis, in consequence, scientists who concern the reliability of their results will benefit from it. Under the assumption of normally distributed noise, the posterior distribution evaluated at a discrete number of points becomes a multivariate normal distribution whose mean and covariance are analytically available, making inversions and calculation of uncertainty fast. Additionally, the hyper-parameters embedded in the model assumption can be optimized through a Bayesian Occam's Razor formalism and thereby automatically adjust the model complexity. This method is shown to produce convincing reconstructions and good agreements with independently calculated results from the Maximum Entropy and Equilibrium-Based Iterative Tomography Algorithm methods.

  5. Bayesian soft X-ray tomography using non-stationary Gaussian Processes.

    PubMed

    Li, Dong; Svensson, J; Thomsen, H; Medina, F; Werner, A; Wolf, R

    2013-08-01

    In this study, a Bayesian based non-stationary Gaussian Process (GP) method for the inference of soft X-ray emissivity distribution along with its associated uncertainties has been developed. For the investigation of equilibrium condition and fast magnetohydrodynamic behaviors in nuclear fusion plasmas, it is of importance to infer, especially in the plasma center, spatially resolved soft X-ray profiles from a limited number of noisy line integral measurements. For this ill-posed inversion problem, Bayesian probability theory can provide a posterior probability distribution over all possible solutions under given model assumptions. Specifically, the use of a non-stationary GP to model the emission allows the model to adapt to the varying length scales of the underlying diffusion process. In contrast to other conventional methods, the prior regularization is realized in a probability form which enhances the capability of uncertainty analysis, in consequence, scientists who concern the reliability of their results will benefit from it. Under the assumption of normally distributed noise, the posterior distribution evaluated at a discrete number of points becomes a multivariate normal distribution whose mean and covariance are analytically available, making inversions and calculation of uncertainty fast. Additionally, the hyper-parameters embedded in the model assumption can be optimized through a Bayesian Occam's Razor formalism and thereby automatically adjust the model complexity. This method is shown to produce convincing reconstructions and good agreements with independently calculated results from the Maximum Entropy and Equilibrium-Based Iterative Tomography Algorithm methods.

  6. Analysis of the Hessian for Inverse Scattering Problems. Part 3. Inverse Medium Scattering of Electromagnetic Waves in Three Dimensions

    DTIC Science & Technology

    2012-08-01

    small data noise and model error, the discrete Hessian can be approximated by a low-rank matrix. This in turn enables fast solution of an appropriately...implication of the compactness of the Hessian is that for small data noise and model error, the discrete Hessian can be approximated by a low-rank matrix. This...probability distribution is given by the inverse of the Hessian of the negative log likelihood function. For Gaussian data noise and model error, this

  7. Empirical relationships between tree fall and landscape-level amounts of logging and fire

    PubMed Central

    Blanchard, Wade; Blair, David; McBurney, Lachlan; Stein, John; Banks, Sam C.

    2018-01-01

    Large old trees are critically important keystone structures in forest ecosystems globally. Populations of these trees are also in rapid decline in many forest ecosystems, making it important to quantify the factors that influence their dynamics at different spatial scales. Large old trees often occur in forest landscapes also subject to fire and logging. However, the effects on the risk of collapse of large old trees of the amount of logging and fire in the surrounding landscape are not well understood. Using an 18-year study in the Mountain Ash (Eucalyptus regnans) forests of the Central Highlands of Victoria, we quantify relationships between the probability of collapse of large old hollow-bearing trees at a site and the amount of logging and the amount of fire in the surrounding landscape. We found the probability of collapse increased with an increasing amount of logged forest in the surrounding landscape. It also increased with a greater amount of burned area in the surrounding landscape, particularly for trees in highly advanced stages of decay. The most likely explanation for elevated tree fall with an increasing amount of logged or burned areas in the surrounding landscape is change in wind movement patterns associated with cutblocks or burned areas. Previous studies show that large old hollow-bearing trees are already at high risk of collapse in our study area. New analyses presented here indicate that additional logging operations in the surrounding landscape will further elevate that risk. Current logging prescriptions require the protection of large old hollow-bearing trees on cutblocks. We suggest that efforts to reduce the probability of collapse of large old hollow-bearing trees on unlogged sites will demand careful landscape planning to limit the amount of timber harvesting in the surrounding landscape. PMID:29474487

  8. Empirical relationships between tree fall and landscape-level amounts of logging and fire.

    PubMed

    Lindenmayer, David B; Blanchard, Wade; Blair, David; McBurney, Lachlan; Stein, John; Banks, Sam C

    2018-01-01

    Large old trees are critically important keystone structures in forest ecosystems globally. Populations of these trees are also in rapid decline in many forest ecosystems, making it important to quantify the factors that influence their dynamics at different spatial scales. Large old trees often occur in forest landscapes also subject to fire and logging. However, the effects on the risk of collapse of large old trees of the amount of logging and fire in the surrounding landscape are not well understood. Using an 18-year study in the Mountain Ash (Eucalyptus regnans) forests of the Central Highlands of Victoria, we quantify relationships between the probability of collapse of large old hollow-bearing trees at a site and the amount of logging and the amount of fire in the surrounding landscape. We found the probability of collapse increased with an increasing amount of logged forest in the surrounding landscape. It also increased with a greater amount of burned area in the surrounding landscape, particularly for trees in highly advanced stages of decay. The most likely explanation for elevated tree fall with an increasing amount of logged or burned areas in the surrounding landscape is change in wind movement patterns associated with cutblocks or burned areas. Previous studies show that large old hollow-bearing trees are already at high risk of collapse in our study area. New analyses presented here indicate that additional logging operations in the surrounding landscape will further elevate that risk. Current logging prescriptions require the protection of large old hollow-bearing trees on cutblocks. We suggest that efforts to reduce the probability of collapse of large old hollow-bearing trees on unlogged sites will demand careful landscape planning to limit the amount of timber harvesting in the surrounding landscape.

  9. Statistical Analysis of an Infrared Thermography Inspection of Reinforced Carbon-Carbon

    NASA Technical Reports Server (NTRS)

    Comeaux, Kayla

    2011-01-01

    Each piece of flight hardware being used on the shuttle must be analyzed and pass NASA requirements before the shuttle is ready for launch. One tool used to detect cracks that lie within flight hardware is Infrared Flash Thermography. This is a non-destructive testing technique which uses an intense flash of light to heat up the surface of a material after which an Infrared camera is used to record the cooling of the material. Since cracks within the material obstruct the natural heat flow through the material, they are visible when viewing the data from the Infrared camera. We used Ecotherm, a software program, to collect data pertaining to the delaminations and analyzed the data using Ecotherm and University of Dayton Log Logistic Probability of Detection (POD) Software. The goal was to reproduce the statistical analysis produced by the University of Dayton software, by using scatter plots, log transforms, and residuals to test the assumption of normality for the residuals.

  10. Normal Tissue Complication Probability (NTCP) modeling of late rectal bleeding following external beam radiotherapy for prostate cancer: A Test of the QUANTEC-recommended NTCP model.

    PubMed

    Liu, Mitchell; Moiseenko, Vitali; Agranovich, Alexander; Karvat, Anand; Kwan, Winkle; Saleh, Ziad H; Apte, Aditya A; Deasy, Joseph O

    2010-10-01

    Validating a predictive model for late rectal bleeding following external beam treatment for prostate cancer would enable safer treatments or dose escalation. We tested the normal tissue complication probability (NTCP) model recommended in the recent QUANTEC review (quantitative analysis of normal tissue effects in the clinic). One hundred and sixty one prostate cancer patients were treated with 3D conformal radiotherapy for prostate cancer at the British Columbia Cancer Agency in a prospective protocol. The total prescription dose for all patients was 74 Gy, delivered in 2 Gy/fraction. 159 3D treatment planning datasets were available for analysis. Rectal dose volume histograms were extracted and fitted to a Lyman-Kutcher-Burman NTCP model. Late rectal bleeding (>grade 2) was observed in 12/159 patients (7.5%). Multivariate logistic regression with dose-volume parameters (V50, V60, V70, etc.) was non-significant. Among clinical variables, only age was significant on a Kaplan-Meier log-rank test (p=0.007, with an optimal cut point of 77 years). Best-fit Lyman-Kutcher-Burman model parameters (with 95% confidence intervals) were: n = 0.068 (0.01, +infinity); m =0.14 (0.0, 0.86); and TD50 = 81 (27, 136) Gy. The peak values fall within the 95% QUANTEC confidence intervals. On this dataset, both models had only modest ability to predict complications: the best-fit model had a Spearman's rank correlation coefficient of rs = 0.099 (p = 0.11) and area under the receiver operating characteristic curve (AUC) of 0.62; the QUANTEC model had rs=0.096 (p= 0.11) and a corresponding AUC of 0.61. Although the QUANTEC model consistently predicted higher NTCP values, it could not be rejected according to the χ(2) test (p = 0.44). Observed complications, and best-fit parameter estimates, were consistent with the QUANTEC-preferred NTCP model. However, predictive power was low, at least partly because the rectal dose distribution characteristics do not vary greatly within this patient cohort.

  11. Normality of raw data in general linear models: The most widespread myth in statistics

    USGS Publications Warehouse

    Kery, Marc; Hatfield, Jeff S.

    2003-01-01

    In years of statistical consulting for ecologists and wildlife biologists, by far the most common misconception we have come across has been the one about normality in general linear models. These comprise a very large part of the statistical models used in ecology and include t tests, simple and multiple linear regression, polynomial regression, and analysis of variance (ANOVA) and covariance (ANCOVA). There is a widely held belief that the normality assumption pertains to the raw data rather than to the model residuals. We suspect that this error may also occur in countless published studies, whenever the normality assumption is tested prior to analysis. This may lead to the use of nonparametric alternatives (if there are any), when parametric tests would indeed be appropriate, or to use of transformations of raw data, which may introduce hidden assumptions such as multiplicative effects on the natural scale in the case of log-transformed data. Our aim here is to dispel this myth. We very briefly describe relevant theory for two cases of general linear models to show that the residuals need to be normally distributed if tests requiring normality are to be used, such as t and F tests. We then give two examples demonstrating that the distribution of the response variable may be nonnormal, and yet the residuals are well behaved. We do not go into the issue of how to test normality; instead we display the distributions of response variables and residuals graphically.

  12. The distribution pattern of Lutzomyia longipalpis (Diptera: Psychodidae) in the peridomiciles of a sector with canine and human visceral leishmaniasis transmission in the municipality of Dracena, São Paulo, Brazil.

    PubMed

    Rangel, Osias; Sampaio, Susy Mary Perpetuo; Ciaravolo, Ricardo Mario de Carvalho; Holcman, Marcia Moreira

    2012-03-01

    The specimen distribution pattern of a species can be used to characterise a population of interest and also provides area-specific guidance for pest management and control. In the municipality of Dracena, in the state of São Paulo, we analysed 5,889 Lutzomyia longipalpis specimens collected from the peridomiciles of 14 houses in a sector where American visceral leishmaniasis (AVL) is transmitted to humans and dogs. The goal was to analyse the dispersion and a theoretical fitting of the species occurrence probability. From January-December 2005, samples were collected once per week using CDC light traps that operated for 12-h periods. Each collection was considered a sub-sample and was evaluated monthly. The standardised Morisita index was used as a measure of dispersion. Adherence tests were performed for the log-series distribution. The number of traps was used to adjust the octave plots. The quantity of Lu. longipalpis in the sector was highly aggregated for each month of the year, adhering to a log-series distribution for 11 of the 12 months analysed. A sex-stratified analysis demonstrated a pattern of aggregated dispersion adjusted for each month of the year. The classes and frequencies of the traps in octaves can be employed as indicators for entomological surveillance and AVL control.

  13. Meta-analysis of prediction model performance across multiple studies: Which scale helps ensure between-study normality for the C-statistic and calibration measures?

    PubMed

    Snell, Kym Ie; Ensor, Joie; Debray, Thomas Pa; Moons, Karel Gm; Riley, Richard D

    2017-01-01

    If individual participant data are available from multiple studies or clusters, then a prediction model can be externally validated multiple times. This allows the model's discrimination and calibration performance to be examined across different settings. Random-effects meta-analysis can then be used to quantify overall (average) performance and heterogeneity in performance. This typically assumes a normal distribution of 'true' performance across studies. We conducted a simulation study to examine this normality assumption for various performance measures relating to a logistic regression prediction model. We simulated data across multiple studies with varying degrees of variability in baseline risk or predictor effects and then evaluated the shape of the between-study distribution in the C-statistic, calibration slope, calibration-in-the-large, and E/O statistic, and possible transformations thereof. We found that a normal between-study distribution was usually reasonable for the calibration slope and calibration-in-the-large; however, the distributions of the C-statistic and E/O were often skewed across studies, particularly in settings with large variability in the predictor effects. Normality was vastly improved when using the logit transformation for the C-statistic and the log transformation for E/O, and therefore we recommend these scales to be used for meta-analysis. An illustrated example is given using a random-effects meta-analysis of the performance of QRISK2 across 25 general practices.

  14. heterogeneous mixture distributions for multi-source extreme rainfall

    NASA Astrophysics Data System (ADS)

    Ouarda, T.; Shin, J.; Lee, T. S.

    2013-12-01

    Mixture distributions have been used to model hydro-meteorological variables showing mixture distributional characteristics, e.g. bimodality. Homogeneous mixture (HOM) distributions (e.g. Normal-Normal and Gumbel-Gumbel) have been traditionally applied to hydro-meteorological variables. However, there is no reason to restrict the mixture distribution as the combination of one identical type. It might be beneficial to characterize the statistical behavior of hydro-meteorological variables from the application of heterogeneous mixture (HTM) distributions such as Normal-Gamma. In the present work, we focus on assessing the suitability of HTM distributions for the frequency analysis of hydro-meteorological variables. In the present work, in order to estimate the parameters of HTM distributions, the meta-heuristic algorithm (Genetic Algorithm) is employed to maximize the likelihood function. In the present study, a number of distributions are compared, including the Gamma-Extreme value type-one (EV1) HTM distribution, the EV1-EV1 HOM distribution, and EV1 distribution. The proposed distribution models are applied to the annual maximum precipitation data in South Korea. The Akaike Information Criterion (AIC), the root mean squared errors (RMSE) and the log-likelihood are used as measures of goodness-of-fit of the tested distributions. Results indicate that the HTM distribution (Gamma-EV1) presents the best fitness. The HTM distribution shows significant improvement in the estimation of quantiles corresponding to the 20-year return period. It is shown that extreme rainfall in the coastal region of South Korea presents strong heterogeneous mixture distributional characteristics. Results indicate that HTM distributions are a good alternative for the frequency analysis of hydro-meteorological variables when disparate statistical characteristics are presented.

  15. Canopy Spectral Invariants. Part 2; Application to Classification of Forest Types from Hyperspectral Data

    NASA Technical Reports Server (NTRS)

    Schull, M. A.; Knyazikhin, Y.; Xu, L.; Samanta, A.; Carmona, P. L.; Lepine, L.; Jenkins, J. P.; Ganguly, S.; Myneni, R. B.

    2011-01-01

    Many studies have been conducted to demonstrate the ability of hyperspectral data to discriminate plant dominant species. Most of them have employed the use of empirically based techniques, which are site specific, requires some initial training based on characteristics of known leaf and/or canopy spectra and therefore may not be extendable to operational use or adapted to changing or unknown land cover. In this paper we propose a physically based approach for separation of dominant forest type using hyperspectral data. The radiative transfer theory of canopy spectral invariants underlies the approach, which facilitates parameterization of the canopy reflectance in terms of the leaf spectral scattering and two spectrally invariant and structurally varying variables - recollision and directional escape probabilities. The methodology is based on the idea of retrieving spectrally invariant parameters from hyperspectral data first, and then relating their values to structural characteristics of three-dimensional canopy structure. Theoretical and empirical analyses of ground and airborne data acquired by Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over two sites in New England, USA, suggest that the canopy spectral invariants convey information about canopy structure at both the macro- and micro-scales. The total escape probability (one minus recollision probability) varies as a power function with the exponent related to the number of nested hierarchical levels present in the pixel. Its base is a geometrical mean of the local total escape probabilities and accounts for the cumulative effect of canopy structure over a wide range of scales. The ratio of the directional to the total escape probability becomes independent of the number of hierarchical levels and is a function of the canopy structure at the macro-scale such as tree spatial distribution, crown shape and size, within-crown foliage density and ground cover. These properties allow for the natural separation of dominant forest classes based on the location of points on the total escape probability vs the ratio log-log plane.

  16. Characterizing the topology of probabilistic biological networks.

    PubMed

    Todor, Andrei; Dobra, Alin; Kahveci, Tamer

    2013-01-01

    Biological interactions are often uncertain events, that may or may not take place with some probability. This uncertainty leads to a massive number of alternative interaction topologies for each such network. The existing studies analyze the degree distribution of biological networks by assuming that all the given interactions take place under all circumstances. This strong and often incorrect assumption can lead to misleading results. In this paper, we address this problem and develop a sound mathematical basis to characterize networks in the presence of uncertain interactions. Using our mathematical representation, we develop a method that can accurately describe the degree distribution of such networks. We also take one more step and extend our method to accurately compute the joint-degree distributions of node pairs connected by edges. The number of possible network topologies grows exponentially with the number of uncertain interactions. However, the mathematical model we develop allows us to compute these degree distributions in polynomial time in the number of interactions. Our method works quickly even for entire protein-protein interaction (PPI) networks. It also helps us find an adequate mathematical model using MLE. We perform a comparative study of node-degree and joint-degree distributions in two types of biological networks: the classical deterministic networks and the more flexible probabilistic networks. Our results confirm that power-law and log-normal models best describe degree distributions for both probabilistic and deterministic networks. Moreover, the inverse correlation of degrees of neighboring nodes shows that, in probabilistic networks, nodes with large number of interactions prefer to interact with those with small number of interactions more frequently than expected. We also show that probabilistic networks are more robust for node-degree distribution computation than the deterministic ones. all the data sets used, the software implemented and the alignments found in this paper are available at http://bioinformatics.cise.ufl.edu/projects/probNet/.

  17. Distribution of Active and Resting Periods in the Motor Activity of Patients with Depression and Schizophrenia

    PubMed Central

    Hauge, Erik; Berle, Jan Øystein; Dilsaver, Steven; Oedegaard, Ketil J.

    2016-01-01

    Objective Alterations of activity are prominent features of the major functional psychiatric disorders. Motor activity patterns are characterized by bursts of activity separated by periods with inactivity. The purpose of the present study has been to analyze such active and inactive periods in patients with depression and schizophrenia. Methods Actigraph registrations for 12 days from 24 patients with schizophrenia, 23 with depression and 29 healthy controls. Results Patients with schizophrenia and depression have distinctly different profiles with regard to the characterization and distribution of active and inactive periods. The mean duration of active periods is lowest in the depressed patients, and the duration of inactive periods is highest in the patients with schizophrenia. For active periods the cumulative probability distribution, using lengths from 1 to 35 min, follows a straight line on a log-log plot, suggestive of a power law function, and a similar relationship is found for inactive periods, using lengths from 1 to 20 min. For both active and inactive periods the scaling exponent is higher in the depressed compared to the schizophrenic patients. Conclusion The present findings add to previously published results, with other mathematical methods, suggesting there are important differences in control systems regulating motor behavior in these two major groups of psychiatric disorders. PMID:26766953

  18. Identification of personalized dysregulated pathways in hepatocellular carcinoma.

    PubMed

    Li, Hong; Jiang, Xiumei; Zhu, Shengjie; Sui, Lihong

    2017-04-01

    Hepatocellular carcinoma (HCC) is the most common liver malignancy, and ranks the fifth most prevalent malignant tumors worldwide. In general, HCC are detected until the disease is at an advanced stage and may miss the best chance for treatment. Thus, elucidating the molecular mechanisms is critical to clinical diagnosis and treatment for HCC. The purpose of this study was to identify dysregulated pathways of great potential functional relevance in the progression of HCC. Microarray data of 72 pairs of tumor and matched non-tumor surrounding tissues of HCC were transformed to gene expression data. Differentially expressed genes (DEG) between patients and normal controls were identified using Linear Models for Microarray Analysis. Personalized dysregulated pathways were identified using individualized pathway aberrance score module. 169 differentially expressed genes (DEG) were obtained with |logFC|≥1.5 and P≤0.01. 749 dysregulated pathways were obtained with P≤0.01 in pathway statistics, and there were 93 DEG overlapped in the dysregulated pathways. After performing normal distribution analysis, 302 pathways with the aberrance probability≥0.5 were identified. By ranking pathway with aberrance probability, the top 20 pathways were obtained. Only three DEGs (TUBA1C, TPR, CDC20) were involved in the top 20 pathways. These personalized dysregulated pathways and overlapped genes may give new insights into the underlying biological mechanisms in the progression of HCC. Particular attention can be focused on them for further research. Copyright © 2017 Elsevier GmbH. All rights reserved.

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dana Kelly; Kurt Vedros; Robert Youngblood

    This paper examines false indication probabilities in the context of the Mitigating System Performance Index (MSPI), in order to investigate the pros and cons of different approaches to resolving two coupled issues: (1) sensitivity to the prior distribution used in calculating the Bayesian-corrected unreliability contribution to the MSPI, and (2) whether (in a particular plant configuration) to model the fuel oil transfer pump (FOTP) as a separate component, or integrally to its emergency diesel generator (EDG). False indication probabilities were calculated for the following situations: (1) all component reliability parameters at their baseline values, so that the true indication ismore » green, meaning that an indication of white or above would be false positive; (2) one or more components degraded to the extent that the true indication would be (mid) white, and “false” would be green (negative) or yellow (negative) or red (negative). In key respects, this was the approach taken in NUREG-1753. The prior distributions examined were the constrained noninformative (CNI) prior used currently by the MSPI, a mixture of conjugate priors, the Jeffreys noninformative prior, a nonconjugate log(istic)-normal prior, and the minimally informative prior investigated in (Kelly et al., 2010). The mid-white performance state was set at ?CDF = ?10 ? 10-6/yr. For each simulated time history, a check is made of whether the calculated ?CDF is above or below 10-6/yr. If the parameters were at their baseline values, and ?CDF > 10-6/yr, this is counted as a false positive. Conversely, if one or all of the parameters are set to values corresponding to ?CDF > 10-6/yr but that time history’s ?CDF < 10-6/yr, this is counted as a false negative indication. The false indication (positive or negative) probability is then estimated as the number of false positive or negative counts divided by the number of time histories (100,000). Results are presented for a set of base case parameter values, and three sensitivity cases in which the number of FOTP demands was reduced, along with the Birnbaum importance of the FOTP.« less

  20. The rates and time-delay distribution of multiply imaged supernovae behind lensing clusters

    NASA Astrophysics Data System (ADS)

    Li, Xue; Hjorth, Jens; Richard, Johan

    2012-11-01

    Time delays of gravitationally lensed sources can be used to constrain the mass model of a deflector and determine cosmological parameters. We here present an analysis of the time-delay distribution of multiply imaged sources behind 17 strong lensing galaxy clusters with well-calibrated mass models. We find that for time delays less than 1000 days, at z = 3.0, their logarithmic probability distribution functions are well represented by P(log Δt) = 5.3 × 10-4Δttilde beta/M2502tilde beta, with tilde beta = 0.77, where M250 is the projected cluster mass inside 250 kpc (in 1014M⊙), and tilde beta is the power-law slope of the distribution. The resultant probability distribution function enables us to estimate the time-delay distribution in a lensing cluster of known mass. For a cluster with M250 = 2 × 1014M⊙, the fraction of time delays less than 1000 days is approximately 3%. Taking Abell 1689 as an example, its dark halo and brightest galaxies, with central velocity dispersions σ>=500kms-1, mainly produce large time delays, while galaxy-scale mass clumps are responsible for generating smaller time delays. We estimate the probability of observing multiple images of a supernova in the known images of Abell 1689. A two-component model of estimating the supernova rate is applied in this work. For a magnitude threshold of mAB = 26.5, the yearly rate of Type Ia (core-collapse) supernovae with time delays less than 1000 days is 0.004±0.002 (0.029±0.001). If the magnitude threshold is lowered to mAB ~ 27.0, the rate of core-collapse supernovae suitable for time delay observation is 0.044±0.015 per year.

  1. Bayesian seismic inversion based on rock-physics prior modeling for the joint estimation of acoustic impedance, porosity and lithofacies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Passos de Figueiredo, Leandro, E-mail: leandrop.fgr@gmail.com; Grana, Dario; Santos, Marcio

    We propose a Bayesian approach for seismic inversion to estimate acoustic impedance, porosity and lithofacies within the reservoir conditioned to post-stack seismic and well data. The link between elastic and petrophysical properties is given by a joint prior distribution for the logarithm of impedance and porosity, based on a rock-physics model. The well conditioning is performed through a background model obtained by well log interpolation. Two different approaches are presented: in the first approach, the prior is defined by a single Gaussian distribution, whereas in the second approach it is defined by a Gaussian mixture to represent the well datamore » multimodal distribution and link the Gaussian components to different geological lithofacies. The forward model is based on a linearized convolutional model. For the single Gaussian case, we obtain an analytical expression for the posterior distribution, resulting in a fast algorithm to compute the solution of the inverse problem, i.e. the posterior distribution of acoustic impedance and porosity as well as the facies probability given the observed data. For the Gaussian mixture prior, it is not possible to obtain the distributions analytically, hence we propose a Gibbs algorithm to perform the posterior sampling and obtain several reservoir model realizations, allowing an uncertainty analysis of the estimated properties and lithofacies. Both methodologies are applied to a real seismic dataset with three wells to obtain 3D models of acoustic impedance, porosity and lithofacies. The methodologies are validated through a blind well test and compared to a standard Bayesian inversion approach. Using the probability of the reservoir lithofacies, we also compute a 3D isosurface probability model of the main oil reservoir in the studied field.« less

  2. On the entropy function in sociotechnical systems

    PubMed Central

    Montroll, Elliott W.

    1981-01-01

    The entropy function H = -Σpj log pj (pj being the probability of a system being in state j) and its continuum analogue H = ∫p(x) log p(x) dx are fundamental in Shannon's theory of information transfer in communication systems. It is here shown that the discrete form of H also appears naturally in single-lane traffic flow theory. In merchandising, goods flow from a whole-saler through a retailer to a customer. Certain features of the process may be deduced from price distribution functions derived from Sears Roebuck and Company catalogues. It is found that the dispersion in logarithm of catalogue prices of a given year has remained about constant, independently of the year, for over 75 years. From this it may be inferred that the continuum entropy function for the variable logarithm of price had inadvertently, through Sears Roebuck policies, been maximized for that firm subject to the observed dispersion. PMID:16593136

  3. On the entropy function in sociotechnical systems.

    PubMed

    Montroll, E W

    1981-12-01

    The entropy function H = -Sigmap(j) log p(j) (p(j) being the probability of a system being in state j) and its continuum analogue H = integralp(x) log p(x) dx are fundamental in Shannon's theory of information transfer in communication systems. It is here shown that the discrete form of H also appears naturally in single-lane traffic flow theory. In merchandising, goods flow from a whole-saler through a retailer to a customer. Certain features of the process may be deduced from price distribution functions derived from Sears Roebuck and Company catalogues. It is found that the dispersion in logarithm of catalogue prices of a given year has remained about constant, independently of the year, for over 75 years. From this it may be inferred that the continuum entropy function for the variable logarithm of price had inadvertently, through Sears Roebuck policies, been maximized for that firm subject to the observed dispersion.

  4. A DDDAS Framework for Volcanic Ash Propagation and Hazard Analysis

    DTIC Science & Technology

    2012-01-01

    probability distribution for the input variables (for example, Hermite polynomials for normally distributed parameters, or Legendre for uniformly...parameters and windfields will drive our simulations. We will use uncertainty quantification methodology – polynomial chaos quadrature in combination...quantification methodology ? polynomial chaos quadrature in combination with data integration to complete the DDDAS loop. 15. SUBJECT TERMS 16. SECURITY

  5. C-5A Cargo Deck Low-Frequency Vibration Environment

    DTIC Science & Technology

    1975-02-01

    SAMPLE VIBRATION CALCULATIONS 13 1. Normal Distribution 13 2. Binomial Distribution 15 IV CONCLUSIONS 17 -! V REFERENCES 18 t: FEiCENDIJJ PAGS 2LANKNOT...Calculation for Binomial Distribution 108 (Vertical Acceleration, Right Rear Cargo Deck) xi I. INTRODUCTION The availability of large transport...the end of taxi. These peaks could then be used directly to compile the probability of occurrence of specific values of acceleration using the binomial

  6. A reversible-jump Markov chain Monte Carlo algorithm for 1D inversion of magnetotelluric data

    NASA Astrophysics Data System (ADS)

    Mandolesi, Eric; Ogaya, Xenia; Campanyà, Joan; Piana Agostinetti, Nicola

    2018-04-01

    This paper presents a new computer code developed to solve the 1D magnetotelluric (MT) inverse problem using a Bayesian trans-dimensional Markov chain Monte Carlo algorithm. MT data are sensitive to the depth-distribution of rock electric conductivity (or its reciprocal, resistivity). The solution provided is a probability distribution - the so-called posterior probability distribution (PPD) for the conductivity at depth, together with the PPD of the interface depths. The PPD is sampled via a reversible-jump Markov Chain Monte Carlo (rjMcMC) algorithm, using a modified Metropolis-Hastings (MH) rule to accept or discard candidate models along the chains. As the optimal parameterization for the inversion process is generally unknown a trans-dimensional approach is used to allow the dataset itself to indicate the most probable number of parameters needed to sample the PPD. The algorithm is tested against two simulated datasets and a set of MT data acquired in the Clare Basin (County Clare, Ireland). For the simulated datasets the correct number of conductive layers at depth and the associated electrical conductivity values is retrieved, together with reasonable estimates of the uncertainties on the investigated parameters. Results from the inversion of field measurements are compared with results obtained using a deterministic method and with well-log data from a nearby borehole. The PPD is in good agreement with the well-log data, showing as a main structure a high conductive layer associated with the Clare Shale formation. In this study, we demonstrate that our new code go beyond algorithms developend using a linear inversion scheme, as it can be used: (1) to by-pass the subjective choices in the 1D parameterizations, i.e. the number of horizontal layers in the 1D parameterization, and (2) to estimate realistic uncertainties on the retrieved parameters. The algorithm is implemented using a simple MPI approach, where independent chains run on isolated CPU, to take full advantage of parallel computer architectures. In case of a large number of data, a master/slave appoach can be used, where the master CPU samples the parameter space and the slave CPUs compute forward solutions.

  7. 2MASS wide-field extinction maps. V. Corona Australis

    NASA Astrophysics Data System (ADS)

    Alves, João; Lombardi, Marco; Lada, Charles J.

    2014-05-01

    We present a near-infrared extinction map of a large region (~870 deg2) covering the isolated Corona Australis complex of molecular clouds. We reach a 1-σ error of 0.02 mag in the K-band extinction with a resolution of 3 arcmin over the entire map. We find that the Corona Australis cloud is about three times as large as revealed by previous CO and dust emission surveys. The cloud consists of a 45 pc long complex of filamentary structure from the well known star forming Western-end (the head, N ≥ 1023 cm-2) to the diffuse Eastern-end (the tail, N ≤ 1021 cm-2). Remarkably, about two thirds of the complex both in size and mass lie beneath AV ~ 1 mag. We find that the probability density function (PDF) of the cloud cannot be described by a single log-normal function. Similar to prior studies, we found a significant excess at high column densities, but a log-normal + power-law tail fit does not work well at low column densities. We show that at low column densities near the peak of the observed PDF, both the amplitude and shape of the PDF are dominated by noise in the extinction measurements making it impractical to derive the intrinsic cloud PDF below AK < 0.15 mag. Above AK ~ 0.15 mag, essentially the molecular component of the cloud, the PDF appears to be best described by a power-law with index -3, but could also described as the tail of a broad and relatively low amplitude, log-normal PDF that peaks at very low column densities. FITS files of the extinction maps are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/565/A18

  8. Derivation of an eigenvalue probability density function relating to the Poincaré disk

    NASA Astrophysics Data System (ADS)

    Forrester, Peter J.; Krishnapur, Manjunath

    2009-09-01

    A result of Zyczkowski and Sommers (2000 J. Phys. A: Math. Gen. 33 2045-57) gives the eigenvalue probability density function for the top N × N sub-block of a Haar distributed matrix from U(N + n). In the case n >= N, we rederive this result, starting from knowledge of the distribution of the sub-blocks, introducing the Schur decomposition and integrating over all variables except the eigenvalues. The integration is done by identifying a recursive structure which reduces the dimension. This approach is inspired by an analogous approach which has been recently applied to determine the eigenvalue probability density function for random matrices A-1B, where A and B are random matrices with entries standard complex normals. We relate the eigenvalue distribution of the sub-blocks to a many-body quantum state, and to the one-component plasma, on the pseudosphere.

  9. Estimation of transition probabilities of credit ratings

    NASA Astrophysics Data System (ADS)

    Peng, Gan Chew; Hin, Pooi Ah

    2015-12-01

    The present research is based on the quarterly credit ratings of ten companies over 15 years taken from the database of the Taiwan Economic Journal. The components in the vector mi (mi1, mi2,⋯, mi10) may first be used to denote the credit ratings of the ten companies in the i-th quarter. The vector mi+1 in the next quarter is modelled to be dependent on the vector mi via a conditional distribution which is derived from a 20-dimensional power-normal mixture distribution. The transition probability Pkl (i ,j ) for getting mi+1,j = l given that mi, j = k is then computed from the conditional distribution. It is found that the variation of the transition probability Pkl (i ,j ) as i varies is able to give indication for the possible transition of the credit rating of the j-th company in the near future.

  10. Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method.

    PubMed

    Bengtsson, Henrik; Hössjer, Ola

    2006-03-01

    Low-level processing and normalization of microarray data are most important steps in microarray analysis, which have profound impact on downstream analysis. Multiple methods have been suggested to date, but it is not clear which is the best. It is therefore important to further study the different normalization methods in detail and the nature of microarray data in general. A methodological study of affine models for gene expression data is carried out. Focus is on two-channel comparative studies, but the findings generalize also to single- and multi-channel data. The discussion applies to spotted as well as in-situ synthesized microarray data. Existing normalization methods such as curve-fit ("lowess") normalization, parallel and perpendicular translation normalization, and quantile normalization, but also dye-swap normalization are revisited in the light of the affine model and their strengths and weaknesses are investigated in this context. As a direct result from this study, we propose a robust non-parametric multi-dimensional affine normalization method, which can be applied to any number of microarrays with any number of channels either individually or all at once. A high-quality cDNA microarray data set with spike-in controls is used to demonstrate the power of the affine model and the proposed normalization method. We find that an affine model can explain non-linear intensity-dependent systematic effects in observed log-ratios. Affine normalization removes such artifacts for non-differentially expressed genes and assures that symmetry between negative and positive log-ratios is obtained, which is fundamental when identifying differentially expressed genes. In addition, affine normalization makes the empirical distributions in different channels more equal, which is the purpose of quantile normalization, and may also explain why dye-swap normalization works or fails. All methods are made available in the aroma package, which is a platform-independent package for R.

  11. Workload Characterization and Performance Implications of Large-Scale Blog Servers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jeon, Myeongjae; Kim, Youngjae; Hwang, Jeaho

    With the ever-increasing popularity of social network services (SNSs), an understanding of the characteristics of these services and their effects on the behavior of their host servers is critical. However, there has been a lack of research on the workload characterization of servers running SNS applications such as blog services. To fill this void, we empirically characterized real-world web server logs collected from one of the largest South Korean blog hosting sites for 12 consecutive days. The logs consist of more than 96 million HTTP requests and 4.7 TB of network traffic. Our analysis reveals the followings: (i) The transfermore » size of non-multimedia files and blog articles can be modeled using a truncated Pareto distribution and a log-normal distribution, respectively; (ii) User access for blog articles does not show temporal locality, but is strongly biased towards those posted with image or audio files. We additionally discuss the potential performance improvement through clustering of small files on a blog page into contiguous disk blocks, which benefits from the observed file access patterns. Trace-driven simulations show that, on average, the suggested approach achieves 60.6% better system throughput and reduces the processing time for file access by 30.8% compared to the best performance of the Ext4 file system.« less

  12. The decline and fall of Type II error rates

    Treesearch

    Steve Verrill; Mark Durst

    2005-01-01

    For general linear models with normally distributed random errors, the probability of a Type II error decreases exponentially as a function of sample size. This potentially rapid decline reemphasizes the importance of performing power calculations.

  13. Comparative study of nonlinear properties of EEG signals of normal persons and epileptic patients

    PubMed Central

    2009-01-01

    Background Investigation of the functioning of the brain in living systems has been a major effort amongst scientists and medical practitioners. Amongst the various disorder of the brain, epilepsy has drawn the most attention because this disorder can affect the quality of life of a person. In this paper we have reinvestigated the EEGs for normal and epileptic patients using surrogate analysis, probability distribution function and Hurst exponent. Results Using random shuffled surrogate analysis, we have obtained some of the nonlinear features that was obtained by Andrzejak et al. [Phys Rev E 2001, 64:061907], for the epileptic patients during seizure. Probability distribution function shows that the activity of an epileptic brain is nongaussian in nature. Hurst exponent has been shown to be useful to characterize a normal and an epileptic brain and it shows that the epileptic brain is long term anticorrelated whereas, the normal brain is more or less stochastic. Among all the techniques, used here, Hurst exponent is found very useful for characterization different cases. Conclusion In this article, differences in characteristics for normal subjects with eyes open and closed, epileptic subjects during seizure and seizure free intervals have been shown mainly using Hurst exponent. The H shows that the brain activity of a normal man is uncorrelated in nature whereas, epileptic brain activity shows long range anticorrelation. PMID:19619290

  14. IMPLEMENTING A NOVEL CYCLIC CO2 FLOOD IN PALEOZOIC REEFS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    James R. Wood; W. Quinlan; A. Wylie

    2003-07-01

    Recycled CO2 will be used in this demonstration project to produce bypassed oil from the Silurian Charlton 6 pinnacle reef (Otsego County) in the Michigan Basin. Contract negotiations by our industry partner to gain access to this CO2 that would otherwise be vented to the atmosphere are near completion. A new method of subsurface characterization, log curve amplitude slicing, is being used to map facies distributions and reservoir properties in two reefs, the Belle River Mills and Chester 18 Fields. The Belle River Mills and Chester18 fields are being used as typefields because they have excellent log-curve and core datamore » coverage. Amplitude slicing of the normalized gamma ray curves is showing trends that may indicate significant heterogeneity and compartmentalization in these reservoirs. Digital and hard copy data continues to be compiled for the Niagaran reefs in the Michigan Basin. Technology transfer took place through technical presentations regarding the log curve amplitude slicing technique and a booth at the Midwest PTTC meeting.« less

  15. Daily magnesium intake and serum magnesium concentration among Japanese people.

    PubMed

    Akizawa, Yoriko; Koizumi, Sadayuki; Itokawa, Yoshinori; Ojima, Toshiyuki; Nakamura, Yosikazu; Tamura, Tarou; Kusaka, Yukinori

    2008-01-01

    The vitamins and minerals that are deficient in the daily diet of a normal adult remain unknown. To answer this question, we conducted a population survey focusing on the relationship between dietary magnesium intake and serum magnesium level. The subjects were 62 individuals from Fukui Prefecture who participated in the 1998 National Nutrition Survey. The survey investigated the physical status, nutritional status, and dietary data of the subjects. Holidays and special occasions were avoided, and a day when people are most likely to be on an ordinary diet was selected as the survey date. The mean (+/-standard deviation) daily magnesium intake was 322 (+/-132), 323 (+/-163), and 322 (+/-147) mg/day for men, women, and the entire group, respectively. The mean (+/-standard deviation) serum magnesium concentration was 20.69 (+/-2.83), 20.69 (+/-2.88), and 20.69 (+/-2.83) ppm for men, women, and the entire group, respectively. The distribution of serum magnesium concentration was normal. Dietary magnesium intake showed a log-normal distribution, which was then transformed by logarithmic conversion for examining the regression coefficients. The slope of the regression line between the serum magnesium concentration (Y ppm) and daily magnesium intake (X mg) was determined using the formula Y = 4.93 (log(10)X) + 8.49. The coefficient of correlation (r) was 0.29. A regression line (Y = 14.65X + 19.31) was observed between the daily intake of magnesium (Y mg) and serum magnesium concentration (X ppm). The coefficient of correlation was 0.28. The daily magnesium intake correlated with serum magnesium concentration, and a linear regression model between them was proposed.

  16. A Bayesian Hybrid Adaptive Randomisation Design for Clinical Trials with Survival Outcomes.

    PubMed

    Moatti, M; Chevret, S; Zohar, S; Rosenberger, W F

    2016-01-01

    Response-adaptive randomisation designs have been proposed to improve the efficiency of phase III randomised clinical trials and improve the outcomes of the clinical trial population. In the setting of failure time outcomes, Zhang and Rosenberger (2007) developed a response-adaptive randomisation approach that targets an optimal allocation, based on a fixed sample size. The aim of this research is to propose a response-adaptive randomisation procedure for survival trials with an interim monitoring plan, based on the following optimal criterion: for fixed variance of the estimated log hazard ratio, what allocation minimizes the expected hazard of failure? We demonstrate the utility of the design by redesigning a clinical trial on multiple myeloma. To handle continuous monitoring of data, we propose a Bayesian response-adaptive randomisation procedure, where the log hazard ratio is the effect measure of interest. Combining the prior with the normal likelihood, the mean posterior estimate of the log hazard ratio allows derivation of the optimal target allocation. We perform a simulation study to assess and compare the performance of this proposed Bayesian hybrid adaptive design to those of fixed, sequential or adaptive - either frequentist or fully Bayesian - designs. Non informative normal priors of the log hazard ratio were used, as well as mixture of enthusiastic and skeptical priors. Stopping rules based on the posterior distribution of the log hazard ratio were computed. The method is then illustrated by redesigning a phase III randomised clinical trial of chemotherapy in patients with multiple myeloma, with mixture of normal priors elicited from experts. As expected, there was a reduction in the proportion of observed deaths in the adaptive vs. non-adaptive designs; this reduction was maximized using a Bayes mixture prior, with no clear-cut improvement by using a fully Bayesian procedure. The use of stopping rules allows a slight decrease in the observed proportion of deaths under the alternate hypothesis compared with the adaptive designs with no stopping rules. Such Bayesian hybrid adaptive survival trials may be promising alternatives to traditional designs, reducing the duration of survival trials, as well as optimizing the ethical concerns for patients enrolled in the trial.

  17. Morphometric analyses of hominoid crania, probabilities of conspecificity and an approximation of a biological species constant.

    PubMed

    Thackeray, J F; Dykes, S

    2016-02-01

    Thackeray has previously explored the possibility of using a morphometric approach to quantify the "amount" of variation within species and to assess probabilities of conspecificity when two fossil specimens are compared, instead of "pigeon-holing" them into discrete species. In an attempt to obtain a statistical (probabilistic) definition of a species, Thackeray has recognized an approximation of a biological species constant (T=-1.61) based on the log-transformed standard error of the coefficient m (log sem) in regression analysis of cranial and other data from pairs of specimens of conspecific extant species, associated with regression equations of the form y=mx+c where m is the slope and c is the intercept, using measurements of any specimen A (x axis), and any specimen B of the same species (y axis). The log-transformed standard error of the co-efficient m (log sem) is a measure of the degree of similarity between pairs of specimens, and in this study shows central tendency around a mean value of -1.61 and standard deviation 0.10 for modern conspecific specimens. In this paper we focus attention on the need to take into account the range of difference in log sem values (Δlog sem or "delta log sem") obtained from comparisons when specimen A (x axis) is compared to B (y axis), and secondly when specimen A (y axis) is compared to B (x axis). Thackeray's approach can be refined to focus on high probabilities of conspecificity for pairs of specimens for which log sem is less than -1.61 and for which Δlog sem is less than 0.03. We appeal for the adoption of a concept here called "sigma taxonomy" (as opposed to "alpha taxonomy"), recognizing that boundaries between species are not always well defined. Copyright © 2015 Elsevier GmbH. All rights reserved.

  18. On the distribution of a product of N Gaussian random variables

    NASA Astrophysics Data System (ADS)

    Stojanac, Željka; Suess, Daniel; Kliesch, Martin

    2017-08-01

    The product of Gaussian random variables appears naturally in many applications in probability theory and statistics. It has been known that the distribution of a product of N such variables can be expressed in terms of a Meijer G-function. Here, we compute a similar representation for the corresponding cumulative distribution function (CDF) and provide a power-log series expansion of the CDF based on the theory of the more general Fox H-functions. Numerical computations show that for small values of the argument the CDF of products of Gaussians is well approximated by the lowest orders of this expansion. Analogous results are also shown for the absolute value as well as the square of such products of N Gaussian random variables. For the latter two settings, we also compute the moment generating functions in terms of Meijer G-functions.

  19. Analysis of the Factors Affecting the Interval between Blood Donations Using Log-Normal Hazard Model with Gamma Correlated Frailties.

    PubMed

    Tavakol, Najmeh; Kheiri, Soleiman; Sedehi, Morteza

    2016-01-01

    Time to donating blood plays a major role in a regular donor to becoming continues one. The aim of this study was to determine the effective factors on the interval between the blood donations. In a longitudinal study in 2008, 864 samples of first-time donors in Shahrekord Blood Transfusion Center,  capital city of Chaharmahal and Bakhtiari Province, Iran were selected by a systematic sampling and were followed up for five years. Among these samples, a subset of 424 donors who had at least two successful blood donations were chosen for this study and the time intervals between their donations were measured as response variable. Sex, body weight, age, marital status, education, stay and job were recorded as independent variables. Data analysis was performed based on log-normal hazard model with gamma correlated frailty. In this model, the frailties are sum of two independent components assumed a gamma distribution. The analysis was done via Bayesian approach using Markov Chain Monte Carlo algorithm by OpenBUGS. Convergence was checked via Gelman-Rubin criteria using BOA program in R. Age, job and education were significant on chance to donate blood (P<0.05). The chances of blood donation for the higher-aged donors, clericals, workers, free job, students and educated donors were higher and in return, time intervals between their blood donations were shorter. Due to the significance effect of some variables in the log-normal correlated frailty model, it is necessary to plan educational and cultural program to encourage the people with longer inter-donation intervals to donate more frequently.

  20. Analytical approximations for effective relative permeability in the capillary limit

    NASA Astrophysics Data System (ADS)

    Rabinovich, Avinoam; Li, Boxiao; Durlofsky, Louis J.

    2016-10-01

    We present an analytical method for calculating two-phase effective relative permeability, krjeff, where j designates phase (here CO2 and water), under steady state and capillary-limit assumptions. These effective relative permeabilities may be applied in experimental settings and for upscaling in the context of numerical flow simulations, e.g., for CO2 storage. An exact solution for effective absolute permeability, keff, in two-dimensional log-normally distributed isotropic permeability (k) fields is the geometric mean. We show that this does not hold for krjeff since log normality is not maintained in the capillary-limit phase permeability field (Kj=k·krj) when capillary pressure, and thus the saturation field, is varied. Nevertheless, the geometric mean is still shown to be suitable for approximating krjeff when the variance of ln⁡k is low. For high-variance cases, we apply a correction to the geometric average gas effective relative permeability using a Winsorized mean, which neglects large and small Kj values symmetrically. The analytical method is extended to anisotropically correlated log-normal permeability fields using power law averaging. In these cases, the Winsorized mean treatment is applied to the gas curves for cases described by negative power law exponents (flow across incomplete layers). The accuracy of our analytical expressions for krjeff is demonstrated through extensive numerical tests, using low-variance and high-variance permeability realizations with a range of correlation structures. We also present integral expressions for geometric-mean and power law average krjeff for the systems considered, which enable derivation of closed-form series solutions for krjeff without generating permeability realizations.

Top