Haeckel, Rainer; Wosniok, Werner
2010-10-01
The distributions of many quantities in laboratory medicine are considered to be Gaussian if they are symmetric, although, theoretically, a Gaussian distribution is not plausible for quantities that can attain only non-negative values. If a distribution is skewed, further specification of its type is required, which may be difficult to provide. Skewed (non-Gaussian) distributions found in clinical chemistry usually show only moderately large positive skewness (e.g., the log-normal and χ² distributions). The degree of skewness depends on the magnitude of the empirical biological variation (CV(e)), as demonstrated using the log-normal distribution. A Gaussian distribution with a small CV(e) (e.g., for plasma sodium) is very similar to a log-normal distribution with the same CV(e). In contrast, a relatively large CV(e) (e.g., for plasma aspartate aminotransferase) leads to distinct differences between a Gaussian and a log-normal distribution. If the type of an empirical distribution is unknown, it is proposed that a log-normal distribution be assumed. This avoids distributional assumptions that are not plausible and does not contradict the observation that distributions with small biological variation look very similar to a Gaussian distribution.
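The following sketch illustrates the paper's central point numerically: a Gaussian and a log-normal matched in mean and CV(e) are nearly indistinguishable at a sodium-like CV but differ visibly at an AST-like CV. This is a minimal Python illustration; the means and CVs are assumed values, not the paper's.

    import numpy as np
    from scipy import stats

    def max_pdf_gap(mean, cv):
        """Peak-normalized max gap between matched Gaussian and log-normal pdfs."""
        sd = cv * mean
        sigma2 = np.log(1.0 + cv**2)          # log-normal matching mean and CV
        mu = np.log(mean) - 0.5 * sigma2
        x = np.linspace(mean - 4 * sd, mean + 4 * sd, 2000)
        x = x[x > 0]
        g = stats.norm.pdf(x, loc=mean, scale=sd)
        ln = stats.lognorm.pdf(x, s=np.sqrt(sigma2), scale=np.exp(mu))
        return np.max(np.abs(g - ln)) / np.max(g)

    print(max_pdf_gap(140.0, 0.01))   # sodium-like CV: gap is negligible
    print(max_pdf_gap(25.0, 0.40))    # AST-like CV: clearly distinct shapes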
NASA Astrophysics Data System (ADS)
Fukami, Christine S.; Sullivan, Amy P.; Ryan Fulgham, S.; Murschell, Trey; Borch, Thomas; Smith, James N.; Farmer, Delphine K.
2016-07-01
Particle-into-Liquid Samplers (PILS) have become a standard aerosol collection technique, and are widely used in both ground and aircraft measurements in conjunction with off-line ion chromatography (IC) measurements. Accurate and precise background samples are essential to account for gas-phase components not efficiently removed and any interference in the instrument lines, collection vials or off-line analysis procedures. For aircraft sampling with PILS, backgrounds are typically taken with in-line filters to remove particles prior to sample collection once or twice per flight, with more numerous backgrounds taken on the ground. Here, we use data collected during the Front Range Air Pollution and Photochemistry Éxperiment (FRAPPÉ) to demonstrate not only that multiple background filter samples are essential to attain a representative background, but that the chemical background signals do not follow the Gaussian statistics typically assumed. Instead, the background signals for all chemical components analyzed from 137 background samples (taken from ∼78 total sampling hours over 18 flights) follow a log-normal distribution, meaning that the typical approaches of averaging background samples and/or assuming a Gaussian distribution cause an over-estimation of backgrounds - and thus an underestimation of sample concentrations. Our approach of deriving backgrounds from the peak of the log-normal distribution results in detection limits of 0.25, 0.32, 3.9, 0.17, 0.75 and 0.57 μg m-3 for sub-micron aerosol nitrate (NO3-), nitrite (NO2-), ammonium (NH4+), sulfate (SO42-), potassium (K+) and calcium (Ca2+), respectively. The difference in backgrounds calculated from assuming a Gaussian distribution versus a log-normal distribution was most extreme for NH4+, resulting in a background that was 1.58× that determined from fitting a log-normal distribution.
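A minimal sketch of the background-estimation idea described above: fit a log-normal to blank samples and take its peak (mode) as the background, rather than a Gaussian mean. The synthetic blank values stand in for the 137 PILS filter samples and are assumptions for illustration.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    blanks = rng.lognormal(mean=np.log(0.2), sigma=0.6, size=137)  # ug/m3

    shape, loc, scale = stats.lognorm.fit(blanks, floc=0.0)
    mode = scale * np.exp(-shape**2)   # log-normal peak: exp(mu - sigma^2)
    print(f"log-normal peak background: {mode:.3f} ug/m3")
    print(f"Gaussian-mean background:   {blanks.mean():.3f} ug/m3 (biased high)")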
Shen, Meiyu; Russek-Cohen, Estelle; Slud, Eric V
2016-08-12
Bioequivalence (BE) studies are an essential part of the evaluation of generic drugs. The most common in vivo BE study design is the two-period two-treatment crossover design. AUC (area under the concentration-time curve) and Cmax (maximum concentration) are obtained from the observed concentration-time profiles for each subject from each treatment under each sequence. In the BE evaluation of pharmacokinetic crossover studies, the normality of the univariate response variable, e.g. log(AUC) or log(Cmax), is often assumed in the literature without much evidence. Therefore, we investigate the distributional assumption of the normality of the response variables, log(AUC) and log(Cmax), by simulating concentration-time profiles from two-stage pharmacokinetic models (commonly used in pharmacokinetic research) for a wide range of pharmacokinetic parameters and measurement error structures. Our simulations show that, under reasonable distributional assumptions on the pharmacokinetic parameters, log(AUC) has heavy tails and log(Cmax) is skewed. Sensitivity analyses are conducted to investigate how the distribution of the standardized log(AUC) (or the standardized log(Cmax)) for a large number of simulated subjects deviates from normality if distributions of errors in the pharmacokinetic model for plasma concentrations deviate from normality and if the plasma concentration can be described by different compartmental models.
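To make the simulation approach concrete, here is a hedged sketch: a one-compartment oral-dose model with log-normal between-subject parameters and proportional measurement error, followed by Shapiro-Wilk tests on log(AUC) and log(Cmax). The model structure and parameter values are illustrative assumptions, not the paper's exact two-stage models.

    import numpy as np
    from scipy import stats
    from scipy.integrate import trapezoid

    rng = np.random.default_rng(1)
    n, dose = 2000, 100.0
    t = np.linspace(0.1, 48.0, 200)

    ka = rng.lognormal(np.log(1.0), 0.3, n)    # absorption rate (1/h)
    ke = rng.lognormal(np.log(0.1), 0.3, n)    # elimination rate (1/h)
    V = rng.lognormal(np.log(30.0), 0.2, n)    # volume of distribution (L)

    # one-compartment model with first-order absorption
    C = (dose / V[:, None]) * (ka / (ka - ke))[:, None] * (
        np.exp(-ke[:, None] * t) - np.exp(-ka[:, None] * t))
    C *= np.exp(rng.normal(0.0, 0.1, C.shape))  # proportional measurement error

    log_auc = np.log(trapezoid(C, t, axis=1))
    log_cmax = np.log(C.max(axis=1))
    print(stats.shapiro(log_auc[:500]))         # test normality on a subsample
    print(stats.shapiro(log_cmax[:500]))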
Distribution of transvascular pathway sizes through the pulmonary microvascular barrier.
McNamee, J E
1987-01-01
Mathematical models of solute and water exchange in the lung have been helpful in understanding factors governing the volume flow rate and composition of pulmonary lymph. As experimental data and models become more encompassing, parameter identification becomes more difficult. Pore sizes in these models should approach and eventually become equivalent to actual physiological pathway sizes as more complex and accurate models are tried. However, pore sizes and numbers vary from model to model as new pathway sizes are added. This apparent inconsistency of pore sizes can be explained if it is assumed that the pulmonary blood-lymph barrier is widely heteroporous, for example, being composed of a continuous distribution of pathway sizes. The sieving characteristics of the pulmonary barrier are reproduced by a log normal distribution of pathway sizes (log mean = -0.20, log s.d. = 1.05). A log normal distribution of pathways in the microvascular barrier is shown to follow from a rather general assumption about the nature of the pulmonary endothelial junction.
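A short sketch of the fitted pathway-size law: a log-normal with log mean -0.20 and log s.d. 1.05 (taking these as natural-log parameters, an assumption), used here to estimate the fraction of pathways above a hypothetical radius cutoff.

    import numpy as np
    from scipy import stats

    log_mean, log_sd = -0.20, 1.05          # parameters from the abstract
    pathway = stats.lognorm(s=log_sd, scale=np.exp(log_mean))

    r_cut = 4.0                             # hypothetical cutoff, same units
    print(f"median pathway size: {pathway.median():.2f}")
    print(f"fraction of pathways above {r_cut}: {pathway.sf(r_cut):.4f}")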
Detection of Person Misfit in Computerized Adaptive Tests with Polytomous Items.
ERIC Educational Resources Information Center
van Krimpen-Stoop, Edith M. L. A.; Meijer, Rob R.
2002-01-01
Compared the nominal and empirical null distributions of the standardized log-likelihood statistic for polytomous items for paper-and-pencil (P&P) and computerized adaptive tests (CATs). Results show that the empirical distribution of the statistic differed from the assumed standard normal distribution for both P&P tests and CATs. Also…
Davis, Joe M
2011-10-28
General equations are derived for the distribution of minimum resolution between two chromatographic peaks, when peak heights in a multi-component chromatogram follow a continuous statistical distribution. The derivation draws on published theory by relating the area under the distribution of minimum resolution to the area under the distribution of the ratio of peak heights, which in turn is derived from the peak-height distribution. Two procedures are proposed for the equations' numerical solution. The procedures are applied to the log-normal distribution, which recently was reported to describe the distribution of component concentrations in three complex natural mixtures. For published statistical parameters of these mixtures, the distribution of minimum resolution is similar to that for the commonly assumed exponential distribution of peak heights used in statistical-overlap theory. However, these two distributions of minimum resolution can differ markedly, depending on the scale parameter of the log-normal distribution. Theory for the computation of the distribution of minimum resolution is extended to other cases of interest. With the log-normal distribution of peak heights as an example, the distribution of minimum resolution is computed when small peaks are lost due to noise or detection limits, and when the height of at least one peak is less than an upper limit. The distribution of minimum resolution shifts slightly to lower resolution values in the first case and to markedly larger resolution values in the second one. The theory and numerical procedure are confirmed by Monte Carlo simulation. Copyright © 2011 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Yamazaki, Dai G.; Ichiki, Kiyotomo; Takahashi, Keitaro
2011-12-01
We study the effect of primordial magnetic fields (PMFs) on the anisotropies of the cosmic microwave background (CMB). We assume that the spectrum of PMFs is described by a log-normal distribution, which has a characteristic scale, rather than by a power-law spectrum. This scale is expected to reflect the generation mechanisms, and our analysis is complementary to previous studies with power-law spectra. We calculate power spectra of the energy density and Lorentz force of the log-normal PMFs, and then calculate CMB temperature and polarization angular power spectra from the scalar, vector, and tensor modes of perturbations generated by such PMFs. By comparing these spectra with the WMAP7, QUaD, CBI, Boomerang, and ACBAR data sets, we find that the current CMB data place the strongest constraint at k ≃ 10^-2.5 Mpc^-1, with the upper limit B ≲ 3 nG.
Box-Cox transformation of firm size data in statistical analysis
NASA Astrophysics Data System (ADS)
Chen, Ting Ting; Takaishi, Tetsuya
2014-03-01
Firm size data usually do not show the normality that is often assumed in statistical analysis such as regression analysis. In this study we focus on two measures of firm size: the number of employees and sales. Both deviate considerably from a normal distribution. To improve the normality of these data we transform them by the Box-Cox transformation with appropriate parameters. The Box-Cox transformation parameters are determined so that the transformed data best show the kurtosis of a normal distribution. It is found that the two firm size measures transformed by the Box-Cox transformation show a strong linear relationship. This indicates that the number of employees and sales have similar properties as firm size indicators. The Box-Cox parameters obtained for the firm size data are found to be very close to zero, in which case the Box-Cox transformation is approximately a log-transformation. This suggests that the firm size data we used are approximately log-normally distributed.
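A minimal sketch of the parameter-selection rule described above: scan the Box-Cox λ and keep the value whose transformed data have excess kurtosis closest to zero (the normal value). The synthetic log-normal "firm sizes" are an assumed stand-in, so the selected λ should land near zero.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    size = rng.lognormal(mean=2.0, sigma=1.0, size=5000)  # assumed stand-in data

    def boxcox(x, lam):
        return np.log(x) if abs(lam) < 1e-8 else (x**lam - 1.0) / lam

    lams = np.linspace(-1.0, 1.0, 401)
    excess = [abs(stats.kurtosis(boxcox(size, lam))) for lam in lams]  # 0 if normal
    lam_best = lams[int(np.argmin(excess))]
    print(f"kurtosis-matching lambda: {lam_best:.3f}")  # ~0 => log-transform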
Stochastic Modeling Approach to the Incubation Time of Prionic Diseases
NASA Astrophysics Data System (ADS)
Ferreira, A. S.; da Silva, M. A.; Cressoni, J. C.
2003-05-01
Transmissible spongiform encephalopathies are neurodegenerative diseases for which prions are the attributed pathogenic agents. A widely accepted theory assumes that prion replication is due to a direct interaction between the pathologic (PrPSc) form and the host-encoded (PrPC) conformation, in a kind of autocatalytic process. Here we show that the overall features of the incubation time of prion diseases are readily obtained if the prion reaction is described by a simple mean-field model. An analytical expression for the incubation time distribution then follows by associating the rate constant with a log-normally distributed stochastic variable. The incubation time distribution is then also shown to be log-normal and fits the observed BSE (bovine spongiform encephalopathy) data very well. Computer simulation results also yield the correct BSE incubation time distribution at low PrPC densities.
Log-Normality and Multifractal Analysis of Flame Surface Statistics
NASA Astrophysics Data System (ADS)
Saha, Abhishek; Chaudhuri, Swetaprovo; Law, Chung K.
2013-11-01
The turbulent flame surface is typically highly wrinkled and folded at a multitude of scales controlled by various flame properties. It is useful if the information contained in this complex geometry can be projected onto a simpler regular geometry for the use of spectral, wavelet or multifractal analyses. Here we investigate local flame surface statistics of a turbulent flame expanding under constant pressure. First, the statistics of the local length ratio are experimentally obtained from high-speed Mie scattering images. For a spherically expanding flame, the length ratio on the measurement plane, at predefined equiangular sectors, is defined as the ratio of the actual flame length to the length of a circular arc of radius equal to the average radius of the flame. Assuming isotropic distribution of such flame segments, we convolute suitable forms of the length-ratio probability distribution functions (pdfs) to arrive at corresponding area-ratio pdfs. Both pdfs are found to be nearly log-normally distributed and show self-similar behavior with increasing radius. The near log-normality and rather intermittent behavior of the flame-length ratio suggest similarity with dissipation-rate quantities, which motivates multifractal analysis.
M-dwarf exoplanet surface density distribution. A log-normal fit from 0.07 to 400 AU
NASA Astrophysics Data System (ADS)
Meyer, Michael R.; Amara, Adam; Reggiani, Maddalena; Quanz, Sascha P.
2018-04-01
Aims: We fit a log-normal function to the M-dwarf orbital surface density distribution of gas giant planets, over the mass range 1-10 times that of Jupiter, from 0.07 to 400 AU. Methods: We used a Markov chain Monte Carlo approach to explore the likelihoods of various parameter values consistent with point estimates of the data given our assumed functional form. Results: This fit is consistent with radial velocity, microlensing, and direct-imaging observations, is well-motivated from theoretical and phenomenological points of view, and predicts results of future surveys. We present probability distributions for each parameter and a maximum likelihood estimate solution. Conclusions: We suggest that this function makes more physical sense than other widely used functions, and we explore the implications of our results on the design of future exoplanet surveys.
Flame surface statistics of constant-pressure turbulent expanding premixed flames
NASA Astrophysics Data System (ADS)
Saha, Abhishek; Chaudhuri, Swetaprovo; Law, Chung K.
2014-04-01
In this paper we investigate the local flame surface statistics of constant-pressure turbulent expanding flames. First, the statistics of the local length ratio are experimentally determined from high-speed planar Mie scattering images of spherically expanding flames, with the length ratio on the measurement plane, at predefined equiangular sectors, defined as the ratio of the actual flame length to the length of a circular arc of radius equal to the average radius of the flame. Assuming isotropic distribution of such flame segments, we then convolute suitable forms of the length-ratio probability distribution functions (pdfs) to arrive at the corresponding area-ratio pdfs. It is found that both the length-ratio and area-ratio pdfs are nearly log-normally distributed and show self-similar behavior with increasing radius. The near log-normality and rather intermittent behavior of the flame-length ratio suggest similarity with dissipation-rate quantities, which motivates multifractal analysis.
Ejected Particle Size Distributions from Shocked Metal Surfaces
Schauer, M. M.; Buttler, W. T.; Frayer, D. K.; ...
2017-04-12
Here, we present size distributions for particles ejected from features machined onto the surface of shocked Sn targets. The functional form of the size distributions is assumed to be log-normal, and the characteristic parameters of the distribution are extracted from the measured angular distribution of light scattered from a laser beam incident on the ejected particles. We also found strong evidence for a bimodal distribution of particle sizes with smaller particles evolved from features machined into the target surface and larger particles being produced at the edges of these features.
Applying the log-normal distribution to target detection
NASA Astrophysics Data System (ADS)
Holst, Gerald C.
1992-09-01
Holst and Pickard experimentally determined that MRT responses tend to follow a log-normal distribution. The log-normal distribution appeared reasonable because nearly all visual psychological data are plotted on a logarithmic scale. It has the additional advantage that it is bounded to positive values; an important consideration since probability of detection is often plotted in linear coordinates. Review of published data suggests that the log-normal distribution may have universal applicability. Specifically, the log-normal distribution obtained from MRT tests appears to fit the target transfer function and the probability of detection of rectangular targets.
Analyzing repeated measures semi-continuous data, with application to an alcohol dependence study.
Liu, Lei; Strawderman, Robert L; Johnson, Bankole A; O'Quigley, John M
2016-02-01
Two-part random effects models (Olsen and Schafer,(1) Tooze et al.(2)) have been applied to repeated measures of semi-continuous data, characterized by a mixture of a substantial proportion of zero values and a skewed distribution of positive values. In the original formulation of this model, the natural logarithm of the positive values is assumed to follow a normal distribution with a constant variance parameter. In this article, we review and consider three extensions of this model, allowing the positive values to follow (a) a generalized gamma distribution, (b) a log-skew-normal distribution, and (c) a normal distribution after the Box-Cox transformation. We allow for the possibility of heteroscedasticity. Maximum likelihood estimation is shown to be conveniently implemented in SAS Proc NLMIXED. The performance of the methods is compared through applications to daily drinking records in a secondary data analysis from a randomized controlled trial of topiramate for alcohol dependence treatment. We find that all three models provide a significantly better fit than the log-normal model, and there exists strong evidence for heteroscedasticity. We also compare the three models by the likelihood ratio tests for non-nested hypotheses (Vuong(3)). The results suggest that the generalized gamma distribution provides the best fit, though no statistically significant differences are found in pairwise model comparisons. © The Author(s) 2012.
Universal Distribution of Litter Decay Rates
NASA Astrophysics Data System (ADS)
Forney, D. C.; Rothman, D. H.
2008-12-01
Degradation of litter is the result of many physical, chemical and biological processes. The high variability of these processes likely accounts for the progressive slowdown of decay with litter age. This age dependence is commonly thought to result from the superposition of processes with different decay rates k. Here we assume an underlying continuous yet unknown distribution p(k) of decay rates [1]. To seek its form, we analyze the mass-time history of 70 LIDET [2] litter data sets obtained under widely varying conditions. We construct a regularized inversion procedure to find the best fitting distribution p(k) with the least degrees of freedom. We find that the resulting p(k) is universally consistent with a lognormal distribution, i.e. a Gaussian distribution of log k, characterized by a dataset-dependent mean and variance of log k. This result is supported by a recurring observation that microbial populations on leaves are log-normally distributed [3]. Simple biological processes cause the frequent appearance of the log-normal distribution in ecology [4]. Environmental factors, such as soil nitrate, soil aggregate size, soil hydraulic conductivity, total soil nitrogen, soil denitrification, and soil respiration, have all been observed to be log-normally distributed [5]. Litter degradation rates depend on many coupled, multiplicative factors, which provides a fundamental basis for the lognormal distribution. Using this insight, we systematically estimated the mean and variance of log k for 512 data sets from the LIDET study. We find the mean strongly correlates with temperature and precipitation, while the variance appears to be uncorrelated with the main environmental factors and is thus likely more correlated with chemical composition and/or ecology. These results indicate the possibility that the distribution in rates reflects, at least in part, the distribution of microbial niches. [1] B.P. Boudreau, B.R. Ruddick, American Journal of Science, 291, 507, (1991). [2] M. Harmon, Forest Science Data Bank: TD023 [Database]. LTER Intersite Fine Litter Decomposition Experiment (LIDET): Long-Term Ecological Research, (2007). [3] G.A. Beattie, S.E. Lindow, Phytopathology 89, 353 (1999). [4] R.A. May, Ecology and Evolution of Communities, A Pattern of Species Abundance and Diversity, 81 (1975). [5] T.B. Parkin, J.A. Robinson, Advances in Soil Science 20, Analysis of Lognormal Data, 194 (1992).
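The forward model behind this inversion can be written as m(t) = ∫ p(k) e^(-kt) dk with p(k) log-normal; the mean and variance of log k are the dataset-dependent parameters. A small sketch under assumed parameter values:

    import numpy as np
    from scipy import stats
    from scipy.integrate import quad

    def mass_remaining(t, mu_logk, sd_logk):
        """m(t) = integral of p(k) * exp(-k t) over k, with p(k) log-normal."""
        p = stats.lognorm(s=sd_logk, scale=np.exp(mu_logk))
        return quad(lambda k: p.pdf(k) * np.exp(-k * t), 0.0, np.inf)[0]

    for t in (0.5, 1.0, 2.0, 5.0, 10.0):   # years; parameters are assumed
        print(t, round(mass_remaining(t, mu_logk=-1.0, sd_logk=1.0), 3))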
On the generation of log-Lévy distributions and extreme randomness
NASA Astrophysics Data System (ADS)
Eliazar, Iddo; Klafter, Joseph
2011-10-01
The log-normal distribution is prevalent across the sciences, as it emerges from the combination of multiplicative processes and the central limit theorem (CLT). The CLT, beyond yielding the normal distribution, also yields the class of Lévy distributions. The log-Lévy distributions are the Lévy counterparts of the log-normal distribution, they appear in the context of ultraslow diffusion processes, and they are categorized by Mandelbrot as belonging to the class of extreme randomness. In this paper, we present a natural stochastic growth model from which both the log-normal distribution and the log-Lévy distributions emerge universally—the former in the case of deterministic underlying setting, and the latter in the case of stochastic underlying setting. In particular, we establish a stochastic growth model which universally generates Mandelbrot’s extreme randomness.
Gradually truncated log-normal in USA publicly traded firm size distribution
NASA Astrophysics Data System (ADS)
Gupta, Hari M.; Campanha, José R.; de Aguiar, Daniela R.; Queiroz, Gabriel A.; Raheja, Charu G.
2007-03-01
We study the statistical distribution of firm size for USA and Brazilian publicly traded firms through the Zipf plot technique. Sales volume is used to measure firm size. The Brazilian firm size distribution is given by a log-normal distribution without any adjustable parameter. However, we also need to consider different parameters of the log-normal distribution for the largest firms in the distribution, which are mostly foreign firms. The log-normal distribution has to be gradually truncated after a certain critical value for USA firms. Therefore, the original hypothesis of proportional effect proposed by Gibrat is valid, with some modification, for very large firms. We also consider the possible mechanisms behind this distribution.
Log-normal distribution from a process that is not multiplicative but is additive.
Mouri, Hideaki
2013-10-01
The central limit theorem ensures that a sum of random variables tends to a Gaussian distribution as their total number tends to infinity. However, for a class of positive random variables, we find that the sum tends faster to a log-normal distribution. Although the sum tends eventually to a Gaussian distribution, the distribution of the sum is always close to a log-normal distribution rather than to any Gaussian distribution if the summands are numerous enough. This is in contrast to the current consensus that any log-normal distribution is due to a product of random variables, i.e., a multiplicative process, or equivalently to nonlinearity of the system. In fact, the log-normal distribution is also observable for a sum, i.e., an additive process that is typical of linear systems. We show conditions for such a sum, an analytical example, and an application to random scalar fields such as those of turbulence.
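A Monte Carlo sketch of the claim: sums of a few hundred heavy-tailed positive summands are compared against fitted log-normal and normal laws. The summand distribution and sample sizes are assumptions; the KS statistic versus the fitted log-normal should come out much smaller.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    n_summands, n_sums = 300, 20000
    sums = rng.lognormal(mean=0.0, sigma=2.0,
                         size=(n_sums, n_summands)).sum(axis=1)

    ln_params = stats.lognorm.fit(sums, floc=0.0)
    print("KS vs fitted log-normal:", stats.kstest(sums, 'lognorm', ln_params))
    print("KS vs fitted normal:    ",
          stats.kstest(sums, 'norm', (sums.mean(), sums.std())))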
Chang, Wen-Ruey; Matz, Simon; Chang, Chien-Chi
2014-05-01
The maximum coefficient of friction that can be supported at the shoe and floor interface without a slip is usually called the available coefficient of friction (ACOF) for human locomotion. The probability of a slip could be estimated using a statistical model by comparing the ACOF with the required coefficient of friction (RCOF), assuming that both coefficients have stochastic distributions. An investigation of the stochastic distributions of the ACOF of five different floor surfaces under dry, water and glycerol conditions is presented in this paper. One hundred friction measurements were performed on each floor surface under each surface condition. The Kolmogorov-Smirnov goodness-of-fit test was used to determine if the distribution of the ACOF was a good fit with the normal, log-normal and Weibull distributions. The results indicated that the ACOF distributions had a slightly better match with the normal and log-normal distributions than with the Weibull in only three out of 15 cases with statistical significance. The results are far more complex than what had heretofore been published and different scenarios could emerge. Since the ACOF is compared with the RCOF for the estimate of slip probability, the distribution of the ACOF in seven cases could be considered a constant for this purpose when the ACOF is much lower or higher than the RCOF. A few cases could be represented by a normal distribution for practical reasons based on their skewness and kurtosis values without statistical significance. No representation could be found in three of the 15 cases. Copyright © 2013 Elsevier Ltd and The Ergonomics Society. All rights reserved.
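A sketch of the goodness-of-fit comparison, applied to synthetic stand-in ACOF data (the study's measurements are not reproduced here). Note that fitting parameters before a Kolmogorov-Smirnov test makes the p-values approximate.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    acof = rng.normal(0.45, 0.05, 100).clip(min=0.01)   # assumed stand-in data

    for name in ('norm', 'lognorm', 'weibull_min'):
        dist = getattr(stats, name)
        params = dist.fit(acof)
        stat, p = stats.kstest(acof, name, args=params)
        print(f"{name:12s} KS stat={stat:.3f}  p={p:.3f}")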
Bellin, Alberto; Tonina, Daniele
2007-10-30
Available models of solute transport in heterogeneous formations fall short of providing a complete characterization of the predicted concentration. This is a serious drawback, especially in risk analysis, where confidence intervals and probabilities of exceeding threshold values are required. Our contribution to filling this gap of knowledge is a probability distribution model for the local concentration of conservative tracers migrating in heterogeneous aquifers. Our model accounts for dilution, mechanical mixing within the sampling volume and spreading due to formation heterogeneity. It is developed by modeling local concentration dynamics with an Ito Stochastic Differential Equation (SDE) that under the hypothesis of statistical stationarity leads to the Beta probability distribution function (pdf) for the solute concentration. This model shows large flexibility in capturing the smoothing effect of the sampling volume and the associated reduction of the probability of exceeding large concentrations. Furthermore, it is fully characterized by the first two moments of the solute concentration, and these are the same pieces of information required for standard geostatistical techniques employing Normal or Log-Normal distributions. Additionally, we show that in the absence of pore-scale dispersion and for point concentrations the pdf model converges to the binary distribution of [Dagan, G., 1982. Stochastic modeling of groundwater flow by unconditional and conditional probabilities, 2, The solute transport. Water Resour. Res. 18 (4), 835-848.], while it approaches the Normal distribution for sampling volumes much larger than the characteristic scale of the aquifer heterogeneity. Furthermore, we demonstrate that the same model, with the spatial moments replacing the statistical moments, can be applied to estimate the proportion of the plume volume where solute concentrations are above or below critical thresholds. Application of this model to point and vertically averaged bromide concentrations from the first Cape Cod tracer test and to a set of numerical simulations confirms the above findings and for the first time shows the superiority of the Beta model to both Normal and Log-Normal models in interpreting field data. Furthermore, we show that assuming a priori that local concentrations are normally or log-normally distributed may result in a severe underestimate of the probability of exceeding large concentrations.
Statistical distributions of ultra-low dose CT sinograms and their fundamental limits
NASA Astrophysics Data System (ADS)
Lee, Tzu-Cheng; Zhang, Ruoqiao; Alessio, Adam M.; Fu, Lin; De Man, Bruno; Kinahan, Paul E.
2017-03-01
Low-dose CT imaging is typically constrained to be diagnostic. However, there are applications for even lower-dose CT imaging, including image registration across multi-frame CT images and attenuation correction for PET/CT imaging. We define this as the ultra-low-dose (ULD) CT regime, where the exposure level is a factor of 10 lower than current low-dose CT technique levels. In the ULD regime it is possible to use statistically principled image reconstruction methods that make full use of the raw data information. Since most statistically based iterative reconstruction methods assume that the post-log noise distribution is close to Poisson or Gaussian, our goal is to understand the statistical distribution of ULD CT data with different non-positivity correction methods, and to understand when iterative reconstruction methods may be effective in producing images that are useful for image registration or attenuation correction in PET/CT imaging. We first used phantom measurements and calibrated simulation to reveal how the noise distribution deviates from the normal assumption under the ULD CT flux environment. In summary, our results indicate that there are three general regimes: (1) Diagnostic CT, where post-log data are well modeled by a normal distribution. (2) Low-dose CT, where the normal distribution remains a reasonable approximation and statistically principled (post-log) methods that assume a normal distribution have an advantage. (3) An ULD regime that is photon-starved and where the quadratic approximation is no longer effective. For instance, a total integral density of 4.8 (the ideal line integral for 24 cm of water) for a 120 kVp, 0.5 mAs radiation source is the maximum value for which a definitive maximum likelihood estimate could be found. This leads to fundamental limits in the estimation of ULD CT data when using a standard data processing stream.
Parametric modelling of cost data in medical studies.
Nixon, R M; Thompson, S G
2004-04-30
The cost of medical resources used is often recorded for each patient in clinical studies in order to inform decision-making. Although cost data are generally skewed to the right, interest is in making inferences about the population mean cost. Common methods for non-normal data, such as data transformation, assuming asymptotic normality of the sample mean or non-parametric bootstrapping, are not ideal. This paper describes possible parametric models for analysing cost data. Four example data sets are considered, which have different sample sizes and degrees of skewness. Normal, gamma, log-normal, and log-logistic distributions are fitted, together with three-parameter versions of the latter three distributions. Maximum likelihood estimates of the population mean are found; confidence intervals are derived by a parametric BCa bootstrap and checked by MCMC methods. Differences between model fits and inferences are explored. Skewed parametric distributions fit cost data better than the normal distribution, and should in principle be preferred for estimating the population mean cost. However for some data sets, we find that models that fit badly can give similar inferences to those that fit well. Conversely, particularly when sample sizes are not large, different parametric models that fit the data equally well can lead to substantially different inferences. We conclude that inferences are sensitive to the choice of statistical model, which itself can remain uncertain unless there is enough data to model the tail of the distribution accurately. Investigating the sensitivity of conclusions to choice of model should thus be an essential component of analysing cost data in practice. Copyright 2004 John Wiley & Sons, Ltd.
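A hedged sketch of the model-comparison step: fit candidate distributions by maximum likelihood, compare AICs, and report each model's implied mean cost. The synthetic "costs" are an assumption; scipy's 'fisk' plays the role of the log-logistic.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    cost = rng.gamma(shape=1.5, scale=800.0, size=300)   # assumed stand-in data

    for name in ('norm', 'gamma', 'lognorm', 'fisk'):
        dist = getattr(stats, name)
        params = dist.fit(cost)
        ll = np.sum(dist.logpdf(cost, *params))
        aic = 2 * len(params) - 2 * ll
        print(f"{name:8s} AIC={aic:9.1f}  mean={dist.mean(*params):9.1f}")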
Neti, Prasad V.S.V.; Howell, Roger W.
2008-01-01
Recently, the distribution of radioactivity among a population of cells labeled with 210Po was shown to be well described by a log-normal distribution function (J Nucl Med 47, 6 (2006) 1049-1058) with the aid of an autoradiographic approach. To ascertain the influence of Poisson statistics on the interpretation of the autoradiographic data, the present work reports a detailed statistical analysis of these data. Methods: The measured distributions of alpha particle tracks per cell were subjected to statistical tests with Poisson (P), log-normal (LN), and Poisson-log-normal (P-LN) models. Results: The LN distribution function best describes the distribution of radioactivity among cell populations exposed to 0.52 and 3.8 kBq/mL 210Po-citrate. When cells were exposed to 67 kBq/mL, the P-LN distribution function gave a better fit; however, the underlying activity distribution remained log-normal. Conclusions: The present analysis generally provides further support for the use of LN distributions to describe the cellular uptake of radioactivity. Care should be exercised when analyzing autoradiographic data on activity distributions to ensure that Poisson processes do not distort the underlying LN distribution. PMID:16741316
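A sketch of the P-LN model: the Poisson counting likelihood integrated over a log-normal activity distribution. The (mu, sigma) values are assumptions for illustration.

    import numpy as np
    from scipy import stats
    from scipy.integrate import quad

    def pln_pmf(k, mu, sigma):
        """P(K=k) when K|lam ~ Poisson(lam) and log(lam) ~ N(mu, sigma^2)."""
        f = lambda lam: stats.poisson.pmf(k, lam) * \
                        stats.lognorm.pdf(lam, s=sigma, scale=np.exp(mu))
        return quad(f, 0.0, np.inf)[0]

    pmf = [pln_pmf(k, mu=1.0, sigma=0.8) for k in range(15)]
    print(np.round(pmf, 4))   # compare with observed tracks-per-cell frequencies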
Ordinal probability effect measures for group comparisons in multinomial cumulative link models.
Agresti, Alan; Kateri, Maria
2017-03-01
We consider simple ordinal model-based probability effect measures for comparing distributions of two groups, adjusted for explanatory variables. An "ordinal superiority" measure summarizes the probability that an observation from one distribution falls above an independent observation from the other distribution, adjusted for explanatory variables in a model. The measure applies directly to normal linear models and to a normal latent variable model for ordinal response variables. It equals Φ(β/2) for the corresponding ordinal model that applies a probit link function to cumulative multinomial probabilities, for standard normal cdf Φ and effect β that is the coefficient of the group indicator variable. For the more general latent variable model for ordinal responses that corresponds to a linear model with other possible error distributions and corresponding link functions for cumulative multinomial probabilities, the ordinal superiority measure equals exp(β)/[1+exp(β)] with the log-log link and equals approximately exp(β/2)/[1+exp(β/2)] with the logit link, where β is the group effect. Another ordinal superiority measure generalizes the difference of proportions from binary to ordinal responses. We also present related measures directly for ordinal models for the observed response that need not assume corresponding latent response models. We present confidence intervals for the measures and illustrate with an example. © 2016, The International Biometric Society.
Rockfall travel distances theoretical distributions
NASA Astrophysics Data System (ADS)
Jaboyedoff, Michel; Derron, Marc-Henri; Pedrazzini, Andrea
2017-04-01
The probability of propagation of rockfalls is a key part of hazard assessment, because it permits extrapolation of the probability of propagation either from partial data or purely theoretically. The propagation can be assumed to be frictional, which permits the average propagation to be described by an energy line corresponding to the loss of energy along the path. But the loss of energy can also be treated as a multiplicative process or a purely random process. The distributions of rockfall block stop points can be deduced from such simple models; they lead to Gaussian, inverse-Gaussian, log-normal or negative exponential distributions. The theoretical background is presented, and comparisons of some of these models with existing data indicate that these assumptions are relevant. The results are based either on theoretical considerations or on fitting results. They are potentially very useful for rockfall hazard zoning and risk assessment. This approach will need further investigation.
NASA Astrophysics Data System (ADS)
Iwata, Takaki; Yamazaki, Yoshihiro; Kuninaka, Hiroto
2013-08-01
In this study, we examine the validity of the transition of the human height distribution from the log-normal distribution to the normal distribution during puberty, as suggested in an earlier study [Kuninaka et al.: J. Phys. Soc. Jpn. 78 (2009) 125001]. Our data analysis reveals that, in late puberty, the variation in height decreases as children grow. Thus, the classification of a height dataset by age at this stage leads us to analyze a mixture of distributions with larger means and smaller variations. This mixture distribution has a negative skewness and is consequently closer to the normal distribution than to the log-normal distribution. The opposite case occurs in early puberty and the mixture distribution is positively skewed, which resembles the log-normal distribution rather than the normal distribution. Thus, this scenario mimics the transition during puberty. Additionally, our scenario is realized through a numerical simulation based on a statistical model. The present study does not support the transition suggested by the earlier study.
Distribution Functions of Sizes and Fluxes Determined from Supra-Arcade Downflows
NASA Technical Reports Server (NTRS)
McKenzie, D.; Savage, S.
2011-01-01
The frequency distributions of sizes and fluxes of supra-arcade downflows (SADs) provide information about the process of their creation. For example, a fractal creation process may be expected to yield a power-law distribution of sizes and/or fluxes. We examine 120 cross-sectional areas and magnetic flux estimates found by Savage & McKenzie for SADs, and find that (1) the areas are consistent with a log-normal distribution and (2) the fluxes are consistent with both a log-normal and an exponential distribution. Neither set of measurements is compatible with a power-law distribution nor a normal distribution. As a demonstration of the applicability of these findings to improved understanding of reconnection, we consider a simple SAD growth scenario with minimal assumptions, capable of producing a log-normal distribution.
WE-H-207A-03: The Universality of the Lognormal Behavior of [F-18]FLT PET SUV Measurements
DOE Office of Scientific and Technical Information (OSTI.GOV)
Scarpelli, M; Eickhoff, J; Perlman, S
Purpose: Log transforming [F-18]FDG PET standardized uptake values (SUVs) has been shown to lead to normal SUV distributions, which allows utilization of powerful parametric statistical models. This study identified the optimal transformation leading to normally distributed [F-18]FLT PET SUVs from solid tumors and offers an example of how normal distributions permit analysis of non-independent/correlated measurements. Methods: Forty patients with various metastatic diseases underwent up to six FLT PET/CT scans during treatment. Tumors were identified by a nuclear medicine physician and manually segmented. Average uptake was extracted for each patient, giving a global SUVmean (gSUVmean) for each scan. The Shapiro-Wilk test was used to test distribution normality. One-parameter Box-Cox transformations were applied to each of the six gSUVmean distributions and the optimal transformation was found by selecting the parameter that maximized the Shapiro-Wilk test statistic. The relationship between gSUVmean and a serum biomarker (VEGF) collected at imaging timepoints was determined using a linear mixed effects model (LMEM), which accounted for correlated/non-independent measurements from the same individual. Results: Untransformed gSUVmean distributions were found to be significantly non-normal (p<0.05). The optimal transformation parameter had a value of 0.3 (95%CI: −0.4 to 1.6). Given that the optimal parameter was close to zero (which corresponds to a log transformation), the data were subsequently log transformed. All log transformed gSUVmean distributions were normally distributed (p>0.10 for all timepoints). Log transformed data were incorporated into the LMEM. VEGF serum levels significantly correlated with gSUVmean (p<0.001), revealing a log-linear relationship between SUVs and the underlying biology. Conclusion: Failure to account for correlated/non-independent measurements can lead to invalid conclusions and motivated transformation to normally distributed SUVs. The log transformation was found to be close to optimal and sufficient for obtaining normally distributed FLT PET SUVs. These transformations allow utilization of powerful LMEMs when analyzing quantitative imaging metrics.
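A minimal sketch of the transformation search described in Methods: iterate the one-parameter Box-Cox λ and keep the value maximizing the Shapiro-Wilk statistic. The synthetic gSUVmean values are an assumed stand-in.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    suv = rng.lognormal(mean=1.0, sigma=0.5, size=40)   # assumed stand-in data

    def boxcox(x, lam):
        return np.log(x) if abs(lam) < 1e-8 else (x**lam - 1.0) / lam

    lams = np.linspace(-2.0, 2.0, 401)
    W = [stats.shapiro(boxcox(suv, lam))[0] for lam in lams]
    lam_opt = lams[int(np.argmax(W))]
    print(f"optimal lambda = {lam_opt:.2f}")  # near 0 => log transform suffices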
Empirical analysis on the runners' velocity distribution in city marathons
NASA Astrophysics Data System (ADS)
Lin, Zhenquan; Meng, Fan
2018-01-01
In recent decades, much research has been performed on human temporal activity and mobility patterns, while few investigations have examined the features of the velocity distributions of human mobility patterns. In this paper, we investigated empirically the velocity distributions of finishers in the New York City, Chicago, Berlin and London marathons. By statistical analyses of the finish time records, we captured some statistical features of human behaviors in marathons: (1) The velocity distributions of all finishers and of partial finishers in the fastest age group both follow a log-normal distribution; (2) In the New York City marathon, the velocity distribution of all male runners in eight 5-kilometer internal timing courses undergoes two transitions: from a log-normal distribution at the initial stage (several initial courses) to a Gaussian distribution at the middle stage (several middle courses), and back to a log-normal distribution at the last stage (several last courses); (3) The intensity of the competition, which is described by the root-mean-square value of the rank changes of all runners, weakens from the initial stage to the middle stage, corresponding to the transition of the velocity distribution from log-normal to Gaussian, and when the competition gets stronger in the last course of the middle stage, a transition from the Gaussian distribution back to the log-normal one occurs at the last stage. This study may enrich research on human mobility patterns and draw attention to the velocity features of human mobility.
powerbox: Arbitrarily structured, arbitrary-dimension boxes and log-normal mocks
NASA Astrophysics Data System (ADS)
Murray, Steven G.
2018-05-01
powerbox creates density grids (or boxes) with an arbitrary two-point distribution (i.e. power spectrum). The software works in any number of dimensions, creates Gaussian or Log-Normal fields, and measures power spectra of output fields to ensure consistency. The primary motivation for creating the code was the simple creation of log-normal mock galaxy distributions, but the methodology can be used for other applications.
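For orientation, this is the standard construction underlying log-normal mocks (a sketch of the method, not powerbox's actual API): draw a Gaussian field with a target power spectrum by FFT filtering, exponentiate, and rescale so the overdensity has zero mean. The power-law spectrum and field amplitude are assumptions.

    import numpy as np

    rng = np.random.default_rng(8)
    N = 256
    k = np.fft.fftfreq(N)
    kx, ky = np.meshgrid(k, k, indexing='ij')
    kk = np.sqrt(kx**2 + ky**2)
    kk[0, 0] = np.inf                    # suppress the k=0 (mean) mode

    white = np.fft.fft2(rng.normal(size=(N, N)))
    gauss = np.real(np.fft.ifft2(white * kk**-1.0))   # amplitude k^-1 => P(k) ~ k^-2
    gauss *= 0.5 / gauss.std()           # assumed field amplitude

    delta = np.exp(gauss - gauss.var() / 2.0) - 1.0   # log-normal overdensity
    print(delta.mean(), delta.min())     # mean ~ 0, bounded below by -1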
Estimating sales and sales market share from sales rank data for consumer appliances
NASA Astrophysics Data System (ADS)
Touzani, Samir; Van Buskirk, Robert
2016-06-01
Our motivation in this work is to find an adequate probability distribution to fit sales volumes of different appliances. This distribution allows for the translation of sales rank into sales volume. This paper shows that the log-normal distribution and specifically the truncated version are well suited for this purpose. We demonstrate that using sales proxies derived from a calibrated truncated log-normal distribution function can be used to produce realistic estimates of market average product prices, and product attributes. We show that the market averages calculated with the sales proxies derived from the calibrated, truncated log-normal distribution provide better market average estimates than sales proxies estimated with simpler distribution functions.
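A sketch of the rank-to-volume translation: invert the survival function of an upper-truncated log-normal, so that rank r of N products corresponds to the r/N upper tail. The calibration numbers are assumptions.

    import numpy as np
    from scipy import stats

    mu, sigma, v_max = np.log(500.0), 1.2, 2e5   # assumed calibration
    dist = stats.lognorm(s=sigma, scale=np.exp(mu))
    trunc = dist.cdf(v_max)                      # mass below the truncation point

    def volume_from_rank(rank, n_products):
        upper_tail = rank / n_products           # fraction of products selling more
        return dist.ppf(trunc * (1.0 - upper_tail))

    for r in (1, 10, 100, 1000):
        print(r, round(volume_from_rank(r, 5000.0), 1))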
Log-Normal Distribution of Cosmic Voids in Simulations and Mocks
NASA Astrophysics Data System (ADS)
Russell, E.; Pycke, J.-R.
2017-01-01
Following up on previous studies, we complete here a full analysis of the void size distributions of the Cosmic Void Catalog based on three different simulation and mock catalogs: dark matter (DM), haloes, and galaxies. Based on this analysis, we attempt to answer two questions: Is a three-parameter log-normal distribution a good candidate to satisfy the void size distributions obtained from different types of environments? Is there a direct relation between the shape parameters of the void size distribution and the environmental effects? In an attempt to answer these questions, we find here that all void size distributions of these data samples satisfy the three-parameter log-normal distribution whether the environment is dominated by DM, haloes, or galaxies. In addition, the shape parameters of the three-parameter log-normal void size distribution seem highly affected by environment, particularly existing substructures. Therefore, we show two quantitative relations given by linear equations between the skewness and the maximum tree depth, and between the variance of the void size distribution and the maximum tree depth, directly from the simulated data. In addition to this, we find that the percentage of voids with nonzero central density in the data sets has a critical importance. If the number of voids with nonzero central density reaches ≥3.84% in a simulation/mock sample, then a second population is observed in the void size distributions. This second population emerges as a second peak in the log-normal void size distribution at larger radius.
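A minimal sketch of the three-parameter log-normal fit (shape, location, scale) plus the shape statistics the study links to environment; the synthetic void radii are an assumption.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)
    radii = 2.0 + rng.lognormal(mean=1.5, sigma=0.45, size=1000)  # assumed, Mpc/h

    s, loc, scale = stats.lognorm.fit(radii)     # free loc => 3-parameter fit
    print(f"shape={s:.3f} loc={loc:.2f} scale={scale:.2f}")
    print(f"sample skewness={stats.skew(radii):.3f} variance={radii.var():.3f}")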
Aldega, L.; Eberl, D.D.
2005-01-01
Illite crystals in siliciclastic sediments are heterogeneous assemblages of detrital material coming from various source rocks and, at paleotemperatures >70 °C, of superimposed diagenetic modification in the parent sediment. We distinguished the relative proportions of 2M1 detrital illite and possible diagenetic 1Md + 1M illite by a combined analysis of crystal-size distribution and illite polytype quantification. We found that the proportions of 1Md + 1M and 2M1 illite could be determined from crystallite thickness measurements (BWA method, using the MudMaster program) by unmixing measured crystallite thickness distributions using theoretical and calculated log-normal and/or asymptotic distributions. The end-member components that we used to unmix the measured distributions were three asymptotic-shaped distributions (assumed to be the diagenetic component of the mixture, the 1Md + 1M polytypes) calculated using the Galoper program (Phase A was simulated using 500 crystals per cycle of nucleation and growth, Phase B = 333/cycle, and Phase C = 250/cycle), and one theoretical log-normal distribution (Phase D, assumed to approximate the detrital 2M1 component of the mixture). In addition, quantitative polytype analysis was carried out using the RockJock software for comparison. The two techniques gave comparable results (r2 = 0.93), which indicates that the unmixing method permits one to calculate the proportion of illite polytypes and, therefore, the proportion of 2M1 detrital illite, from crystallite thickness measurements. The overall illite crystallite thicknesses in the samples were found to be a function of the relative proportions of thick 2M1 and thin 1Md + 1M illite. The percentage of illite layers in I-S mixed layers correlates with the mean crystallite thickness of the 1Md + 1M polytypes, indicating that these polytypes, rather than the 2M1 polytype, participate in I-S mixed layering.
NASA Astrophysics Data System (ADS)
Wang, Yu; Fan, Jie; Xu, Ye; Sun, Wei; Chen, Dong
2018-05-01
In this study, an inexact log-normal-based stochastic chance-constrained programming model was developed for solving the non-point source pollution issues caused by agricultural activities. Compared to the general stochastic chance-constrained programming model, the main advantage of the proposed model is that it allows random variables to be expressed as a log-normal distribution, rather than a general normal distribution. Possible deviations in solutions caused by irrational parameter assumptions were avoided. The agricultural system management in the Erhai Lake watershed was used as a case study, where critical system factors, including rainfall and runoff amounts, show characteristics of a log-normal distribution. Several interval solutions were obtained under different constraint-satisfaction levels, which were useful in evaluating the trade-off between system economy and reliability. The applied results show that the proposed model could help decision makers to design optimal production patterns under complex uncertainties. The successful application of this model is expected to provide a good example for agricultural management in many other watersheds.
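To illustrate the key modeling ingredient, here is a sketch of a single log-normal chance constraint reduced to its deterministic equivalent, alongside the answer a (mis-specified) normal assumption with matched moments would give. All numbers are assumptions.

    import numpy as np
    from scipy import stats

    mu, sigma, alpha = np.log(40.0), 0.5, 0.95   # assumed log-normal load, t/yr

    # P(L <= c) >= alpha  <=>  c >= exp(mu + sigma * z_alpha)
    c_lognormal = np.exp(mu + sigma * stats.norm.ppf(alpha))

    # the same constraint under a normal law with matched mean and variance
    m = np.exp(mu + sigma**2 / 2.0)
    sd = m * np.sqrt(np.exp(sigma**2) - 1.0)
    c_normal = stats.norm.ppf(alpha, loc=m, scale=sd)

    print(f"log-normal deterministic equivalent: {c_lognormal:.1f} t/yr")
    print(f"normal-assumption equivalent:        {c_normal:.1f} t/yr")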
Understanding a Normal Distribution of Data.
Maltenfort, Mitchell G
2015-12-01
Assuming data follow a normal distribution is essential for many common statistical tests. However, what are normal data and when can we assume that a data set follows this distribution? What can be done to analyze non-normal data?
Levine, M W
1991-01-01
Simulated neural impulse trains were generated by a digital realization of the integrate-and-fire model. The variability in these impulse trains had as its origin a random noise of specified distribution. Three different distributions were used: the normal (Gaussian) distribution (no skew, normokurtic), a first-order gamma distribution (positive skew, leptokurtic), and a uniform distribution (no skew, platykurtic). Despite these differences in the distribution of the variability, the distributions of the intervals between impulses were nearly indistinguishable. These inter-impulse distributions were better fit with a hyperbolic gamma distribution than a hyperbolic normal distribution, although one might expect a better approximation for normally distributed inverse intervals. Consideration of why the inter-impulse distribution is independent of the distribution of the causative noise suggests two putative interval distributions that do not depend on the assumed noise distribution: the log normal distribution, which is predicated on the assumption that long intervals occur with the joint probability of small input values, and the random walk equation, which is the diffusion equation applied to a random walk model of the impulse generating process. Either of these equations provides a more satisfactory fit to the simulated impulse trains than the hyperbolic normal or hyperbolic gamma distributions. These equations also provide better fits to impulse trains derived from the maintained discharges of ganglion cells in the retinae of cats or goldfish. It is noted that both equations are free from the constraint that the coefficient of variation (CV) have a maximum of unity.(ABSTRACT TRUNCATED AT 250 WORDS)
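A sketch in the spirit of the simulation: an integrate-and-fire accumulator driven by noise with identical mean and variance but Gaussian, first-order gamma, or uniform shape; the interspike-interval statistics come out nearly indistinguishable. Threshold and rates are assumptions.

    import numpy as np

    rng = np.random.default_rng(9)
    threshold, n_steps = 20.0, 200000

    def isi_stats(noise):
        """Integrate-and-fire: accumulate noise, fire and reset at threshold."""
        v, last, intervals = 0.0, 0, []
        for t in range(n_steps):
            v += noise[t]
            if v >= threshold:
                intervals.append(t - last)
                last, v = t, 0.0
        x = np.asarray(intervals, float)
        return x.mean(), x.std() / x.mean()

    drives = {
        'normal':  rng.normal(1.0, 1.0, n_steps),
        'gamma':   rng.gamma(1.0, 1.0, n_steps),                         # positive skew
        'uniform': rng.uniform(1 - np.sqrt(3), 1 + np.sqrt(3), n_steps),  # no skew
    }
    for name, nz in drives.items():   # all drives have mean 1, variance 1
        m, cv = isi_stats(nz)
        print(f"{name:8s} mean ISI={m:6.2f}  CV={cv:.3f}")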
Population Synthesis of Radio and γ-ray Normal, Isolated Pulsars Using Markov Chain Monte Carlo
NASA Astrophysics Data System (ADS)
Billman, Caleb; Gonthier, P. L.; Harding, A. K.
2013-04-01
We present preliminary results of a population statistics study of normal pulsars (NPs) from the Galactic disk using Markov Chain Monte Carlo techniques optimized according to two different methods. The first method compares the detected and simulated cumulative distributions of a series of pulsar characteristics, varying the model parameters to maximize the overall agreement. The advantage of this method is that the distributions do not have to be binned. The other method varies the model parameters to maximize the log of the maximum likelihood obtained from the comparisons of four two-dimensional distributions of radio and γ-ray pulsar characteristics. The advantage of this method is that it provides a confidence region of the model parameter space. The computer code simulates neutron stars at birth using Monte Carlo procedures and evolves them to the present assuming initial spatial, kick velocity, magnetic field, and period distributions. Pulsars are spun down to the present and given radio and γ-ray emission characteristics, implementing an empirical γ-ray luminosity model. A comparison group of radio NPs detected in ten radio surveys is used to normalize the simulation, adjusting the model radio luminosity to match a birth rate. We include the Fermi pulsars in the forthcoming second pulsar catalog. We present preliminary results comparing the simulated and detected distributions of radio and γ-ray NPs along with a confidence region in the parameter space of the assumed models. We express our gratitude for the generous support of the National Science Foundation (REU and RUI), the Fermi Guest Investigator Program and the NASA Astrophysics Theory and Fundamental Program.
Sileshi, G
2006-10-01
Researchers and regulatory agencies often make statistical inferences from insect count data using modelling approaches that assume homogeneous variance. Such models do not allow for formal appraisal of variability which in its different forms is the subject of interest in ecology. Therefore, the objectives of this paper were to (i) compare models suitable for handling variance heterogeneity and (ii) select optimal models to ensure valid statistical inferences from insect count data. The log-normal, standard Poisson, Poisson corrected for overdispersion, zero-inflated Poisson, the negative binomial distribution and zero-inflated negative binomial models were compared using six count datasets on foliage-dwelling insects and five families of soil-dwelling insects. Akaike's and Schwarz Bayesian information criteria were used for comparing the various models. Over 50% of the counts were zeros even in locally abundant species such as Ootheca bennigseni Weise, Mesoplatys ochroptera Stål and Diaecoderus spp. The Poisson model after correction for overdispersion and the standard negative binomial distribution model provided better description of the probability distribution of seven out of the 11 insects than the log-normal, standard Poisson, zero-inflated Poisson or zero-inflated negative binomial models. It is concluded that excess zeros and variance heterogeneity are common data phenomena in insect counts. If not properly modelled, these properties can invalidate the normal distribution assumptions resulting in biased estimation of ecological effects and jeopardizing the integrity of the scientific inferences. Therefore, it is recommended that statistical models appropriate for handling these data properties be selected using objective criteria to ensure efficient statistical inference.
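A hedged sketch of the model-comparison machinery: Poisson, negative binomial, and zero-inflated Poisson fits compared by AIC via direct maximum likelihood (scipy only; statsmodels offers these models too). The zero-heavy synthetic counts are an assumption.

    import numpy as np
    from scipy import stats, optimize, special

    rng = np.random.default_rng(10)
    counts = np.where(rng.random(300) < 0.5, 0, rng.poisson(3.0, 300))

    def aic(nll, k):
        return 2 * k + 2 * nll

    # Poisson: the MLE for the rate is the sample mean
    nll_pois = -stats.poisson.logpmf(counts, counts.mean()).sum()

    # negative binomial (n, p), unconstrained parameterization
    def nll_nb(th):
        n, p = np.exp(th[0]), special.expit(th[1])
        return -stats.nbinom.logpmf(counts, n, p).sum()
    res_nb = optimize.minimize(nll_nb, [0.0, 0.0])

    # zero-inflated Poisson with mixing weight pi
    def nll_zip(th):
        pi, lam = special.expit(th[0]), np.exp(th[1])
        p_zero = pi + (1 - pi) * np.exp(-lam)
        ll = np.where(counts == 0, np.log(p_zero),
                      np.log1p(-pi) + stats.poisson.logpmf(counts, lam))
        return -ll.sum()
    res_zip = optimize.minimize(nll_zip, [0.0, 1.0])

    print(f"Poisson AIC: {aic(nll_pois, 1):8.1f}")
    print(f"NegBin  AIC: {aic(res_nb.fun, 2):8.1f}")
    print(f"ZIP     AIC: {aic(res_zip.fun, 2):8.1f}")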
NASA Astrophysics Data System (ADS)
Annunziata, Mario Alberto; Petri, Alberto; Pontuale, Giorgio; Zaccaria, Andrea
2016-10-01
We have considered the statistical distributions of the volumes of 1131 products exported by 148 countries. We have found that the form of these distributions is not unique but depends heavily on the level of development of the nation, as expressed by macroeconomic indicators like GDP, GDP per capita, total export and a recently introduced measure of countries' economic complexity called fitness. We have identified three major classes: a) an incomplete log-normal shape, truncated on the left side, for the less developed countries; b) a complete log-normal, with a wider range of volumes, for nations with intermediate economies; and c) a strongly asymmetric shape for countries with a high degree of development. Finally, the log-normality hypothesis has been checked for the distributions of all 148 countries through different tests, Kolmogorov-Smirnov and Cramér-von Mises, confirming that it cannot be rejected only for the countries with intermediate economies.
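Both goodness-of-fit tests named above are available in SciPy. The hedged sketch below applies them to synthetic stand-ins for the export volumes; note that estimating the parameters from the same data makes the nominal p-values only approximate.

```python
# Hedged sketch: test log-normality of export volumes with the KS and
# Cramer-von Mises tests, on synthetic (not the authors') data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
volumes = rng.lognormal(mean=10.0, sigma=2.0, size=1131)  # hypothetical

logv = np.log(volumes)
mu, sigma = logv.mean(), logv.std(ddof=1)  # parameters estimated from data

ks = stats.kstest(logv, "norm", args=(mu, sigma))
cvm = stats.cramervonmises(logv, "norm", args=(mu, sigma))
print(ks.pvalue, cvm.pvalue)  # large p-values: log-normality not rejected
```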
Log-Normal Turbulence Dissipation in Global Ocean Models
NASA Astrophysics Data System (ADS)
Pearson, Brodie; Fox-Kemper, Baylor
2018-03-01
Data from turbulent numerical simulations of the global ocean demonstrate that the dissipation of kinetic energy obeys a nearly log-normal distribution even at large horizontal scales O(10 km). As the horizontal scales of resolved turbulence are larger than the ocean is deep, the Kolmogorov-Yaglom theory for intermittency in 3D homogeneous, isotropic turbulence cannot apply; instead, the down-scale potential enstrophy cascade of quasigeostrophic turbulence should. Yet, energy dissipation obeys approximate log-normality, robustly across depths, seasons, regions, and subgrid schemes. The distribution parameters, skewness and kurtosis, show small systematic departures from log-normality with depth and subgrid friction schemes. Log-normality suggests that a few high-dissipation locations dominate the integrated energy and enstrophy budgets, which should be taken into account when making inferences from simplified models and inferring global energy budgets from sparse observations.
Frequency distribution of lithium in leaves of Lycium andersonii
DOE Office of Scientific and Technical Information (OSTI.GOV)
Romney, E.M.; Wallace, A.; Kinnear, J.
1977-01-01
Lycium andersonii A. Gray is an accumulator of Li. Assays were made of 200 samples of it collected from six different locations within the Northern Mojave Desert. Mean concentrations of Li varied from location to location and tended not to follow a log_e-normal distribution, and to follow a normal distribution only poorly. There was some negative skewness in the log_e distribution that did exist. The results imply that the variation in accumulation of Li depends upon the native supply of Li. Possibly the Li supply and the ability of L. andersonii plants to accumulate it are both log_e-normally distributed. The mean leaf concentration of Li in all locations was 29 μg/g, but the maximum was 166 μg/g.
Scarpelli, Matthew; Eickhoff, Jens; Cuna, Enrique; Perlman, Scott; Jeraj, Robert
2018-01-30
The statistical analysis of positron emission tomography (PET) standardized uptake value (SUV) measurements is challenging due to the skewed nature of SUV distributions. This limits utilization of powerful parametric statistical models for analyzing SUV measurements. An ad hoc approach, which is frequently used in practice, is to blindly use a log transformation, which may or may not result in normal SUV distributions. This study sought to identify optimal transformations leading to normally distributed PET SUVs extracted from tumors and to assess the effects of therapy on the optimal transformations. The optimal transformation for producing normal distributions of tumor SUVs was identified by iterating the Box-Cox transformation parameter (λ) and selecting the parameter that maximized the Shapiro-Wilk P-value. Optimal transformations were identified for tumor SUVmax distributions at both pre and post treatment. This study included 57 patients that underwent 18F-fluorodeoxyglucose (18F-FDG) PET scans (publicly available dataset). In addition, to test the generality of our transformation methodology, we included analysis of 27 patients that underwent 18F-fluorothymidine (18F-FLT) PET scans at our institution. After applying the optimal Box-Cox transformations, neither the pre nor the post treatment 18F-FDG SUV distributions deviated significantly from normality (P > 0.10). Similar results were found for 18F-FLT PET SUV distributions (P > 0.10). For both 18F-FDG and 18F-FLT SUV distributions, the skewness and kurtosis increased from pre to post treatment, leading to a decrease in the optimal Box-Cox transformation parameter from pre to post treatment. There were types of distributions encountered for both 18F-FDG and 18F-FLT for which a log transformation was not optimal for producing normal SUV distributions. Optimization of the Box-Cox transformation offers a solution for identifying normalizing SUV transformations when the log transformation is insufficient. The log transformation is not always the appropriate transformation for producing normally distributed PET SUVs.
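A hedged sketch of the transformation search described above, with synthetic SUVs standing in for the patient data: scan the Box-Cox parameter λ over a grid and keep the value that maximizes the Shapiro-Wilk P-value.

```python
# Hedged sketch: grid-search the Box-Cox lambda that maximizes the
# Shapiro-Wilk P-value. SUVs are synthetic stand-ins, not patient data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(21)
suv = rng.lognormal(1.0, 0.5, 57)  # hypothetical SUVmax values (positive)

grid = np.linspace(-2, 2, 401)
pvals = [stats.shapiro(stats.boxcox(suv, lmbda=lam)).pvalue for lam in grid]
lam_opt = grid[int(np.argmax(pvals))]
print(lam_opt, max(pvals))  # for log-normal data the optimum is near 0 (log)
```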
Generating log-normal mock catalog of galaxies in redshift space
NASA Astrophysics Data System (ADS)
Agrawal, Aniket; Makiya, Ryu; Chiang, Chi-Ting; Jeong, Donghui; Saito, Shun; Komatsu, Eiichiro
2017-10-01
We present a public code to generate a mock galaxy catalog in redshift space assuming a log-normal probability density function (PDF) of the galaxy and matter density fields. We draw galaxies by Poisson-sampling the log-normal field, and calculate the velocity field from the linearised continuity equation of the matter field, assuming zero vorticity. This procedure yields a PDF of the pairwise velocity fields that is qualitatively similar to that of N-body simulations. We check the fidelity of the catalog, showing that the measured two-point correlation function and power spectrum in real space agree precisely with the input. We find that a linear bias relation in the power spectrum does not guarantee a linear bias relation in the density contrasts, leading to a cross-correlation coefficient of matter and galaxies deviating from unity on small scales. We also find that linearising the Jacobian of the real-to-redshift-space mapping provides a poor model for the two-point statistics in redshift space. That is, non-linear redshift-space distortion is dominated by non-linearity in the Jacobian. The power spectrum in redshift space shows a damping on small scales that is qualitatively similar to that of the well-known Fingers-of-God (FoG) effect due to random velocities, except that the log-normal mock does not include random velocities. This damping is a consequence of non-linearity in the Jacobian, and thus attributing the damping of the power spectrum solely to FoG, as is commonly done in the literature, is misleading.
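The core sampling step can be illustrated in a few lines. This is a hedged one-dimensional toy version (grid size, mean galaxy count and Gaussian variance are invented), not the public 3D code, which additionally imposes a target power spectrum and assigns velocities.

```python
# Hedged 1D toy of the core idea: Poisson-sample galaxy counts from a
# log-normal overdensity field defined on a grid.
import numpy as np

rng = np.random.default_rng(42)
ngrid, nbar, sigma_G = 256, 5.0, 0.8   # assumed grid size, mean count, Gaussian std

g = rng.normal(0.0, sigma_G, size=ngrid)      # Gaussian field per cell
delta = np.exp(g - 0.5 * sigma_G**2) - 1.0    # log-normal overdensity, mean zero
counts = rng.poisson(nbar * (1.0 + delta))    # Poisson sampling per cell
print(counts.mean(), counts.var())            # mean ~ nbar; variance inflated
```

The shift by sigma_G²/2 in the exponent guarantees the overdensity has zero mean, so the mock reproduces the target mean galaxy density.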
NASA Astrophysics Data System (ADS)
Duarte Queirós, Sílvio M.
2012-07-01
We discuss the modification of the Kapteyn multiplicative process using the q-product of Borges [E.P. Borges, A possible deformed algebra and calculus inspired in nonextensive thermostatistics, Physica A 340 (2004) 95]. Depending on the value of the index q, a generalisation of the log-Normal distribution is obtained. Namely, the distribution has a heavier tail for small (when q<1) or large (when q>1) values of the variable under analysis. The usual log-Normal distribution is retrieved when q=1, which corresponds to the traditional Kapteyn multiplicative process. The main statistical features of this distribution, as well as related random number generators and tables of quantiles of the Kolmogorov-Smirnov distance, are presented. Finally, we illustrate the validity of this scenario by describing a set of variables of biological and financial origin.
Usuda, Kan; Kono, Koichi; Dote, Tomotaro; Shimizu, Hiroyasu; Tominaga, Mika; Koizumi, Chisato; Nakase, Emiko; Toshina, Yumi; Iwai, Junko; Kawasaki, Takashi; Akashi, Mitsuya
2002-04-01
In a previous article, we showed a log-normal distribution of boron and lithium in human urine. This type of distribution is common in both biological and nonbiological applications. It can be observed when the effects of many independent variables are combined, each of which may have any underlying distribution. Although elemental excretion depends on many variables, the one-compartment open model following a first-order process can be used to explain the elimination of elements. The rate of excretion is proportional to the amount of a given element present; that is, the same percentage of the existing element is eliminated per unit time, and the element concentration in the elimination time-course is a deterministic, exponentially decaying function of time. Sampling is of a stochastic nature, so the set of time points in the elimination phase at which samples were obtained is expected to follow a Normal distribution. The time variable appears in the exponent of this function, so a concentration histogram is that of an exponential transformation of Normally distributed time. This is why the element concentration shows a log-normal distribution. The distribution is determined not by the element concentration itself, but by the time variable that enters the pharmacokinetic equation.
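In symbols, the argument reads as follows (a compact restatement under the abstract's one-compartment, first-order assumptions; C_0, k, μ and σ are generic placeholders):

```latex
\[
  C(t) = C_0 e^{-kt}, \qquad t \sim \mathcal{N}(\mu, \sigma^2)
  \;\Longrightarrow\;
  \ln C = \ln C_0 - kt \sim \mathcal{N}\!\left(\ln C_0 - k\mu,\; k^2\sigma^2\right),
\]
so the sampled concentration $C$ is log-normally distributed.
```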
ERIC Educational Resources Information Center
Xu, Xueli; von Davier, Matthias
2008-01-01
The general diagnostic model (GDM) utilizes located latent classes for modeling a multidimensional proficiency variable. In this paper, the GDM is extended by employing a log-linear model for multiple populations that assumes constraints on parameters across multiple groups. This constrained model is compared to log-linear models that assume…
Scoring in genetically modified organism proficiency tests based on log-transformed results.
Thompson, Michael; Ellison, Stephen L R; Owen, Linda; Mathieson, Kenneth; Powell, Joanne; Key, Pauline; Wood, Roger; Damant, Andrew P
2006-01-01
The study considers data from 2 UK-based proficiency schemes and includes data from a total of 29 rounds and 43 test materials over a period of 3 years. The results from the 2 schemes are similar and reinforce each other. The amplification process used in quantitative polymerase chain reaction determinations predicts a mixture of normal, binomial, and log-normal distributions dominated by the latter two. As predicted, the study results consistently follow a positively skewed distribution. Log-transformation prior to calculating z-scores is effective in establishing near-symmetric distributions that are sufficiently close to normal to justify interpretation on the basis of the normal distribution.
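A hedged sketch of the scoring step with invented results: log-transform, then compute z-scores. The robust median/MAD centring used here is one common proficiency-scheme variant and is our assumption, not necessarily these schemes' exact protocol.

```python
# Hedged sketch: z-scores computed on log-transformed proficiency-test
# results, using robust (median/MAD) location and scale. Data invented.
import numpy as np

results = np.array([0.8, 1.1, 0.9, 2.5, 1.3, 0.7, 1.0])  # reported values
logx = np.log(results)

med = np.median(logx)
mad = 1.4826 * np.median(np.abs(logx - med))  # MAD scaled to SD under normality
z = (logx - med) / mad
print(np.round(z, 2))  # |z| > 2 would flag questionable results
```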
Generating log-normal mock catalog of galaxies in redshift space
DOE Office of Scientific and Technical Information (OSTI.GOV)
Agrawal, Aniket; Makiya, Ryu; Saito, Shun
We present a public code to generate a mock galaxy catalog in redshift space assuming a log-normal probability density function (PDF) of galaxy and matter density fields. We draw galaxies by Poisson-sampling the log-normal field, and calculate the velocity field from the linearised continuity equation of matter fields, assuming zero vorticity. This procedure yields a PDF of the pairwise velocity fields that is qualitatively similar to that of N-body simulations. We check fidelity of the catalog, showing that the measured two-point correlation function and power spectrum in real space agree with the input precisely. We find that a linear biasmore » relation in the power spectrum does not guarantee a linear bias relation in the density contrasts, leading to a cross-correlation coefficient of matter and galaxies deviating from unity on small scales. We also find that linearising the Jacobian of the real-to-redshift space mapping provides a poor model for the two-point statistics in redshift space. That is, non-linear redshift-space distortion is dominated by non-linearity in the Jacobian. The power spectrum in redshift space shows a damping on small scales that is qualitatively similar to that of the well-known Fingers-of-God (FoG) effect due to random velocities, except that the log-normal mock does not include random velocities. This damping is a consequence of non-linearity in the Jacobian, and thus attributing the damping of the power spectrum solely to FoG, as commonly done in the literature, is misleading.« less
Serebrianyĭ, A M; Akleev, A V; Aleshchenko, A V; Antoshchina, M M; Kudriashova, O V; Riabchenko, N I; Semenova, L P; Pelevina, I I
2011-01-01
Using the micronucleus (MN) assay with a cytochalasin B cytokinesis block, the mean frequency of blood lymphocytes with MN was determined in 76 Moscow inhabitants, 35 people from Obninsk and 122 from the Chelyabinsk region. In contrast to the distribution of individuals by the spontaneous frequency of cells with aberrations, which was shown to be binomial (Kusnetzov et al., 1980), the distribution of individuals by the spontaneous frequency of cells with MN in all three cohorts can be regarded as log-normal (χ² test). The distribution of individuals in the combined cohorts (Moscow and Obninsk inhabitants) and in the pooled cohort of all subjects must with high reliability be regarded as log-normal (0.70 and 0.86, respectively), but cannot be regarded as Poisson, binomial or normal. Taking into account that a log-normal distribution of children by the spontaneous frequency of lymphocytes with MN was also observed in a survey of 473 children from different kindergartens in Moscow, we conclude that log-normality is a regularity inherent in this type of damage to the lymphocyte genome. In contrast, the distribution of individuals by the frequency of lymphocytes with MN induced by in vitro irradiation must in most cases be regarded as normal. This distributional character indicates that the appearance of damage (genomic instability) in a single lymphocyte of an individual increases the probability of damage appearing in other lymphocytes. We propose that damaged stem-cell lymphocyte progenitors exchange information with undamaged cells, a bystander-effect type of process. It may also be supposed that transmission of damage to daughter cells occurs at the time of stem-cell division.
NASA Astrophysics Data System (ADS)
Wang, Huiqin; Wang, Xue; Lynette, Kibe; Cao, Minghua
2018-06-01
The performance of multiple-input multiple-output wireless optical communication systems that adopt Q-ary pulse position modulation over a spatially correlated log-normal fading channel is analyzed in terms of the uncoded bit error rate and ergodic channel capacity. The analysis is based on Wilkinson's method, which approximates the distribution of a sum of correlated log-normal random variables by a log-normal random variable. The analytical and simulation results corroborate that increasing the correlation coefficients among sub-channels leads to system performance degradation. Moreover, receiver diversity is more effective at resisting the channel fading caused by spatial correlation.
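Wilkinson's moment-matching idea can be sketched directly: match the first two exact moments of a sum of correlated log-normal branches to a single log-normal. Branch parameters and the correlation below are invented.

```python
# Hedged sketch of Wilkinson's method for two correlated log-normal
# branches: S = exp(X1) + exp(X2), X ~ N(mu, cov), approximated by
# exp(Z) with Z ~ N(mz, sz^2) matching E[S] and E[S^2].
import numpy as np

mu = np.array([0.0, 0.0])            # means of the underlying Gaussians
sig = np.array([0.5, 0.5])           # their standard deviations
rho = 0.3                            # assumed branch correlation
cov = np.array([[sig[0]**2, rho * sig[0] * sig[1]],
                [rho * sig[0] * sig[1], sig[1]**2]])

# Exact first and second moments of S
m1 = np.sum(np.exp(mu + 0.5 * np.diag(cov)))
m2 = sum(np.exp(mu[i] + mu[j] + 0.5 * (cov[i, i] + cov[j, j]) + cov[i, j])
         for i in range(2) for j in range(2))

# Moment matching: E[S] = exp(mz + sz^2/2), E[S^2] = exp(2 mz + 2 sz^2)
sz2 = np.log(m2 / m1**2)
mz = np.log(m1) - 0.5 * sz2
print(mz, sz2)
```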
Ventilation-perfusion distribution in normal subjects.
Beck, Kenneth C; Johnson, Bruce D; Olson, Thomas P; Wilson, Theodore A
2012-09-01
Functional values of LogSD of the ventilation distribution (σ(V)) have been reported previously, but functional values of LogSD of the perfusion distribution (σ(q)) and the coefficient of correlation between ventilation and perfusion (ρ) have not been measured in humans. Here, we report values for σ(V), σ(q), and ρ obtained from wash-in data for three gases: helium and two soluble gases, acetylene and dimethyl ether. Normal subjects inspired gas containing the test gases, and the concentrations of the gases at end-expiration during the first 10 breaths were measured with the subjects at rest and at increasing levels of exercise. The regional distribution of ventilation and perfusion was described by a bivariate log-normal distribution with parameters σ(V), σ(q), and ρ, and these parameters were evaluated by matching the values of expired gas concentrations calculated for this distribution to the measured values. Values of cardiac output and LogSD ventilation/perfusion (Va/Q) were obtained. At rest, σ(q) is high (1.08 ± 0.12). With the onset of exercise, σ(q) decreases to 0.85 ± 0.09 but remains higher than σ(V) (0.43 ± 0.09) at all exercise levels. ρ increases to 0.87 ± 0.07, and the value of LogSD Va/Q for light and moderate exercise is primarily the result of the difference between the magnitudes of σ(q) and σ(V). With known values for the parameters, the bivariate distribution describes the comprehensive distribution of ventilation and perfusion that underlies the distribution of the Va/Q ratio.
Incubation period of Ebola hemorrhagic virus subtype Zaire.
Eichner, Martin; Dowell, Scott F; Firese, Nina
2011-06-01
Ebola hemorrhagic fever has killed over 1300 people, mostly in equatorial Africa. There is still uncertainty about the natural reservoir of the virus and about some of the factors involved in disease transmission. Until now, a maximum incubation period of 21 days has been assumed. We analyzed data collected during the Ebola outbreak (subtype Zaire) in Kikwit, Democratic Republic of the Congo, in 1995 using maximum likelihood inference and assuming a log-normally distributed incubation period. The mean incubation period was estimated to be 12.7 days (standard deviation 4.31 days), indicating that about 4.1% of patients may have incubation periods longer than 21 days. If the risk of new cases is to be reduced to 1% then 25 days should be used when investigating the source of an outbreak, when determining the duration of surveillance for contacts, and when declaring the end of an outbreak.
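As a hedged numerical check, assuming the reported 12.7 and 4.31 days are the mean and standard deviation of the log-normal incubation period itself, the tail probability beyond 21 days comes out close to the quoted 4.1%:

```python
# Hedged check of the quoted tail probability, converting the reported
# mean/SD of the incubation period to underlying-normal parameters.
import numpy as np
from scipy import stats

mean, sd = 12.7, 4.31
sigma2 = np.log(1.0 + (sd / mean) ** 2)   # variance of the underlying normal
mu = np.log(mean) - 0.5 * sigma2

p21 = 1.0 - stats.norm.cdf((np.log(21.0) - mu) / np.sqrt(sigma2))
print(f"P(incubation > 21 d) ~ {p21:.3f}")  # ~0.04-0.05, near the quoted 4.1%
```

The small residual difference plausibly reflects a different parametrization in the original fit; the 25-day recommendation corresponds to pushing this tail probability down to about 1%.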
Snell, Kym Ie; Ensor, Joie; Debray, Thomas Pa; Moons, Karel Gm; Riley, Richard D
2017-01-01
If individual participant data are available from multiple studies or clusters, then a prediction model can be externally validated multiple times. This allows the model's discrimination and calibration performance to be examined across different settings. Random-effects meta-analysis can then be used to quantify overall (average) performance and heterogeneity in performance. This typically assumes a normal distribution of 'true' performance across studies. We conducted a simulation study to examine this normality assumption for various performance measures relating to a logistic regression prediction model. We simulated data across multiple studies with varying degrees of variability in baseline risk or predictor effects and then evaluated the shape of the between-study distribution in the C-statistic, calibration slope, calibration-in-the-large, and E/O statistic, and possible transformations thereof. We found that a normal between-study distribution was usually reasonable for the calibration slope and calibration-in-the-large; however, the distributions of the C-statistic and E/O were often skewed across studies, particularly in settings with large variability in the predictor effects. Normality was vastly improved when using the logit transformation for the C-statistic and the log transformation for E/O, and therefore we recommend these scales be used for meta-analysis. An illustrative example is given using a random-effects meta-analysis of the performance of QRISK2 across 25 general practices.
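The recommended scales are easy to apply before pooling. A minimal sketch with invented per-study values follows; a real analysis would weight by within-study variances in a random-effects model rather than taking naive means.

```python
# Hedged sketch: move per-study C-statistics to the logit scale and E/O
# ratios to the log scale before meta-analytic pooling. Values invented.
import numpy as np

c = np.array([0.72, 0.68, 0.81, 0.75])    # per-study C-statistics
eo = np.array([0.95, 1.10, 0.88, 1.02])   # per-study E/O ratios

logit_c = np.log(c / (1 - c))             # pool on this scale
log_eo = np.log(eo)                       # and this one
print(logit_c.mean(), log_eo.mean())      # naive unweighted pooled means
```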
Size distribution of radon daughter particles in uranium mine atmospheres.
George, A C; Hinchliffe, L; Sladowski, R
1975-06-01
The size distribution of radon daughters was measured in several uranium mines using four compact diffusion batteries and a round jet cascade impactor. Simultaneously, measurements were made of uncombined fractions of radon daughters, radon concentration, working level and particle concentration. The size distributions found for radon daughters were log-normal. The activity median diameters ranged from 0.09 μm to 0.3 μm with a mean value of 0.17 μm. Geometric standard deviations were in the range from 1.3 to 4 with a mean value of 2.7. Uncombined fractions expressed in accordance with the ICRP definition ranged from 0.004 to 0.16 with a mean value of 0.04. The radon daughter sizes in these mines are greater than the sizes assumed by various authors in calculating respiratory tract dose. The disparity may reflect the widening use of diesel-powered equipment in large uranium mines.
Reliability Analysis of the Gradual Degradation of Semiconductor Devices.
1983-07-20
[Fragmentary OCR excerpt] The report notes that related material appears in the literature under the heading of linear models or linear statistical models, but that material is not used in the report. For catastrophic failure, the treatment first modifies the system-loss formula and then proceeds to the analysis, in which tabulated failure times are analyzed by simple linear regression under an assumed log-normal/Arrhenius activation model.
[Quantitative study of diesel/CNG buses exhaust particulate size distribution in a road tunnel].
Zhu, Chun; Zhang, Xu
2010-10-01
Vehicle emission is one of the main sources of fine/ultra-fine particles in many cities. This study first presents daily mean particle size distributions of a mixed diesel/CNG bus traffic flow, measured over four consecutive days in an Australian road tunnel. Emission factors (EFs) for the particle size distributions of diesel buses and CNG buses are obtained by multiple linear regression (MLR); the particle distributions of diesel buses and CNG buses are observed to be single accumulation-mode and nuclei-mode distributions, respectively. The particle size distributions of the mixed traffic flow are decomposed into two log-normal fitting curves for each 30-min interval mean scan; the goodness of fit between the combined fitting curves and the corresponding in-situ scans, for 90 fitted scans in total, ranges from 0.972 to 0.998. Finally, the particle size distributions of diesel buses and CNG buses are quantified by statistical box-whisker charts. For the log-normal particle size distribution of diesel buses, accumulation-mode diameters are 74.5-86.5 nm and geometric standard deviations are 1.88-2.05. For the log-normal particle size distribution of CNG buses, nuclei-mode diameters are 19.9-22.9 nm and geometric standard deviations are 1.27-1.30.
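A hedged sketch of the two-mode decomposition: fit a sum of two log-normal modes (nuclei plus accumulation) to a synthetic 30-min mean scan by least squares. Sizes are in nm and all numbers are invented.

```python
# Hedged sketch: decompose a particle-size scan into two log-normal
# modes with nonlinear least squares. Data are synthetic.
import numpy as np
from scipy.optimize import curve_fit

def lognormal_mode(d, n, dg, sg):
    # dN/dlnD for one mode: total number n, geometric mean dg, GSD sg
    return n / (np.sqrt(2 * np.pi) * np.log(sg)) * np.exp(
        -0.5 * (np.log(d / dg) / np.log(sg)) ** 2)

def two_modes(d, n1, dg1, sg1, n2, dg2, sg2):
    return lognormal_mode(d, n1, dg1, sg1) + lognormal_mode(d, n2, dg2, sg2)

d = np.logspace(np.log10(10), np.log10(400), 60)            # diameters, nm
truth = two_modes(d, 8e3, 21.0, 1.28, 5e3, 80.0, 1.95)      # CNG + diesel modes
scan = truth * (1 + 0.05 * np.random.default_rng(1).normal(size=d.size))

p0 = [1e4, 20, 1.3, 1e4, 90, 2.0]                           # starting guesses
popt, _ = curve_fit(two_modes, d, scan, p0=p0, maxfev=20000)
print(np.round(popt, 2))  # recovered (n, dg, sg) for each mode
```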
Tavakol, Najmeh; Kheiri, Soleiman; Sedehi, Morteza
2016-01-01
The time to donating blood plays a major role in whether a regular donor becomes a continuing one. The aim of this study was to determine the factors affecting the interval between blood donations. In a longitudinal study in 2008, 864 first-time donors at the Shahrekord Blood Transfusion Center, in the capital of Chaharmahal and Bakhtiari Province, Iran, were selected by systematic sampling and were followed up for five years. Among these, a subset of 424 donors who had at least two successful blood donations was chosen for this study, and the time intervals between their donations were measured as the response variable. Sex, body weight, age, marital status, education, place of residence and job were recorded as independent variables. Data analysis was based on a log-normal hazard model with correlated gamma frailty. In this model, the frailties are the sum of two independent components, each assumed to follow a gamma distribution. The analysis was done via a Bayesian approach using a Markov Chain Monte Carlo algorithm in OpenBUGS. Convergence was checked via the Gelman-Rubin criteria using the BOA program in R. Age, job and education had significant effects on the chance of donating blood (P<0.05). The chances of blood donation were higher for older donors, clerical workers, labourers, the self-employed, students and more educated donors, and correspondingly the time intervals between their blood donations were shorter. Given the significant effects of some variables in the log-normal correlated frailty model, it is necessary to plan educational and cultural programs to encourage people with longer inter-donation intervals to donate more frequently.
Advanced information processing system
NASA Technical Reports Server (NTRS)
Lala, J. H.
1984-01-01
Design and performance details of the advanced information processing system (AIPS) for fault and damage tolerant data processing on aircraft and spacecraft are presented. AIPS comprises several computers distributed throughout the vehicle and linked by a damage tolerant data bus. Most I/O functions are available to all the computers, which run in a TDMA mode. Each computer performs separate specific tasks in normal operation and assumes other tasks in degraded modes. Redundant software assures that all fault monitoring, logging and reporting are automated, together with control functions. Redundant duplex links and damage-spread limitation provide the fault tolerance. Details of an advanced design of a laboratory-scale proof-of-concept system are described, including functional operations.
Stick-slip behavior in a continuum-granular experiment.
Geller, Drew A; Ecke, Robert E; Dahmen, Karin A; Backhaus, Scott
2015-12-01
We report moment distribution results from a laboratory experiment, similar in character to an isolated strike-slip earthquake fault, consisting of sheared elastic plates separated by a narrow gap filled with a two-dimensional granular medium. Local measurement of strain displacements of the plates at 203 spatial points located adjacent to the gap allows direct determination of the event moments and their spatial and temporal distributions. We show that events consist of spatially coherent, larger motions and spatially extended (noncoherent), smaller events. The noncoherent events have a probability distribution of event moment consistent with an M^(-3/2) power-law scaling, with Poisson-distributed recurrence times. Coherent events have a log-normal moment distribution and a mean temporal recurrence. As the applied normal pressure increases, there are more coherent events and their log-normal distribution broadens and shifts to larger average moment.
Comparison of parametric and bootstrap method in bioequivalence test.
Ahn, Byung-Jin; Yim, Dong-Seok
2009-10-01
The estimation of 90% parametric confidence intervals (CIs) of mean AUC and Cmax ratios in bioequivalence (BE) tests is based upon the assumption that formulation effects in log-transformed data are normally distributed. To compare the parametric CIs with those obtained from nonparametric methods, we performed repeated estimation on bootstrap-resampled datasets. The AUC and Cmax values from 3 archived datasets were used. BE tests on 1,000 resampled datasets from each archived dataset were performed using SAS (Enterprise Guide Ver.3). Bootstrap nonparametric 90% CIs of formulation effects were then compared with the parametric 90% CIs of the original datasets. The 90% CIs of formulation effects estimated from the 3 archived datasets were slightly different from the nonparametric 90% CIs obtained from BE tests on resampled datasets. Histograms and density curves of formulation effects obtained from resampled datasets were similar to those of a normal distribution. However, in 2 of 3 resampled log(AUC) datasets, the estimates of formulation effects did not follow the Gaussian distribution. Bias-corrected and accelerated (BCa) CIs, one of the nonparametric CIs of formulation effects, shifted outside the parametric 90% CIs of the archived datasets in these 2 non-normally distributed resampled log(AUC) datasets. Currently, the 80~125% rule based upon the parametric 90% CIs is widely accepted under the assumption of normally distributed formulation effects in log-transformed data. However, nonparametric CIs may be a better choice when data do not follow this assumption.
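A minimal sketch of the comparison, using synthetic within-subject log(AUC) differences: a parametric t-based 90% CI next to a percentile-bootstrap 90% CI, both back-transformed to the ratio scale for the 80~125% rule. The paper used SAS and BCa intervals; the plain percentile version here is our simplification.

```python
# Hedged sketch: parametric vs. percentile-bootstrap 90% CIs for the
# mean test-reference log(AUC) difference. Data are synthetic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
d = rng.normal(0.02, 0.15, size=24)   # within-subject log(AUC) differences

# Parametric 90% CI for the mean difference
se = d.std(ddof=1) / np.sqrt(d.size)
lo, hi = d.mean() + np.array([-1, 1]) * stats.t.ppf(0.95, d.size - 1) * se

# Nonparametric percentile bootstrap 90% CI
boot = np.array([rng.choice(d, d.size, replace=True).mean()
                 for _ in range(1000)])
blo, bhi = np.percentile(boot, [5, 95])

# Back-transform to the ratio scale for the 80-125% rule
print(np.exp([lo, hi]), np.exp([blo, bhi]))
```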
Optimal message log reclamation for independent checkpointing
NASA Technical Reports Server (NTRS)
Wang, Yi-Min; Fuchs, W. Kent
1993-01-01
Independent (uncoordinated) checkpointing for parallel and distributed systems allows maximum process autonomy but suffers from possible domino effects and the associated storage space overhead for maintaining multiple checkpoints and message logs. In most research on checkpointing and recovery, it was assumed that only the checkpoints and message logs older than the global recovery line can be discarded. It is shown how recovery line transformation and decomposition can be applied to the problem of efficiently identifying all discardable message logs, thereby achieving optimal garbage collection. Communication trace-driven simulation for several parallel programs is used to show the benefits of the proposed algorithm for message log reclamation.
Austin, Peter C; Steyerberg, Ewout W
2012-06-20
When outcomes are binary, the c-statistic (equivalent to the area under the Receiver Operating Characteristic curve) is a standard measure of the predictive accuracy of a logistic regression model. An analytical expression was derived under the assumption that a continuous explanatory variable follows a normal distribution in those with and without the condition. We then conducted an extensive set of Monte Carlo simulations to examine whether the expressions derived under the assumption of binormality allowed for accurate prediction of the empirical c-statistic when the explanatory variable followed a normal distribution in the combined sample of those with and without the condition. We also examined the accuracy of the predicted c-statistic when the explanatory variable followed a gamma, log-normal or uniform distribution in the combined sample of those with and without the condition. Under the assumption of binormality with equality of variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the product of the standard deviation of the normal components (reflecting more heterogeneity) and the log-odds ratio (reflecting larger effects). Under the assumption of binormality with unequal variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the standardized difference of the explanatory variable in those with and without the condition. In our Monte Carlo simulations, we found that these expressions allowed for reasonably accurate prediction of the empirical c-statistic when the distribution of the explanatory variable was normal, gamma, log-normal, or uniform in the entire sample of those with and without the condition. The discriminative ability of a continuous explanatory variable cannot be judged by its odds ratio alone, but always needs to be considered in relation to the heterogeneity of the population.
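A hedged Monte Carlo check of the equal-variance binormal case: with a unit-variance normal explanatory variable in both groups and mean difference δ, the c-statistic should equal Φ(δ/√2).

```python
# Hedged check: empirical c-statistic (via the Mann-Whitney U) against
# the binormal prediction Phi(delta / sqrt(2)). Parameters invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
delta = 1.0
x0 = rng.normal(0.0, 1.0, 20000)     # explanatory variable, without condition
x1 = rng.normal(delta, 1.0, 20000)   # with condition

# c-statistic = P(X1 > X0), estimated as U / (n0 * n1)
u = stats.mannwhitneyu(x1, x0, alternative="greater").statistic
c_emp = u / (x0.size * x1.size)
c_theory = stats.norm.cdf(delta / np.sqrt(2.0))
print(c_emp, c_theory)               # the two should agree closely
```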
Log-amplitude statistics for Beck-Cohen superstatistics
NASA Astrophysics Data System (ADS)
Kiyono, Ken; Konno, Hidetoshi
2013-05-01
As a possible generalization of Beck-Cohen superstatistical processes, we study non-Gaussian processes with temporal heterogeneity of local variance. To characterize the variance heterogeneity, we define log-amplitude cumulants and log-amplitude autocovariance and derive closed-form expressions of the log-amplitude cumulants for χ2, inverse χ2, and log-normal superstatistical distributions. Furthermore, we show that χ2 and inverse χ2 superstatistics with degree 2 are closely related to an extreme value distribution, called the Gumbel distribution. In these cases, the corresponding superstatistical distributions result in the q-Gaussian distribution with q=5/3 and the bilateral exponential distribution, respectively. Thus, our finding provides a hypothesis that the asymptotic appearance of these two special distributions may be explained by a link with the asymptotic limit distributions involving extreme values. In addition, as an application of our approach, we demonstrated that non-Gaussian fluctuations observed in a stock index futures market can be well approximated by the χ2 superstatistical distribution with degree 2.
NASA Astrophysics Data System (ADS)
Matsubara, Yoshitsugu; Musashi, Yasuo
2017-12-01
The purpose of this study is to explain fluctuations in email size. We have previously investigated the long-term correlations between email send requests and data flow in the system log of the primary staff email server at a university campus, finding that email size frequency follows a power-law distribution with two inflection points, and that the power-law property weakens the correlation of the data flow. However, the mechanism underlying this fluctuation is not completely understood. We collected new log data from both staff and students over six academic years and analyzed the frequency distribution thereof, focusing on the type of content contained in the emails. Furthermore, we obtained permission to collect "Content-Type" log data from the email headers. We therefore collected the staff log data from May 1, 2015 to July 31, 2015, creating two subdistributions. In this paper, we propose a model to explain these subdistributions, which follow log-normal-like distributions. In the log-normal-like model, email senders, consciously or unconsciously, regulate the size of new email sentences according to a normal distribution. The fitting of the model is acceptable for these subdistributions, and the model demonstrates power-law properties for large email sizes. An analysis of the length of new email sentences would be required for further discussion of our model; however, to protect user privacy at the participating organization, we left this analysis for future work. This study provides new knowledge on the properties of email sizes, and our model is expected to contribute to the decision on whether to establish upper size limits in the design of email services.
Size distribution of submarine landslides along the U.S. Atlantic margin
Chaytor, J.D.; ten Brink, Uri S.; Solow, A.R.; Andrews, B.D.
2009-01-01
Assessment of the probability for destructive landslide-generated tsunamis depends on the knowledge of the number, size, and frequency of large submarine landslides. This paper investigates the size distribution of submarine landslides along the U.S. Atlantic continental slope and rise using the size of the landslide source regions (landslide failure scars). Landslide scars along the margin identified in a detailed bathymetric Digital Elevation Model (DEM) have areas that range between 0.89 km² and 2410 km² and volumes between 0.002 km³ and 179 km³. The area to volume relationship of these failure scars is almost linear (inverse power-law exponent close to 1), suggesting a fairly uniform failure thickness of a few tens of meters in each event, with only rare, deep excavating landslides. The cumulative volume distribution of the failure scars is very well described by a log-normal distribution rather than by an inverse power-law, the most commonly used distribution for both subaerial and submarine landslides. A log-normal distribution centered on a volume of 0.86 km³ may indicate that landslides preferentially mobilize a moderate amount of material (on the order of 1 km³), rather than large landslides or very small ones. Alternatively, the log-normal distribution may reflect an inverse power-law distribution modified by a size-dependent probability of observing landslide scars in the bathymetry data. If the latter is the case, an inverse power-law distribution with an exponent of 1.3 ± 0.3, modified by a size-dependent conditional probability of identifying more failure scars with increasing landslide size, fits the observed size distribution. This exponent value is similar to the predicted exponent of 1.2 ± 0.3 for subaerial landslides in unconsolidated material. Both the log-normal and modified inverse power-law distributions of the observed failure scar volumes suggest that large landslides, which have the greatest potential to generate damaging tsunamis, occur infrequently along the margin.
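A hedged sketch of the underlying distribution comparison: fit a log-normal and a Pareto (inverse power-law) to synthetic scar volumes by maximum likelihood and compare log-likelihoods. A real analysis would also model the size-dependent observation probability discussed above.

```python
# Hedged sketch: log-normal vs. Pareto (power-law) fits to landslide
# scar volumes, compared by maximized log-likelihood. Data synthetic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
vol = rng.lognormal(np.log(0.86), 1.5, 200)   # km^3, hypothetical scars

# Log-normal MLE (location fixed at 0)
shape, loc, scale = stats.lognorm.fit(vol, floc=0)
ll_ln = stats.lognorm.logpdf(vol, shape, loc, scale).sum()

# Pareto MLE above the smallest observed volume (Hill estimator)
vmin = vol.min()
alpha = vol.size / np.log(vol / vmin).sum()
ll_pl = (np.log(alpha / vmin) - (alpha + 1) * np.log(vol / vmin)).sum()

print(ll_ln, ll_pl)   # higher log-likelihood indicates the better fit
```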
Stochastic modelling of non-stationary financial assets
NASA Astrophysics Data System (ADS)
Estevens, Joana; Rocha, Paulo; Boto, João P.; Lind, Pedro G.
2017-11-01
We model non-stationary volume-price distributions with a log-normal distribution and collect the time series of its two parameters. These time series are shown to be stationary and Markov-like, and consequently can be modelled with Langevin equations, which are derived directly from their series of values. Having the evolution equations of the log-normal parameters, we reconstruct the statistics of the first moments of the volume-price distributions, which fit the empirical data well. Finally, the proposed framework is general enough to study other non-stationary stochastic variables in other research fields, namely biology, medicine, and geology.
Modeling error distributions of growth curve models through Bayesian methods.
Zhang, Zhiyong
2016-06-01
Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed, although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems from blindly assuming normality, a general Bayesian framework is proposed to flexibly model normal and non-normal data through the explicit specification of the error distributions. A simulation study shows that when the distribution of the error is correctly specified, one can avoid the loss in efficiency of standard error estimates. A real example on the analysis of mathematical ability growth data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99, is used to show the application of the proposed methods. Instructions and code on how to conduct growth curve analysis with both normal and non-normal error distributions using the MCMC procedure of SAS are provided.
NASA Astrophysics Data System (ADS)
Faruk, Alfensi
2018-03-01
Survival analysis is a branch of statistics focused on the analysis of time-to-event data. In multivariate survival analysis, the proportional hazards (PH) model is the most popular approach for analyzing the effects of several covariates on survival time. However, the assumption of proportional (constant-ratio) hazards in the PH model is not always satisfied by the data. Violation of the PH assumption leads to misinterpretation of the estimation results and reduces the power of the related statistical tests. The accelerated failure time (AFT) models, on the other hand, do not assume proportional hazards and can be used as alternatives to the PH model when its assumption is violated. The objective of this research was to compare the performance of the PH model and the AFT models in analyzing the significant factors affecting the first birth interval (FBI) data in Indonesia. The discussion was limited to three AFT models, based on the Weibull, exponential, and log-normal distributions. Analysis using a graphical approach and a statistical test showed that non-proportional hazards exist in the FBI data set. Based on the Akaike information criterion (AIC), the log-normal AFT model was the most appropriate among the considered models. Results of the best-fitting model (the log-normal AFT model) showed that covariates such as the woman's educational level, the husband's educational level, contraceptive knowledge, access to mass media, wealth index, and employment status were among the factors affecting the FBI in Indonesia.
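A hedged sketch of this model comparison using the lifelines package, with a synthetic stand-in for the FBI data and a single invented covariate; the AIC comparison mirrors the selection step described above.

```python
# Hedged sketch: log-normal vs. Weibull AFT fits compared by AIC, using
# lifelines on synthetic, partially right-censored data.
import numpy as np
import pandas as pd
from lifelines import LogNormalAFTFitter, WeibullAFTFitter

rng = np.random.default_rng(11)
n = 300
df = pd.DataFrame({
    "education": rng.integers(0, 2, n),        # assumed binary covariate
    "duration": rng.lognormal(3.0, 0.6, n),    # interval length (months)
    "observed": rng.random(n) < 0.8,           # ~20% right-censoring
})

ln_fit = LogNormalAFTFitter().fit(df, duration_col="duration", event_col="observed")
wb_fit = WeibullAFTFitter().fit(df, duration_col="duration", event_col="observed")
print(ln_fit.AIC_, wb_fit.AIC_)  # lower AIC indicates the preferred model
```

With log-normally generated durations, the log-normal AFT should win the AIC comparison, echoing the paper's result on the real data.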
Alternate methods for FAAT S-curve generation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kaufman, A.M.
The FAAT (Foreign Asset Assessment Team) assessment methodology attempts to derive a probability of effect as a function of incident field strength. The probability of effect is the likelihood that the stress put on a system exceeds its strength. In the FAAT methodology, both the stress and strength are random variables whose statistical properties are estimated by experts. Each random variable has two components of uncertainty: systematic and random. The systematic uncertainty drives the confidence bounds in the FAAT assessment. Its variance can be reduced by improved information. The variance of the random uncertainty is not reducible. The FAAT methodology uses an assessment code called ARES to generate probability-of-effect curves (S-curves) at various confidence levels. ARES assumes log-normal distributions for all random variables. The S-curves themselves are log-normal cumulants associated with the random portion of the uncertainty. The placement of the S-curves depends on the confidence bounds. The systematic uncertainty in both stress and strength is usually described by a mode and an upper and lower variance. Such a description is not consistent with the log-normal assumption of ARES, and an unsatisfactory work-around solution is used to obtain the required placement of the S-curves at each confidence level. We have looked into this situation and have found that significant errors are introduced by this work-around. These errors are at least several dB-W/cm² at all confidence levels, but they are especially bad in the estimate of the median. In this paper, we suggest two alternate solutions for the placement of S-curves. To compare these calculational methods, we have tabulated the common combinations of upper and lower variances and generated the relevant S-curve offsets from the mode difference of stress and strength.
The Adaptation of the Moth Pheromone Receptor Neuron to its Natural Stimulus
NASA Astrophysics Data System (ADS)
Kostal, Lubomir; Lansky, Petr; Rospars, Jean-Pierre
2008-07-01
We analyze the first phase of information transduction in a model of the olfactory receptor neuron of the male moth Antheraea polyphemus. We predict the stimulus characteristics that enable the system to perform optimally, i.e., to transfer as much information as possible. Few a priori constraints on the nature of the stimulus and the stimulus-to-signal transduction are assumed. The results are given in terms of stimulus distributions and intermittency factors, which makes direct comparison with experimental data possible. The optimal stimulus is approximately described by an exponential or log-normal probability density function, which is in agreement with experiment, and the predicted intermittency factors fall within the lowest range of observed values. The results are discussed with respect to electroantennogram measurements and behavioral observations.
Estimation of Renyi exponents in random cascades
Troutman, Brent M.; Vecchia, Aldo V.
1999-01-01
We consider statistical estimation of the Rényi exponent τ(h), which characterizes the scaling behaviour of a singular measure μ defined on a subset of R^d. The Rényi exponent is defined to be lim_{δ→0} [log M_δ(h) / (−log δ)], assuming that this limit exists, where M_δ(h) = Σ_i μ^h(Δ_i) and, for δ > 0, {Δ_i} are the cubes of a δ-coordinate mesh that intersect the support of μ. In particular, we demonstrate asymptotic normality of the least-squares estimator of τ(h) when the measure μ is generated by a particular class of multiplicative random cascades, a result which allows construction of interval estimates and application of hypothesis tests for this scaling exponent. Simulation results illustrating this asymptotic normality are presented.
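A hedged sketch of the least-squares estimator: build a simple conservative binary cascade (our stand-in, not necessarily the authors' cascade class), coarse-grain it to δ-meshes, and regress log M_δ(h) on −log δ. Under the definition above the slope for h > 1 is negative.

```python
# Hedged sketch: least-squares estimate of the Renyi exponent tau(h)
# from the scaling of M_delta(h) across mesh sizes, for a toy cascade.
import numpy as np

rng = np.random.default_rng(13)
levels, h = 14, 2.0

mass = np.ones(1)
for _ in range(levels):                        # conservative binary cascade on [0,1]
    w = rng.beta(2.0, 2.0, size=mass.size)     # random split proportions
    mass = np.column_stack([mass * w, mass * (1 - w)]).ravel()

xs, ys = [], []
for k in range(4, levels + 1):
    delta = 2.0 ** (-k)
    mu_i = mass.reshape(2**k, -1).sum(axis=1)  # measure of delta-mesh cells
    xs.append(-np.log(delta))
    ys.append(np.log(np.sum(mu_i ** h)))

tau_hat = np.polyfit(xs, ys, 1)[0]             # slope = estimate of tau(h)
print(tau_hat)                                 # negative for h > 1
```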
Medium Access Control for Opportunistic Concurrent Transmissions under Shadowing Channels
Son, In Keun; Mao, Shiwen; Hur, Seung Min
2009-01-01
We study the problem of how to alleviate the exposed terminal effect in multi-hop wireless networks in the presence of log-normal shadowing channels. Assuming node location information, we propose an extension of the IEEE 802.11 MAC protocol that schedules concurrent transmissions in the presence of log-normal shadowing, thus mitigating the exposed terminal problem and improving network throughput and delay performance. We observe considerable improvements in throughput and delay achieved over the IEEE 802.11 MAC under various network topologies and channel conditions in ns-2 simulations, which justify the importance of considering channel randomness in MAC protocol design for multi-hop wireless networks.
Determining prescription durations based on the parametric waiting time distribution.
Støvring, Henrik; Pottegård, Anton; Hallas, Jesper
2016-12-01
The purpose of the study is to develop a method to estimate the duration of single prescriptions in pharmacoepidemiological studies when the single prescription duration is not available. We developed an estimation algorithm based on maximum likelihood estimation of a parametric two-component mixture model for the waiting time distribution (WTD). The distribution component for prevalent users estimates the forward recurrence density (FRD), which is related to the distribution of time between subsequent prescription redemptions, the inter-arrival density (IAD), for users in continued treatment. We exploited this to estimate percentiles of the IAD by inversion of the estimated FRD and defined the duration of a prescription as the time within which 80% of current users will have presented themselves again. Statistical properties were examined in simulation studies, and the method was applied to empirical data for four model drugs: non-steroidal anti-inflammatory drugs (NSAIDs), warfarin, bendroflumethiazide, and levothyroxine. Simulation studies found negligible bias when the data-generating model for the IAD coincided with the FRD used in the WTD estimation (Log-Normal). When the IAD consisted of a mixture of two Log-Normal distributions, but was analyzed with a single Log-Normal distribution, relative bias did not exceed 9%. Using a Log-Normal FRD, we estimated prescription durations of 117, 91, 137, and 118 days for NSAIDs, warfarin, bendroflumethiazide, and levothyroxine, respectively. Similar results were found with a Weibull FRD. The algorithm allows valid estimation of single prescription durations, especially when the WTD reliably separates current users from incident users, and may replace ad-hoc decision rules in automated implementations.
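The final step is straightforward once the IAD parameters are in hand; a hedged sketch with invented log-normal IAD parameters (in practice these would come from the WTD mixture fit):

```python
# Hedged sketch: prescription duration as the 80th percentile of a
# log-normal inter-arrival density. Parameters are invented, standing
# in for estimates recovered from the WTD mixture fit.
import numpy as np
from scipy import stats

m, s = np.log(90.0), 0.5        # hypothetical log-scale mean/SD of the IAD (days)
duration = stats.lognorm.ppf(0.80, s, scale=np.exp(m))
print(round(duration, 1))       # days within which 80% of current users return
```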
NASA Astrophysics Data System (ADS)
Berthet, Gwenaël; Renard, Jean-Baptiste; Brogniez, Colette; Robert, Claude; Chartier, Michel; Pirre, Michel
2002-12-01
Aerosol extinction coefficients have been derived in the 375-700-nm spectral domain from measurements in the stratosphere since 1992, at night, at mid- and high latitudes from 15 to 40 km, by two balloonborne spectrometers, Absorption par les Minoritaires Ozone et NOx (AMON) and Spectroscopie d'Absorption Lunaire pour l'Observation des Minoritaires Ozone et NOx (SALOMON). Log-normal size distributions associated with the Mie-computed extinction spectra that best fit the measurements permit calculation of integrated properties of the distributions. Although measured extinction spectra corresponding to background aerosols can be reproduced by the Mie scattering model using monomodal log-normal size distributions, each flight reveals large discrepancies between measurement and theory at several altitudes. The agreement between measured and Mie-calculated extinction spectra is significantly improved by use of bimodal log-normal distributions. Nevertheless, some of the measured extinction shapes cannot be correctly reproduced by either monomodal or bimodal distributions, in particular for the 26 February 1997 AMON flight, which exhibited spectral behavior attributed to particles from a polar stratospheric cloud event.
Distribution of Plasmoids in Post-Coronal Mass Ejection Current Sheets
NASA Astrophysics Data System (ADS)
Bhattacharjee, A.; Guo, L.; Huang, Y.
2013-12-01
Recently, the fragmentation of a current sheet in the high-Lundquist-number regime caused by the plasmoid instability has been proposed as a possible mechanism for fast reconnection. In this work, we investigate this scenario by comparing the distribution of plasmoids obtained from Large Angle and Spectrometric Coronagraph (LASCO) observational data of a coronal mass ejection event with a resistive magnetohydrodynamic simulation of a similar event. The LASCO/C2 data are analyzed using visual inspection, whereas the numerical data are analyzed using both visual inspection and a more precise topological method. Contrasting the observational data with numerical data analyzed with both methods, we identify a major limitation of the visual inspection method, due to the difficulty in resolving smaller plasmoids. This result raises questions about reports of log-normal distributions of plasmoids and other coherent features in the recent literature. Based on nonlinear scaling relations of the plasmoid instability, we infer a lower bound on the current sheet width, assuming the underlying mechanism of current sheet broadening is resistive diffusion.
Photoballistics of volcanic jet activity at Stromboli, Italy
NASA Technical Reports Server (NTRS)
Chouet, B.; Hamisevicz, N.; Mcgetchin, T. R.
1974-01-01
Two night eruptions of the volcano Stromboli were studied through 70-mm photography. Single-camera techniques were used. Particle sphericity, constant velocity in the frame, and radial symmetry were assumed. Properties of the particulate phase found through analysis include: particle size, velocity, total number of particles ejected, angular dispersion and distribution in the jet, time variation of particle size and apparent velocity distribution, averaged volume flux, and kinetic energy carried by the condensed phase. The frequency distributions of particle size and apparent velocities are found to be approximately log normal. The properties of the gas phase were inferred from the fact that it was the transporting medium for the condensed phase. Gas velocity and time variation, volume flux of gas, dynamic pressure, mass erupted, and density were estimated. A CO2-H2O mixture is possible for the observed eruptions. The flow was subsonic. Velocity variations may be explained by an organ pipe resonance. Particle collimation may be produced by a Magnus effect.
Bowker, Matthew A.; Maestre, Fernando T.
2012-01-01
Dryland vegetation is inherently patchy. This patchiness goes on to impact ecology, hydrology, and biogeochemistry. Recently, researchers have proposed that dryland vegetation patch sizes follow a power law which is due to local plant facilitation. It is unknown what patch size distribution prevails when competition predominates over facilitation, or if such a pattern could be used to detect competition. We investigated this question in an alternative vegetation type, mosses and lichens of biological soil crusts, which exhibit a smaller scale patch-interpatch configuration. This micro-vegetation is characterized by competition for space. We proposed that multiplicative effects of genetics, environment and competition should result in a log-normal patch size distribution. When testing the prevalence of log-normal versus power law patch size distributions, we found that the log-normal was the better distribution in 53% of cases and a reasonable fit in 83%. In contrast, the power law was better in 39% of cases, and in 8% of instances both distributions fit equally well. We further hypothesized that the log-normal distribution parameters would be predictably influenced by competition strength. There was qualitative agreement between one of the distribution's parameters (μ) and a novel intransitive (lacking a 'best' competitor) competition index, suggesting that as intransitivity increases, patch sizes decrease. The correlation of μ with other competition indicators based on spatial segregation of species (the C-score) depended on aridity. In less arid sites, μ was negatively correlated with the C-score (suggesting smaller patches under stronger competition), while positive correlations (suggesting larger patches under stronger competition) were observed at more arid sites. We propose that this is due to an increasing prevalence of competition transitivity as aridity increases. These findings broaden the emerging theory surrounding dryland patch size distributions and, with refinement, may help us infer cryptic ecological processes from easily observed spatial patterns in the field.
Assessment of the hygienic performances of hamburger patty production processes.
Gill, C O; Rahn, K; Sloan, K; McMullen, L M
1997-05-20
The hygienic conditions of hamburger patties collected from three patty manufacturing plants and six retail outlets were examined. At each manufacturing plant, a sample from newly formed, chilled patties and one from frozen patties were collected from each of 25 batches of patties selected at random. At the retail outlets, 25 samples were collected at random: frozen patties at three outlets, chilled patties at two, and both frozen and chilled patties at one. Each sample consisted of 30 g of meat obtained from five or six patties. Total aerobic, coliform and Escherichia coli counts per gram were enumerated for each sample. The mean log (x) and standard deviation (s) were calculated for the log10 values for each set of 25 counts, on the assumption that the distribution of counts approximated the log-normal. A value for the log10 of the arithmetic mean (log A) was calculated for each set from the values of x and s. A χ² statistic was calculated for each set as a test of the assumption of the log-normal distribution. The χ² statistic was calculable for 32 of the 39 sets. Four of the sets gave χ² values indicative of gross deviation from log normality. On inspection of those sets, distributions obviously differing from the log-normal were apparent in two. Log A values for total, coliform and E. coli counts for chilled patties from manufacturing plants ranged from 4.4 to 5.1, 1.7 to 2.3 and 0.9 to 1.5, respectively. Log A values for frozen patties from manufacturing plants were between < 0.1 and 0.5 log10 units less than the equivalent values for chilled patties. Log A values for total, coliform and E. coli counts for frozen patties on retail sale ranged from 3.8 to 8.5, < 0.5 to 3.6 and < 0 to 1.9, respectively. The equivalent ranges for chilled patties on retail sale were 4.8 to 8.5, 1.8 to 3.7 and 1.4 to 2.7, respectively. The findings indicate that the general hygienic condition of hamburger patties could be improved by manufacturing them only from beef of superior hygienic quality, and by better management of chilled patties at retail outlets.
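For a log-normally distributed count, the log of the arithmetic mean follows directly from the mean and standard deviation of the log10 counts; a minimal sketch of that log A calculation, with illustrative values for x and s:

```python
# Hedged sketch: if log10(count) ~ N(x, s^2), the arithmetic mean A satisfies
# log10(A) = x + (ln 10 / 2) * s^2. The values below are illustrative.
import math

def log_arithmetic_mean(x_mean_log10, s_log10):
    """log10 of the arithmetic mean of a log10-normal distribution."""
    return x_mean_log10 + (math.log(10) / 2.0) * s_log10 ** 2

x, s = 4.2, 0.6   # hypothetical mean and SD of log10 counts for one set
print(f"log A = {log_arithmetic_mean(x, s):.2f}")   # ~4.61
```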
NASA Technical Reports Server (NTRS)
Smith, O. E.
1976-01-01
Techniques are presented for deriving several statistical wind models from the properties of the multivariate normal probability distribution function. If the winds are assumed to be bivariate normally distributed, then (1) the wind components and conditional wind components are univariate normally distributed, (2) the wind speed is Rayleigh distributed, (3) the conditional distribution of wind speed given a wind direction is Rayleigh distributed, and (4) the frequency of wind direction can be derived. All of these distributions are derived from the five sample parameters of wind for the bivariate normal distribution. By further assuming that the winds at two altitudes are quadrivariate normally distributed, the vector wind shear is bivariate normally distributed and the modulus of the vector wind shear is Rayleigh distributed. The conditional probability of wind component shears given a wind component is normally distributed. Examples of these and other properties of the multivariate normal probability distribution function, as applied to wind data samples from Cape Kennedy, Florida, and Vandenberg AFB, California, are given. A technique to develop a synthetic vector wind profile model of interest for aerospace vehicle applications is presented.
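A small simulation can illustrate the second property: when the components (u, v) are zero-mean bivariate normal with equal variances and no correlation, the speed is Rayleigh distributed. This sketch uses assumed parameters, not the Cape Kennedy or Vandenberg data:

```python
# Hedged sketch: speed of zero-mean isotropic bivariate normal winds is Rayleigh.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sigma = 5.0                                  # m/s, common component SD (assumed)
u = rng.normal(0.0, sigma, 100_000)
v = rng.normal(0.0, sigma, 100_000)
speed = np.hypot(u, v)

# Compare the empirical speeds with the Rayleigh(scale=sigma) distribution
ks = stats.kstest(speed, 'rayleigh', args=(0, sigma))
print(f"KS statistic vs Rayleigh: {ks.statistic:.4f}")
```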
Krishnamoorthy, K; Oral, Evrim
2017-12-01
A standardized likelihood ratio test (SLRT) for testing the equality of means of several log-normal distributions is proposed. The properties of the SLRT, an available modified likelihood ratio test (MLRT), and a generalized variable (GV) test are evaluated by Monte Carlo simulation and compared. Evaluation studies indicate that the SLRT is accurate even for small samples, whereas the MLRT can be quite liberal for some parameter values, and the GV test is in general conservative and less powerful than the SLRT. Furthermore, a closed-form approximate confidence interval for the common mean of several log-normal distributions is developed using the method of variance estimate recovery, and compared with the generalized confidence interval with respect to coverage probabilities and precision. Simulation studies indicate that the proposed confidence interval is accurate and better than the generalized confidence interval in terms of coverage probabilities. The methods are illustrated using two examples.
Explorations in statistics: the log transformation.
Curran-Everett, Douglas
2018-06-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This thirteenth installment of Explorations in Statistics explores the log transformation, an established technique that rescales the actual observations from an experiment so that the assumptions of some statistical analysis are better met. A general assumption in statistics is that the variability of some response Y is homogeneous across groups or across some predictor variable X. If the variability-the standard deviation-varies in rough proportion to the mean value of Y, a log transformation can equalize the standard deviations. Moreover, if the actual observations from an experiment conform to a skewed distribution, then a log transformation can make the theoretical distribution of the sample mean more consistent with a normal distribution. This is important: the results of a one-sample t test are meaningful only if the theoretical distribution of the sample mean is roughly normal. If we log-transform our observations, then we want to confirm the transformation was useful. We can do this if we use the Box-Cox method, if we bootstrap the sample mean and the statistic t itself, and if we assess the residual plots from the statistical model of the actual and transformed sample observations.
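A minimal sketch of this workflow, log-transforming skewed observations and checking the transformation with scipy's Box-Cox estimate; the synthetic sample stands in for real experimental data:

```python
# Hedged sketch: log-transform skewed observations and check whether a log
# transform (Box-Cox lambda near 0) is supported by the data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
y = rng.lognormal(mean=2.0, sigma=0.5, size=60)   # skewed responses, synthetic

log_y = np.log(y)                                  # the log transformation
_, lam = stats.boxcox(y)                           # MLE of the Box-Cox lambda
print(f"Box-Cox lambda = {lam:.2f} (near 0 supports a log transform)")
print(f"skewness raw {stats.skew(y):.2f} vs logged {stats.skew(log_y):.2f}")
```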
Stochastic Growth Theory of Spatially-Averaged Distributions of Langmuir Fields in Earth's Foreshock
NASA Technical Reports Server (NTRS)
Boshuizen, Christopher R.; Cairns, Iver H.; Robinson, P. A.
2001-01-01
Langmuir-like waves in the foreshock of Earth are characteristically bursty and irregular, and are the subject of a number of recent studies. Averaged over the foreshock, the probability distribution P̄(log E) of the wave field E is observed to be a power law, with the bar denoting this averaging over position. In this paper it is shown that stochastic growth theory (SGT) can explain a power-law spatially-averaged distribution P̄(log E) when the observed power-law variations of the mean and standard deviation of log E with position are combined with the log-normal statistics predicted by SGT at each location.
Proposal of a method for evaluating tsunami risk using response-surface methodology
NASA Astrophysics Data System (ADS)
Fukutani, Y.
2017-12-01
Information on probabilistic tsunami inundation hazards is needed to define and evaluate tsunami risk. Several methods for calculating these hazards have been proposed (e.g., Løvholt et al. (2012), Thio (2012), Fukutani et al. (2014), Goda et al. (2015)). However, these methods are computationally expensive, since they require multiple tsunami numerical simulations, and they therefore lack versatility. In this study, we propose a simpler method for tsunami risk evaluation using response-surface methodology. Kotani et al. (2016) proposed an evaluation method for the probabilistic distribution of tsunami wave height using a response-surface methodology. We expanded their study and developed a probabilistic distribution of tsunami inundation depth. We set the depth (x1) and the slip (x2) of an earthquake fault as explanatory variables and the tsunami inundation depth (y) as the objective variable. Tsunami risk can then be evaluated by conducting a Monte Carlo simulation, assuming that the generation probability of an earthquake follows a Poisson distribution, that the probability distribution of tsunami inundation depth follows the distribution derived from the response surface, and that the damage probability of a target follows a log-normal distribution. We applied the proposed method to a wood building located on the coast of Tokyo Bay. We implemented a regression analysis based on the results of 25 tsunami numerical calculations and developed a response surface, defined as y = a·x1 + b·x2 + c with a = 0.2615, b = 3.1763, and c = -1.1802. We assumed appropriate probabilistic distributions for earthquake generation, inundation height, and vulnerability. Based on these probabilistic distributions, we conducted Monte Carlo simulations over 1,000,000 years. We found that the expected damage probability of the studied wood building is 22.5%, assuming that an earthquake occurs. The proposed method is therefore a useful and simple way to evaluate tsunami risk using a response surface and Monte Carlo simulation without conducting multiple tsunami numerical simulations.
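A minimal sketch of the Monte Carlo step, using the published response-surface coefficients; the fault-parameter ranges and the fragility-curve median and log-standard deviation are illustrative assumptions, not the paper's inputs:

```python
# Hedged sketch: response-surface inundation depth + log-normal fragility.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
a, b, c = 0.2615, 3.1763, -1.1802       # response surface from the abstract

# Sample fault depth (x1) and slip (x2); these ranges are assumptions.
n = 100_000
x1 = rng.uniform(5.0, 40.0, n)
x2 = rng.uniform(0.5, 8.0, n)
depth = np.clip(a * x1 + b * x2 + c, 1e-9, None)   # inundation depth (m)

# Log-normal fragility curve for the target (assumed median 5 m, log-SD 0.5)
median, beta = 5.0, 0.5
p_damage = stats.norm.cdf(np.log(depth / median) / beta)
print(f"expected damage probability given an earthquake: {p_damage.mean():.3f}")
```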
2012-01-01
Background: When outcomes are binary, the c-statistic (equivalent to the area under the Receiver Operating Characteristic curve) is a standard measure of the predictive accuracy of a logistic regression model. Methods: An analytical expression was derived under the assumption that a continuous explanatory variable follows a normal distribution in those with and without the condition. We then conducted an extensive set of Monte Carlo simulations to examine whether the expressions derived under the assumption of binormality allowed for accurate prediction of the empirical c-statistic when the explanatory variable followed a normal distribution in the combined sample of those with and without the condition. We also examined the accuracy of the predicted c-statistic when the explanatory variable followed a gamma, log-normal or uniform distribution in the combined sample of those with and without the condition. Results: Under the assumption of binormality with equality of variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the product of the standard deviation of the normal components (reflecting more heterogeneity) and the log-odds ratio (reflecting larger effects). Under the assumption of binormality with unequal variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the standardized difference of the explanatory variable in those with and without the condition. In our Monte Carlo simulations, we found that these expressions allowed for reasonably accurate prediction of the empirical c-statistic when the distribution of the explanatory variable was normal, gamma, log-normal, and uniform in the entire sample of those with and without the condition. Conclusions: The discriminative ability of a continuous explanatory variable cannot be judged by its odds ratio alone, but always needs to be considered in relation to the heterogeneity of the population.
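Under binormality with unequal variances, the c-statistic reduces to Φ of the standardized difference; a minimal sketch checking that expression against an empirical c-statistic on synthetic data (the group means and standard deviations are arbitrary):

```python
# Hedged sketch: binormal c-statistic c = Phi((mu1 - mu0) / sqrt(s0^2 + s1^2)).
import numpy as np
from scipy import stats

mu0, mu1, s0, s1 = 0.0, 1.0, 1.0, 1.5
c_analytic = stats.norm.cdf((mu1 - mu0) / np.hypot(s0, s1))

rng = np.random.default_rng(4)
x0 = rng.normal(mu0, s0, 2000)       # without the condition
x1 = rng.normal(mu1, s1, 2000)       # with the condition
# Empirical c-statistic = P(X1 > X0), estimated over all pairs
c_empirical = (x1[:, None] > x0[None, :]).mean()
print(f"analytic c = {c_analytic:.3f}, empirical c = {c_empirical:.3f}")
```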
Best Statistical Distribution of flood variables for Johor River in Malaysia
NASA Astrophysics Data System (ADS)
Salarpour Goodarzi, M.; Yusop, Z.; Yusof, F.
2012-12-01
A complex flood event is always characterized by a few characteristics, such as flood peak, flood volume, and flood duration, which might be mutually correlated. This study explored the statistical distribution of peakflow, flood duration and flood volume at the Rantau Panjang gauging station on the Johor River in Malaysia. Hourly data were recorded for 45 years. The data were analysed based on the water year (July-June). Five distributions, namely Log Normal, Generalized Pareto, Log Pearson, Normal and Generalized Extreme Value (GEV), were used to model the distribution of all three variables. The Anderson-Darling and Kolmogorov-Smirnov goodness-of-fit tests were used to evaluate the best fit. Goodness-of-fit tests at the 5% level of significance indicate that all the models can be used to model the distribution of peakflow, flood duration and flood volume. However, the Generalized Pareto distribution was found to be the most suitable model when tested with the Anderson-Darling test, while the Kolmogorov-Smirnov test suggested that GEV is the best for peakflow. The results of this research can be used to improve flood frequency analysis. (Figure: comparison between the Generalized Extreme Value, Generalized Pareto and Log Pearson distributions in the cumulative distribution function of peakflow.)
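A minimal sketch of this kind of fitting exercise with scipy, ranking candidate distributions by the Kolmogorov-Smirnov statistic; the synthetic record and the candidate set are illustrative (scipy has no Log Pearson implementation, so it is omitted here):

```python
# Hedged sketch: fit candidate distributions to peakflow and rank by KS statistic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
peakflow = rng.lognormal(mean=5.0, sigma=0.6, size=45)   # 45 water years, synthetic

candidates = {
    "log-normal": stats.lognorm,
    "generalized Pareto": stats.genpareto,
    "GEV": stats.genextreme,
    "normal": stats.norm,
}
for name, dist in candidates.items():
    params = dist.fit(peakflow)                          # maximum likelihood fit
    ks = stats.kstest(peakflow, dist.cdf, args=params)
    print(f"{name:>20}: KS = {ks.statistic:.3f}")
```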
Threshold detection in an on-off binary communications channel with atmospheric scintillation
NASA Technical Reports Server (NTRS)
Webb, W. E.; Marino, J. T., Jr.
1974-01-01
The optimum detection threshold in an on-off binary optical communications system operating in the presence of atmospheric turbulence was investigated, assuming a Poisson detection process and log-normal scintillation. The dependence of the probability of bit error on the log-amplitude variance and received signal strength was analyzed, and semi-empirical relationships to predict the optimum detection threshold were derived. On the basis of this analysis, a piecewise linear model for an adaptive threshold detection system is presented. Bit error probabilities for non-optimum threshold detection systems were also investigated.
Threshold detection in an on-off binary communications channel with atmospheric scintillation
NASA Technical Reports Server (NTRS)
Webb, W. E.
1975-01-01
The optimum detection threshold in an on-off binary optical communications system operating in the presence of atmospheric turbulence was investigated, assuming a Poisson detection process and log-normal scintillation. The dependence of the probability of bit error on the log-amplitude variance and received signal strength was analyzed, and semi-empirical relationships to predict the optimum detection threshold were derived. On the basis of this analysis, a piecewise linear model for an adaptive threshold detection system is presented. The bit error probabilities for non-optimum threshold detection systems were also investigated.
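A minimal sketch of the threshold optimisation described in these two reports: average the "on" error over log-normal intensity fluctuations and scan Poisson count thresholds. The background and signal count levels and the log-amplitude variance are assumed values, not the reports' operating points:

```python
# Hedged sketch: optimum count threshold for on-off keying with Poisson
# detection and log-normal scintillation.
import numpy as np
from scipy import stats

n_bg = 2.0          # mean background count in an "off" slot (assumed)
n_sig = 20.0        # mean signal count at unit intensity (assumed)
sigma_chi = 0.3     # log-amplitude standard deviation (assumed)

rng = np.random.default_rng(6)
# Intensity I = exp(2*chi) with chi ~ N(-sigma^2, sigma^2) so that E[I] = 1
chi = rng.normal(-sigma_chi**2, sigma_chi, 20_000)
intensity = np.exp(2.0 * chi)

best = min(
    (0.5 * (stats.poisson.sf(T - 1, n_bg)                       # off: N >= T
            + stats.poisson.cdf(T - 1, n_bg + n_sig * intensity).mean()), T)
    for T in range(1, 40)
)
print(f"optimum threshold T = {best[1]}, BER = {best[0]:.2e}")
```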
VizieR Online Data Catalog: Double stars with wide separations in the AGK3 (Halbwachs+, 2016)
NASA Astrophysics Data System (ADS)
Halbwachs, J. L.; Mayor, M.; Udry, S.
2016-10-01
A large list of common proper motion stars selected from the third Astronomischen Gesellschaft Katalog (AGK3) was monitored with the CORAVEL (for COrrelation RAdial VELocities) spectrovelocimeter, in order to prepare a sample of physical binaries with very wide separations. In paper I, 66 stars received special attention, since their radial velocities (RV) seemed to be variable. These stars were monitored over several years in order to derive the elements of their spectroscopic orbits. In addition, 10 of them received accurate RV measurements from the SOPHIE spectrograph of the T193 telescope at the Observatory of Haute-Provence. For deriving the orbital elements of double-lined spectroscopic binaries (SB2s), a new method was applied, which assumed that the RVs of blended measurements are linear combinations of the RVs of the components. 13 SB2 orbits were thus calculated. The orbital elements were eventually obtained for 52 spectroscopic binaries (SBs), two of them making a triple system. 40 SBs received their first orbit and the orbital elements were improved for 10 others. In addition, 11 SBs with very long periods were discovered, for which the orbital parameters were not found. It appeared that HD 153252 has a close companion, which is a candidate brown dwarf with a minimum mass of 50 Jupiter masses. In paper II, 80 wide binaries (WBs) were detected, and 39 optical pairs were identified. Adding CPM stars with separations close enough to be almost certain they are physical, a "bias-controlled" sample of 116 wide binaries was obtained, and used to derive the distribution of separations from 100 to 30,000 au. The distribution obtained does not match the log-constant distribution, but is in agreement with the log-normal distribution. The spectroscopic binaries detected among the WB components were used to derive statistical information about the multiple systems. The close binaries in WBs seem to be similar to those detected in other field stars. As for the WBs, they seem to obey the log-normal distribution of periods. The number of quadruple systems is in agreement with the "no correlation" hypothesis; this indicates that an environment conducive to the formation of WBs does not favor the formation of subsystems with periods shorter than 10 years. (9 data files).
Erosion associated with cable and tractor logging in northwestern California
R. M. Rice; P. A. Datzman
1981-01-01
Erosion and site conditions were measured at 102 logged plots in northwestern California. Erosion averaged 26.8 m³/ha. A log-normal distribution was a better fit to the data. The antilog of the mean of the logarithms of erosion was 3.2 m³/ha. The Coast District Erosion Hazard Rating was a poor predictor of erosion related to logging. In a new equation...
A Model for Hydraulic Properties Based on Angular Pores with Lognormal Size Distribution
NASA Astrophysics Data System (ADS)
Durner, W.; Diamantopoulos, E.
2014-12-01
Soil water retention and unsaturated hydraulic conductivity curves are mandatory for modeling water flow in soils. A common approach is to measure a few points of the water retention curve and to calculate the hydraulic conductivity curve by assuming that the soil can be represented as a bundle of capillary tubes. Both curves are then used to predict water flow at larger spatial scales. However, the predictive power of these curves is often very limited. This is easily illustrated if we measure the soil hydraulic properties (SHPs) for a drainage experiment and then use these properties to predict water flow in the case of imbibition. Further complications arise from the incomplete wetting of water on the solid matrix, which results in finite values of the contact angles between the solid-water-air interfaces. To address these problems we present a physically based model for hysteretic SHPs. The model is based on bundles of angular pores. Hysteresis for individual pores is caused by (i) different snap-off pressures during filling and emptying of single angular pores and (ii) different advancing and receding contact angles for fluids that are not perfectly wetting. We derive a model of hydraulic conductivity as a function of contact angle by assuming flow perpendicular to pore cross sections, and present closed-form expressions for both the sample-scale water retention and hydraulic conductivity functions by assuming a log-normal statistical distribution of pore size. We tested the new model against drainage and imbibition experiments on various sandy materials, conducted with liquids of differing wettability. The model described both imbibition and drainage experiments very well by assuming a unique pore size distribution for the sample and a zero contact angle for the perfectly wetting liquid. Eventually, we see the possibility of relating the particle size distribution to a model which describes the SHPs.
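One way to see the closed-form idea is the Kosugi-type retention curve that follows from a log-normal pore-size distribution; this sketch names that model as an assumption about the general approach, not the authors' exact expressions:

```python
# Hedged sketch: retention curve from a log-normal pore-size distribution
# (Kosugi-type model).
import numpy as np
from scipy.special import erfc

def saturation_lognormal(h, h_m, sigma):
    """Effective saturation at suction h for log-normal pore sizes.

    h    : suction head (positive, same units as h_m)
    h_m  : suction at the median pore radius
    sigma: standard deviation of log pore radius
    """
    return 0.5 * erfc(np.log(h / h_m) / (np.sqrt(2.0) * sigma))

h = np.logspace(0, 4, 5)            # suction heads, cm (illustrative)
print(saturation_lognormal(h, h_m=100.0, sigma=1.2))
```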
Double stars with wide separations in the AGK3 - II. The wide binaries and the multiple systems*
NASA Astrophysics Data System (ADS)
Halbwachs, J.-L.; Mayor, M.; Udry, S.
2017-02-01
A large observation programme was carried out to measure the radial velocities of the components of a selection of common proper motion (CPM) stars to select the physical binaries. 80 wide binaries (WBs) were detected, and 39 optical pairs were identified. By adding CPM stars with separations close enough to be almost certain that they are physical, a bias-controlled sample of 116 WBs was obtained, and used to derive the distribution of separations from 100 to 30 000 au. The distribution obtained does not match the log-constant distribution, but agrees with the log-normal distribution. The spectroscopic binaries detected among the WB components were used to derive statistical information about the multiple systems. The close binaries in WBs seem to be like those detected in other field stars. As for the WBs, they seem to obey the log-normal distribution of periods. The number of quadruple systems agrees with the no correlation hypothesis; this indicates that an environment conducive to the formation of WBs does not favour the formation of subsystems with periods shorter than 10 yr.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1975-01-01
A general iterative procedure is given for determining the consistent maximum likelihood estimates of normal distributions. In addition, a local maximum of the log-likelihood function, Newton's method, a method of scoring, and modifications of these procedures are discussed.
Energetics and Birth Rates of Supernova Remnants in the Large Magellanic Cloud
NASA Astrophysics Data System (ADS)
Leahy, D. A.
2017-03-01
Published X-ray emission properties for a sample of 50 supernova remnants (SNRs) in the Large Magellanic Cloud (LMC) are used as input for SNR evolution modeling calculations. The forward shock emission is modeled to obtain the initial explosion energy, age, and circumstellar medium density for each SNR in the sample. The resulting age distribution yields an SNR birthrate of 1/(500 yr) for the LMC. The explosion energy distribution is well fit by a log-normal distribution, with a most-probable explosion energy of 0.5 × 10⁵¹ erg and a 1σ dispersion of a factor of 3 in energy. The circumstellar medium density distribution is broader than the explosion energy distribution, with a most-probable density of ~0.1 cm⁻³. The shape of the density distribution can be fit with a log-normal distribution, with incompleteness at high density caused by the shorter evolution times of SNRs.
Estimating macroporosity in a forest watershed by use of a tension infiltrometer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Watson, K.W.; Luxmoore, R.J.
The ability to obtain sufficient field hydrologic data at reasonable cost can be an important limiting factor in applying transport models. A procedure is described for using ponded-flow and tension-infiltration measurements to calculate transport parameters in a forest watershed. Thirty infiltration measurements were taken under ponded-flow conditions and at 3, 6, and 15 cm (H₂O) tension. It was assumed from capillarity theory that pores > 0.1-, 0.05-, and 0.02-cm diam., respectively, were excluded from the transport process during the tension infiltration measurements. Under ponded flow, 73% of the flux was conducted through macropores (i.e., pores > 0.1-cm diam.). An estimated 96% of the water flux was transmitted through only 0.32% of the soil volume. In general, the larger the total water flux, the larger the macropore contribution to total water flux. The Shapiro-Wilk normality test indicated that water flux through both matrix pore space and macropores was log-normally distributed in space.
SIMULATED HUMAN ERROR PROBABILITY AND ITS APPLICATION TO DYNAMIC HUMAN FAILURE EVENTS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Herberger, Sarah M.; Boring, Ronald L.
Objectives: Human reliability analysis (HRA) methods typically analyze human failure events (HFEs) at the overall task level. For dynamic HRA, it is important to model human activities at the subtask level. There exists a disconnect between the dynamic subtask level and the static task level that presents issues when modeling dynamic scenarios. For example, the SPAR-H method is typically used to calculate the human error probability (HEP) at the task level. As demonstrated in this paper, quantification in SPAR-H does not translate to the subtask level. Methods: Two different discrete distributions were generated for each SPAR-H Performance Shaping Factor (PSF) to define the frequency of PSF levels. The first distribution was a uniform, or uninformed, distribution that assumed the frequency of each PSF level was equally likely. The second, non-continuous, distribution took the frequency of PSF levels as identified from an assessment of the HERA database. These two different approaches were created to identify the resulting distribution of the HEP. The resulting HEP that appears closer to the known distribution, a log-normal centered on 1E-3, is the more desirable. Each approach then has median, average and maximum HFE calculations applied. To calculate these three values, three events, A, B and C, are generated from the PSF level frequencies comprised of subtasks. The median HFE selects the median PSF level from each PSF and calculates the HEP. The average HFE takes the mean PSF level, and the maximum takes the maximum PSF level. The same data set of subtask HEPs yields starkly different HEPs when aggregated to the HFE level in SPAR-H. Results: Assuming that each PSF level in each HFE is equally likely creates an unrealistic distribution of the HEP that is centered at 1. Next, the observed frequency of PSF levels was applied, with the resulting HEP behaving log-normally with a majority of the values under 2.5% HEP. The median, average and maximum HFE calculations did yield different answers for the HFE: the HFE maximum grossly overestimates the HFE, while the HFE distribution falls below the HFE median and above the HFE average. Conclusions: Dynamic task modeling can be pursued through the framework of SPAR-H. The distributions associated with each PSF need to be defined, and may change depending upon the scenario. However, it is very unlikely that each PSF level is equally likely, as the resulting HEP distribution is then strongly centered at 100%, which is unrealistic. Other distributions may need to be identified for PSFs to facilitate the transition to dynamic task modeling. Additionally, discrete distributions need to be exchanged for continuous ones so that simulations of the HFE can advance further. This paper provides a method to explore dynamic subtask-to-task translation and provides examples of the process using the SPAR-H method.
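A minimal sketch of the PSF-sampling experiment: draw a multiplier for each PSF from a discrete level-frequency distribution, form HEP = NHEP × Π(multipliers), and inspect the resulting HEP distribution. The PSF level values and frequencies below are illustrative, not the HERA-derived ones:

```python
# Hedged sketch: Monte Carlo over discrete PSF-level distributions in a
# SPAR-H-style HEP calculation.
import numpy as np

rng = np.random.default_rng(7)
nhep = 1e-3                                     # nominal HEP (action task)

# Each PSF: (multiplier levels, level frequencies); values are illustrative.
psfs = [
    ([0.1, 1.0, 10.0], [0.2, 0.6, 0.2]),        # e.g. available time
    ([1.0, 2.0, 5.0], [0.7, 0.2, 0.1]),         # e.g. stress
    ([0.5, 1.0, 10.0], [0.1, 0.8, 0.1]),        # e.g. complexity
]

n = 100_000
hep = np.full(n, nhep)
for levels, freqs in psfs:
    hep *= rng.choice(levels, size=n, p=freqs)  # sample one level per trial
hep = np.minimum(hep, 1.0)                      # probabilities cap at 1
print(f"median HEP {np.median(hep):.1e}, mean {hep.mean():.1e}, "
      f"P(HEP > 2.5%) = {(hep > 0.025).mean():.3f}")
```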
Distribution of runup heights of the December 26, 2004 tsunami in the Indian Ocean
NASA Astrophysics Data System (ADS)
Choi, Byung Ho; Hong, Sung Jin; Pelinovsky, Efim
2006-07-01
A massive earthquake of magnitude 9.3 that occurred on December 26, 2004 off northern Sumatra generated huge tsunami waves that affected many coastal countries in the Indian Ocean. A number of field surveys were performed after this tsunami event; in particular, several surveys on the south/east coast of India, the Andaman and Nicobar Islands, Sri Lanka, Sumatra, Malaysia, and Thailand were organized by the Korean Society of Coastal and Ocean Engineers from January to August 2005. The spatial distribution of the tsunami runup is used to analyze the distribution function of the wave heights on different coasts. Theoretical interpretation associates this distribution with random coastal bathymetry and coastline geometry, leading to log-normal functions. The observed data are also in very good agreement with a log-normal distribution, confirming the important role of the variable ocean bathymetry in the formation of the irregular wave-height distribution along the coasts.
An estimate of field size distributions for selected sites in the major grain producing countries
NASA Technical Reports Server (NTRS)
Podwysocki, M. H.
1977-01-01
The field size distributions for the major grain producing countries of the world were estimated. LANDSAT-1 and 2 images were evaluated for two areas each in the United States, the People's Republic of China, and the USSR. One scene each was evaluated for France, Canada, and India. Grid sampling was done for representative sub-samples of each image, measuring the long and short axes of each field; area was then calculated. Each of the resulting data sets was computer analyzed for its frequency distribution. Nearly all frequency distributions were highly peaked and skewed (shifted) towards small values, approaching either a Poisson or a log-normal distribution. The data were normalized by a log transformation, creating a Gaussian distribution whose moments are readily interpretable and useful for estimating the total population of fields. The resulting predictors of the field size estimates are discussed.
The missing impact craters on Venus
NASA Technical Reports Server (NTRS)
Speidel, D. H.
1993-01-01
The size-frequency pattern of the 842 impact craters on Venus measured to date can be well described (across four standard deviation units) as a single log normal distribution with a mean crater diameter of 14.5 km. This result was predicted in 1991 on examination of the initial Magellan analysis. If this observed distribution is close to the real distribution, the 'missing' 90 percent of the small craters and the 'anomalous' lack of surface splotches may thus be neither missing nor anomalous. I think that the missing craters and missing splotches can be satisfactorily explained by accepting that the observed distribution approximates the real one, that it is not craters that are missing but the impactors. What you see is what you got. The implication that Venus crossing impactors would have the same type of log normal distribution is consistent with recently described distribution for terrestrial craters and Earth crossing asteroids.
Characterizing Topology of Probabilistic Biological Networks.
Todor, Andrei; Dobra, Alin; Kahveci, Tamer
2013-09-06
Biological interactions are often uncertain events that may or may not take place with some probability. Existing studies analyze the degree distribution of biological networks by assuming that all the given interactions take place under all circumstances. This strong and often incorrect assumption can lead to misleading results. Here, we address this problem and develop a sound mathematical basis to characterize networks in the presence of uncertain interactions. We develop a method that accurately describes the degree distribution of such networks. We also extend our method to accurately compute the joint degree distributions of node pairs connected by edges. The number of possible network topologies grows exponentially with the number of uncertain interactions. However, the mathematical model we develop allows us to compute these degree distributions in polynomial time in the number of interactions. It also helps us find an adequate mathematical model using maximum likelihood estimation. Our results demonstrate that power-law and log-normal models best describe degree distributions for probabilistic networks. The inverse correlation of degrees of neighboring nodes shows that, in probabilistic networks, nodes with a large number of interactions prefer to interact with those with a small number of interactions more frequently than expected.
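The degree distribution of a node whose incident edges exist independently with given probabilities is a Poisson-binomial, computable in polynomial time by dynamic programming; a minimal sketch with illustrative edge probabilities (the authors' full method also covers joint distributions, which this omits):

```python
# Hedged sketch: exact degree distribution for a node with uncertain edges.
import numpy as np

def degree_distribution(edge_probs):
    """P(degree = k) for independent edges with given existence probabilities."""
    dist = np.array([1.0])                       # starts as P(degree = 0) = 1
    for p in edge_probs:
        new = np.zeros(len(dist) + 1)
        new[:-1] += dist * (1.0 - p)             # edge absent
        new[1:] += dist * p                      # edge present
        dist = new
    return dist

print(degree_distribution([0.9, 0.5, 0.2, 0.7]))   # P(k) for k = 0..4
```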
NASA Astrophysics Data System (ADS)
Dong, Yijun
Research on measuring the risk of a bond portfolio and on bond portfolio optimization was previously relatively rare, because the risk factors of bond portfolios were not very volatile. However, this condition has changed recently. The 2008 financial crisis brought high volatility to the risk factors and the related bond securities, even for highly rated U.S. Treasury bonds. Moreover, the risk factors of bond portfolios show fat-tailed and asymmetric properties like the risk factors of equity portfolios. Therefore, advanced techniques are needed to measure and manage the risk of bond portfolios. In our paper, we first apply an autoregressive moving average generalized autoregressive conditional heteroscedasticity (ARMA-GARCH) model with multivariate normal tempered stable (MNTS) distribution innovations to predict the risk factors of U.S. Treasury bonds, and statistically demonstrate that the MNTS distribution has the ability to capture the properties of the risk factors based on goodness-of-fit tests. Then, based on empirical evidence, we find that the VaR and AVaR estimated by assuming a normal tempered stable distribution are more realistic and reliable than those estimated by assuming a normal distribution, especially for the financial crisis period. Finally, we use mean-risk portfolio optimization to minimize the portfolios' potential risks. The empirical study indicates that the optimized bond portfolios have better risk-adjusted performance than the benchmark portfolios for some periods. Moreover, the optimized bond portfolios obtained by assuming a normal tempered stable distribution show improved performance in comparison to those obtained by assuming a normal distribution.
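A minimal sketch of the two risk measures compared in the study, sample VaR and AVaR (expected shortfall), on a synthetic fat-tailed loss sample; the Student-t returns are an illustrative stand-in for the ARMA-GARCH-MNTS machinery:

```python
# Hedged sketch: sample VaR and AVaR (expected shortfall) at level alpha.
import numpy as np

def var_avar(losses, alpha=0.99):
    """Value-at-Risk and Average Value-at-Risk of a loss sample."""
    var = np.quantile(losses, alpha)
    avar = losses[losses >= var].mean()          # mean loss beyond VaR
    return var, avar

rng = np.random.default_rng(8)
losses = rng.standard_t(df=4, size=100_000) * 0.01   # fat-tailed daily losses
var, avar = var_avar(losses)
print(f"VaR(99%) = {var:.4f}, AVaR(99%) = {avar:.4f}")
```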
The retest distribution of the visual field summary index mean deviation is close to normal.
Anderson, Andrew J; Cheng, Allan C Y; Lau, Samantha; Le-Pham, Anne; Liu, Victor; Rahman, Farahnaz
2016-09-01
When modelling optimum strategies for how best to determine visual field progression in glaucoma, it is commonly assumed that the summary index mean deviation (MD) is normally distributed on repeated testing. Here we tested whether this assumption is correct. We obtained 42 reliable 24-2 Humphrey Field Analyzer SITA standard visual fields from one eye of each of five healthy young observers, with the first two fields excluded from analysis. Previous work has shown that although MD variability is higher in glaucoma, the shape of the MD distribution is similar to that found in normal visual fields. A Shapiro-Wilk test determined any deviation from normality. Kurtosis values for the distributions were also calculated. Data from each observer passed the Shapiro-Wilk normality test. Bootstrapped 95% confidence intervals for kurtosis encompassed the value for a normal distribution in four of five observers. When examined with quantile-quantile plots, distributions were close to normal and showed no consistent deviations across observers. The retest distribution of MD is not significantly different from normal in healthy observers, and so is likely also normally distributed - or nearly so - in those with glaucoma. Our results increase our confidence in the results of influential modelling studies where a normal distribution for MD was assumed.
Load-Based Lower Neck Injury Criteria for Females from Rear Impact from Cadaver Experiments.
Yoganandan, Narayan; Pintar, Frank A; Banerjee, Anjishnu
2017-05-01
The objectives of this study were to derive lower neck injury metrics/criteria and injury risk curves for the force, moment, and interaction criterion in rear impacts for females. Biomechanical data were obtained from previous intact and isolated post mortem human subjects and head-neck complexes subjected to posteroanterior accelerative loading. Censored data were used in the survival analysis model. The primary shear force, sagittal bending moment, and interaction (lower neck injury criterion, LNic) metrics were significant predictors of injury. The most optimal distribution (Weibull, log-normal, or log-logistic) was selected using the Akaike information criterion, according to the latest ISO recommendations for deriving risk curves. The Kolmogorov-Smirnov test was used to quantify the robustness of the assumed parametric model. The intercepts for the interaction index were extracted from the primary risk curves. Normalized confidence interval sizes (NCIS) were reported at discrete probability levels, along with the risk curves and 95% confidence intervals. A mean force of 214 N, a moment of 54 Nm, and an LNic of 0.89 were associated with a five percent probability of injury. The NCIS for these metrics were 0.90, 0.95, and 0.85. These preliminary results can be used as a first step in the definition of lower neck injury criteria for women under posteroanterior accelerative loading in crashworthiness evaluations.
Czopyk, L; Olko, P
2006-01-01
The analytical model of Xapsos used for calculating microdosimetric spectra is based on the observation that the straggling of energy loss can be approximated by a log-normal distribution of energy deposition. The model was applied to calculate microdosimetric spectra in spherical targets of nanometre dimensions from heavy ions at energies between 0.3 and 500 MeV amu⁻¹. We recalculated the originally assumed 1/E² initial delta-electron spectrum by applying the Continuous Slowing Down Approximation for secondary electrons. We also modified the energy deposition from electrons of energy below 100 keV, taking into account the effective path length of the scattered electrons. Results of our model calculations agree favourably with results of Monte Carlo track structure simulations using MOCA-14 for light ions (Z = 1-8) of energy ranging from E = 0.3 to 10.0 MeV amu⁻¹, as well as with the results of Nikjoo for a wall-less proportional counter (Z = 18).
NASA Astrophysics Data System (ADS)
Wang, WenBin; Wu, ZiNiu; Wang, ChunFeng; Hu, RuiFeng
2013-11-01
A model based on a thermodynamic approach is proposed for predicting the dynamics of communicable epidemics, assumed to be governed by controlling efforts on multiple scales, so that an entropy can be associated with the system. All the epidemic details are factored into a single, time-dependent coefficient; the functional form of this coefficient is found through four constraints, including notably the existence of an inflexion point and a maximum. The model is solved to give a log-normal distribution for the spread rate, for which a Shannon entropy can be defined. The only parameter, which characterizes the width of the distribution function, is uniquely determined by maximizing the rate of entropy production. This entropy-based thermodynamic (EBT) model predicts the number of hospitalized cases with reasonable accuracy for SARS in 2003. The EBT model can be of use for potential epidemics such as avian influenza and H7N9 in China.
A New Closed Form Approximation for BER for Optical Wireless Systems in Weak Atmospheric Turbulence
NASA Astrophysics Data System (ADS)
Kaushik, Rahul; Khandelwal, Vineet; Jain, R. C.
2018-04-01
The weak atmospheric turbulence condition in optical wireless communication (OWC) is captured by a log-normal distribution. The analytical evaluation of the average bit error rate (BER) of an OWC system under weak turbulence is intractable, as it involves the statistical averaging of the Gaussian Q-function over a log-normal distribution. In this paper, a simple closed-form approximation for the BER of an OWC system under weak turbulence is given. The computation of BER for various modulation schemes is carried out using the proposed expression. The results obtained using the proposed expression compare favorably with those obtained using the Gauss-Hermite quadrature approximation and Monte Carlo simulations.
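A minimal sketch of the Gauss-Hermite benchmark mentioned above: averaging the Gaussian Q-function over a log-normal intensity, cross-checked by Monte Carlo. The SNR factor and scintillation strength are assumed values:

```python
# Hedged sketch: average BER = E[Q(snr * I)] with log-normal intensity I,
# via Gauss-Hermite quadrature and a Monte Carlo check.
import numpy as np
from scipy import stats

def q_func(x):
    return stats.norm.sf(x)          # Gaussian Q-function

sigma = 0.5                          # SD of ln(intensity), assumed
mu = -0.5 * sigma**2                 # normalizes E[intensity] = 1
snr = 3.0                            # electrical SNR factor, assumed

# Gauss-Hermite: E[g(e^X)] for X ~ N(mu, sigma^2)
x, w = np.polynomial.hermite.hermgauss(30)
ber_gh = np.sum(w * q_func(snr * np.exp(np.sqrt(2) * sigma * x + mu))) / np.sqrt(np.pi)

rng = np.random.default_rng(9)
ber_mc = q_func(snr * np.exp(rng.normal(mu, sigma, 1_000_000))).mean()
print(f"Gauss-Hermite BER {ber_gh:.5f} vs Monte Carlo {ber_mc:.5f}")
```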
A log-sinh transformation for data normalization and variance stabilization
NASA Astrophysics Data System (ADS)
Wang, Q. J.; Shrestha, D. L.; Robertson, D. E.; Pokhrel, P.
2012-05-01
When quantifying model prediction uncertainty, it is statistically convenient to represent model errors that are normally distributed with a constant variance. The Box-Cox transformation is the most widely used technique to normalize data and stabilize variance, but it is not without limitations. In this paper, a log-sinh transformation is derived based on a pattern of errors commonly seen in hydrological model predictions. It is suited to applications where prediction variables are positively skewed and the spread of errors is seen to first increase rapidly, then slowly, and eventually approach a constant as the prediction variable becomes greater. The log-sinh transformation is applied in two case studies, and the results are compared with one- and two-parameter Box-Cox transformations.
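A minimal sketch of the log-sinh transformation and its inverse, assuming the form z = (1/b)·ln(sinh(a + b·y)) with arbitrary illustrative parameters:

```python
# Hedged sketch: log-sinh transform and its inverse (parameters a, b assumed).
import numpy as np

def log_sinh(y, a, b):
    return np.log(np.sinh(a + b * y)) / b

def log_sinh_inverse(z, a, b):
    return (np.arcsinh(np.exp(b * z)) - a) / b

a, b = 0.1, 0.02
y = np.array([1.0, 10.0, 100.0, 1000.0])       # positively skewed variable
z = log_sinh(y, a, b)
print(z)
print(log_sinh_inverse(z, a, b))                # recovers y
```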
Crépet, Amélie; Albert, Isabelle; Dervin, Catherine; Carlin, Frédéric
2007-01-01
A normal distribution and a mixture model of two normal distributions in a Bayesian approach using prevalence and concentration data were used to establish the distribution of contamination of the food-borne pathogenic bacteria Listeria monocytogenes in unprocessed and minimally processed fresh vegetables. A total of 165 prevalence studies, including 15 studies with concentration data, were taken from the scientific literature and from technical reports and used for statistical analysis. The predicted mean of the normal distribution of the logarithms of viable L. monocytogenes per gram of fresh vegetables was −2.63 log viable L. monocytogenes organisms/g, and its standard deviation was 1.48 log viable L. monocytogenes organisms/g. These values were determined by considering one contaminated sample in prevalence studies in which samples are in fact negative. This deliberate overestimation is necessary to complete the calculations. With the mixture model, the predicted mean of the distribution of the logarithm of viable L. monocytogenes per gram of fresh vegetables was −3.38 log viable L. monocytogenes organisms/g and its standard deviation was 1.46 log viable L. monocytogenes organisms/g. The probabilities of fresh unprocessed and minimally processed vegetables being contaminated with concentrations higher than 1, 2, and 3 log viable L. monocytogenes organisms/g were 1.44, 0.63, and 0.17%, respectively. Introducing a sensitivity rate of 80 or 95% in the mixture model had a small effect on the estimation of the contamination. In contrast, introducing a low sensitivity rate (40%) resulted in marked differences, especially for high percentiles. There was a significantly lower estimation of contamination in the papers and reports of 2000 to 2005 than in those of 1988 to 1999, and a lower estimation of contamination of leafy salads than of sprouts and other vegetables. The usefulness of the mixture model for the estimation of microbial contamination is discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arutyunyan, R.V.; Bol'shov, L.A.; Vasil'ev, S.K.
1994-06-01
The objective of this study was to clarify a number of issues related to the spatial distribution of contaminants from the Chernobyl accident. The effects of local statistics were addressed by collecting and analyzing (for cesium-137) soil samples from a number of regions, and it was found that sample activity differed by a factor of 3-5. The effect of local non-uniformity was estimated by modeling the distribution of the average activity of a set of five samples for each of the regions, with the spread in the activities for a ±2 range being equal to 25%. The statistical characteristics of the distribution of contamination were then analyzed and found to follow a log-normal distribution, with the standard deviation being a function of test area. All data for the Bryanskaya Oblast area were analyzed statistically and were adequately described by a log-normal function.
Petersen, Per H; Lund, Flemming; Fraser, Callum G; Sölétormos, György
2016-11-01
Background: The distributions of within-subject biological variation are usually described as coefficients of variation, as are analytical performance specifications for bias, imprecision and other characteristics. Estimation of the specifications required for reference change values is traditionally done using the relationship between the batch-related changes during routine performance, described as Δbias, and the coefficients of variation for analytical imprecision (CVA): the original theory is based on standard deviations or coefficients of variation calculated as if distributions were Gaussian. Methods: The distribution of between-subject biological variation can generally be described as log-Gaussian. Moreover, recent analyses of within-subject biological variation suggest that many measurands have log-Gaussian distributions. In consequence, we generated a model for the estimation of analytical performance specifications for the reference change value, combining Δbias and CVA based on log-Gaussian distributions of CVI as natural logarithms. The model was tested using plasma prolactin and glucose as examples. Results: Analytical performance specifications for the reference change value generated using the new model based on log-Gaussian distributions were practically identical to those from the traditional model based on Gaussian distributions. Conclusion: The traditional and simple-to-apply model used to generate analytical performance specifications for the reference change value, based on the use of coefficients of variation and assuming Gaussian distributions for both CVI and CVA, is generally useful.
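A minimal sketch contrasting the traditional Gaussian reference change value with a log-normal formulation on the natural-log scale; the formulas follow standard definitions and the CV values are illustrative, not the prolactin or glucose figures from the paper:

```python
# Hedged sketch: Gaussian RCV vs an asymmetric log-normal RCV.
import math

z = 1.96                      # two-sided 95% significance
cv_a, cv_i = 0.05, 0.15       # analytical and within-subject CVs (fractions)

# Traditional (Gaussian) reference change value, symmetric
rcv_gauss = math.sqrt(2) * z * math.hypot(cv_a, cv_i)

# Log-normal version: work on the natural-log scale, asymmetric limits
sigma = math.sqrt(math.log(cv_a**2 + cv_i**2 + 1))
rcv_up = math.exp(z * math.sqrt(2) * sigma) - 1
rcv_down = math.exp(-z * math.sqrt(2) * sigma) - 1
print(f"Gaussian RCV ±{rcv_gauss:.1%}; log-normal +{rcv_up:.1%} / {rcv_down:.1%}")
```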
Estimating residual fault hitting rates by recapture sampling
NASA Technical Reports Server (NTRS)
Lee, Larry; Gupta, Rajan
1988-01-01
For the recapture debugging design introduced by Nayak (1988), the problem of estimating the hitting rates of the faults remaining in the system is considered. In the context of a conditional likelihood, moment estimators are derived and are shown to be asymptotically normal and fully efficient. Fixed-sample properties of the moment estimators are compared, through simulation, with those of the conditional maximum likelihood estimators. Properties of the conditional model are investigated, such as the asymptotic distribution of linear functions of the fault hitting frequencies and a representation of the full data vector in terms of a sequence of independent random vectors. It is assumed that the residual hitting rates follow a log-linear rate model and that the testing process is truncated when the gaps between the detection of new errors exceed a fixed amount of time.
NASA Astrophysics Data System (ADS)
Jiang, Quan; Zhong, Shan; Cui, Jie; Feng, Xia-Ting; Song, Leibo
2016-12-01
We investigated the statistical characteristics and probability distributions of the mechanical parameters of natural rock using triaxial compression tests. Twenty cores of Jinping marble were tested at each of five levels of confining stress (5, 10, 20, 30, and 40 MPa). From these full stress-strain data, we summarized the numerical characteristics and determined the probability distribution form of several important mechanical parameters, including deformational parameters, characteristic strengths, characteristic strains, and failure angle. The statistical analysis of the mechanical parameters of rock provided new information about the marble's probabilistic distribution characteristics: the normal and log-normal distributions were appropriate for describing random strengths of rock; the coefficients of variation of the peak strengths had no relationship to the confining stress; the only acceptable random distribution for both Young's elastic modulus and Poisson's ratio was the log-normal function; and the cohesive strength had a different probability distribution pattern than the frictional angle. The triaxial tests and statistical analysis also provided experimental evidence for deciding the minimum reliable number of experimental samples and for picking appropriate parameter distributions to use in reliability calculations for rock engineering.
The Ghost in the Machine: Fracking in the Earth's Complex Brittle Crust
NASA Astrophysics Data System (ADS)
Malin, P. E.
2015-12-01
This paper discusses the impact of complex rock properties on practical applications like fracking and its associated seismic emissions. A variety of borehole measurements show that the complex physical properties of the upper crust cannot be characterized by averages on any scale. Instead they appear to follow three empirical rules: a power-law distribution in physical scales, a log-normal distribution in populations, and a direct relation between changes in porosity and log(permeability). These rules can be directly related to the presence of fluid-rich and seismically active fractures, from mineral grains to fault segments. (These are the "ghosts" referred to in the title.) In other physical systems, such behaviors arise on the boundaries of phase changes, and are studied as "critical state physics". In analogy to the four phases of water, crustal rocks progress upward from an un-fractured, ductile lower crust to nearly cohesionless surface alluvium. The crust in between is in an unstable transition. It is in this layer that methods such as hydrofracking operate, be they in oil and gas, geothermal, or mining applications. As a result, nothing is predictable in these systems. Crustal models have conventionally been constructed assuming that in situ permeability and related properties are normally distributed. This approach is consistent with the use of short scale-length cores and logs to estimate properties. However, reservoir-scale flow data show that they are better fit by log-normal distributions. Such "long tail" distributions are observed for well productivity, ore vein grades, and induced seismic signals. Outcrop and well-log data show that many rock properties also show a power-law-type variation in scale lengths. In terms of Fourier power spectra, if peaks per km is k, then their power is proportional to 1/k. The source of this variation is related to pore-space connectivity, beginning with grain fractures. We then show that a passive seismic method, Tomographic Fracture Imaging™ (TFI), can observe the distribution of this connectivity. Combined with TFI data, our fracture-connectivity model reveals the most significant crustal features and accounts for their range of passive and stimulated behaviors.
NASA Astrophysics Data System (ADS)
Ovreas, L.; Quince, C.; Sloan, W.; Lanzen, A.; Davenport, R.; Green, J.; Coulson, S.; Curtis, T.
2012-12-01
Arctic microbial soil communities are intrinsically interesting and poorly characterised. We have inferred the diversity and species abundance distribution of six Arctic soils: new and mature soil at the foot of a receding glacier, Arctic semi-desert, the foot of bird cliffs, and soil underlying Arctic tundra heath, all near Ny-Ålesund, Spitsbergen. Diversity, distribution and sample sizes were estimated using the rational method of Quince et al. (ISME Journal 2008;2:997-1006) to determine the most plausible underlying species abundance distribution. A log-normal species abundance curve was found to give a slightly better fit than an inverse Gaussian curve if, and only if, sequencing error was removed. The median estimates of diversity of operational taxonomic units (at the 3% level) were 3600-5600 (log-normal assumed) and 2825-4100 (inverse Gaussian assumed). The nature and origins of species abundance distributions are poorly understood but may yet be grasped by observing and analysing such distributions in the microbial world. The sample size required to observe the distribution (by sequencing 90% of the taxa) varied between ~10⁶ and ~10⁵ for the log-normal and inverse Gaussian respectively. We infer that between 5 and 50 GB of sequencing would be required to capture 90% of the metagenome. Though a principal components analysis clearly divided the sites into three groups, there was a high (20-45%) degree of overlap between locations irrespective of geographical proximity. Interestingly, the nearest relatives of the most abundant taxa at a number of sites were of alpine or polar origin. (Figure: samples plotted on the first two principal components together with arbitrary discriminatory OTUs.)
Improvement of Reynolds-Stress and Triple-Product Lag Models
NASA Technical Reports Server (NTRS)
Olsen, Michael E.; Lillard, Randolph P.
2017-01-01
The Reynolds-stress and triple-product Lag models were created with a normal stress distribution defined by a 4:3:2 distribution of streamwise, spanwise and wall-normal stresses, and a ratio of r_w = 0.3k in the log-layer region of high Reynolds number flat plate flow, which implies R11+ = 4/((9/2) × 0.3) ≈ 2.96. More recent measurements show a more complex picture of the log-layer region at high Reynolds numbers. A first cut at improving these models, along with the direction for future refinements, is described. Comparison with recent high Reynolds number data shows areas where further work is needed, but also shows that inclusion of the modeled turbulent transport terms improves the prediction where they influence the solution. Additional work is needed to make the model better match experiment, but there is significant improvement in many of the details of the log-layer behavior.
210Po Log-normal distribution in human urines: Survey from Central Italy people
Sisti, D.; Rocchi, M. B. L.; Meli, M. A.; Desideri, D.
2009-01-01
The death in London of the former secret service agent Alexander Litvinenko on 23 November 2006 drew public attention to the rather little-known radionuclide 210Po. This paper presents the results of a monitoring programme of 210Po background levels in the urines of non-contaminated people living in Central Italy (near the Republic of San Marino). The relationship between age, sex, years of smoking, number of cigarettes per day, and 210Po concentration was also studied. The results indicated that the urinary 210Po concentration follows a surprisingly perfect log-normal distribution. Log 210Po concentrations were positively correlated with age (p < 0.0001), number of daily smoked cigarettes (p = 0.006), and years of smoking (p = 0.021), and associated with sex (p = 0.019). Consequently, this study provides upper reference limits for each sub-group identified by significantly predictive variables.
Arihood, Leslie D.
2009-01-01
In 2005, the U.S. Geological Survey began a pilot study for the National Assessment of Water Availability and Use Program to assess the availability of water and water use in the Great Lakes Basin. Part of the study involves constructing a ground-water flow model for the Lake Michigan part of the Basin. Most ground-water flow occurs in the glacial sediments above the bedrock formations; therefore, adequate representation by the model of the horizontal and vertical hydraulic conductivity of the glacial sediments is important to the accuracy of model simulations. This work processed and analyzed well records to provide the hydrogeologic parameters of horizontal and vertical hydraulic conductivity and ground-water levels for the model layers used to simulated ground-water flow in the glacial sediments. The methods used to convert (1) lithology descriptions into assumed values of horizontal and vertical hydraulic conductivity for entire model layers, (2) aquifer-test data into point values of horizontal hydraulic conductivity, and (3) static water levels into water-level calibration data are presented. A large data set of about 458,000 well driller well logs for monitoring, observation, and water wells was available from three statewide electronic data bases to characterize hydrogeologic parameters. More than 1.8 million records of lithology from the well logs were used to create a lithologic-based representation of horizontal and vertical hydraulic conductivity of the glacial sediments. Specific-capacity data from about 292,000 well logs were converted into horizontal hydraulic conductivity values to determine specific values of horizontal hydraulic conductivity and its aerial variation. About 396,000 well logs contained data on ground-water levels that were assembled into a water-level calibration data set. A lithology-based distribution of hydraulic conductivity was created by use of a computer program to convert well-log lithology descriptions into aquifer or nonaquifer categories and to calculate equivalent horizontal and vertical hydraulic conductivities (K and KZ, respectively) for each of the glacial layers of the model. The K was based on an assumed value of 100 ft/d (feet per day) for aquifer materials and 1 ft/d for nonaquifer materials, whereas the equivalent KZ was based on an assumed value of 10 ft/d for aquifer materials and 0.001 ft/d for nonaquifer materials. These values were assumed for convenience to determine a relative contrast between aquifer and nonaquifer materials. The point values of K and KZ from wells that penetrate at least 50 percent of a model layer were interpolated into a grid of values. The K distribution was based on an inverse distance weighting equation that used an exponent of 2. The KZ distribution used inverse distance weighting with an exponent of 4 to represent the abrupt change in KZ that commonly occurs between aquifer and nonaquifer materials. The values of equivalent hydraulic conductivity for aquifer sediments needed to be adjusted to actual values in the study area for the ground-water flow modeling. The specific-capacity data (discharge, drawdown, and time data) from the well logs were input to a modified version of the Theis equation to calculate specific capacity based horizontal hydraulic conductivity values (KSC). The KSC values were used as a guide for adjusting the assumed value of 100 ft/d for aquifer deposits to actual values used in the model. 
Water levels from well logs were processed to improve the reliability of water levels for comparison with simulated water levels in a model layer during model calibration. Water levels were interpolated by kriging to determine a composite water-level surface. The difference between the kriged surface and individual water levels was used to identify outlier water levels. Examination of the well-log lithology data in map form revealed that the data were not only useful for model input, but also useful for understanding the…
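To make the interpolation step concrete, here is a minimal sketch (not the report's actual code) of inverse-distance weighting with the two exponents described above: 2 for the K grid and 4 for the KZ grid. The function and array names are illustrative.

```python
# Minimal inverse-distance-weighting (IDW) sketch; exponent 2 gives a smooth
# K field, exponent 4 sharpens the aquifer/nonaquifer contrast in KZ.
import numpy as np

def idw(xy_obs, values, xy_grid, power):
    """Interpolate point values onto grid nodes with inverse-distance weights."""
    d = np.linalg.norm(xy_grid[:, None, :] - xy_obs[None, :, :], axis=2)
    d = np.maximum(d, 1e-9)              # guard against zero distance
    w = 1.0 / d**power
    return (w @ values) / w.sum(axis=1)

# k_grid = idw(xy_wells, k_points, xy_nodes, power=2)    # horizontal K
# kz_grid = idw(xy_wells, kz_points, xy_nodes, power=4)  # vertical KZ
```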
A Bayesian Nonparametric Meta-Analysis Model
ERIC Educational Resources Information Center
Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G.
2015-01-01
In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall…
NASA Astrophysics Data System (ADS)
Kusaka, Takashi; Miyazaki, Go
2014-10-01
When monitoring target areas covered with vegetation from a satellite, it is very useful to estimate the vegetation index using the surface anisotropic reflectance, which depends on both solar and viewing geometries, from satellite data. In this study, an algorithm is described for estimating optical properties of atmospheric aerosols, such as the optical thickness (τ), the refractive index (Nr), the mixing ratio of small particles in the bimodal log-normal distribution function (C), and the bidirectional reflectance (R), from only the radiance and polarization at the 865 nm channel received by PARASOL/POLDER. Parameters of the bimodal log-normal distribution function (mean radius r1 and standard deviation σ1 of fine aerosols, and r2, σ2 of coarse aerosols) were fixed, with values estimated from the monthly averaged size distribution at AERONET sites managed by NASA near the target area. Moreover, the contribution of the surface reflectance with directional anisotropy to the polarized radiance received by the satellite is assumed to be small, because our ground-based polarization measurements of light reflected by grassland show that the degree of polarization of the reflected light is very low at the 865 nm channel. First, aerosol properties were estimated from only the polarized radiance, and then the bidirectional reflectance given by the Ross-Li BRDF model was estimated from only the total radiance at target areas in PARASOL/POLDER data over the Japanese islands taken on April 28, 2012 and April 25, 2010. The estimated optical thickness of aerosols was checked against values given at AERONET sites, and the estimated BRDF parameters were compared with those of vegetation measured from a radio-controlled helicopter. Consequently, it is shown that the algorithm described in the present study provides reasonable values for aerosol properties and surface bidirectional reflectance.
Empirical study of the tails of mutual fund size
NASA Astrophysics Data System (ADS)
Schwarzkopf, Yonathan; Farmer, J. Doyne
2010-06-01
The mutual fund industry manages about a quarter of the assets in the U.S. stock market and thus plays an important role in the U.S. economy. The question of how much control is concentrated in the hands of the largest players is best quantitatively discussed in terms of the tail behavior of the mutual fund size distribution. We study the distribution empirically and show that the tail is much better described by a log-normal than a power law, indicating less concentration than, for example, personal income. The results are highly statistically significant and are consistent across fifteen years. This contradicts a recent theory concerning the origin of the power law tails of the trading volume distribution. Based on the analysis in a companion paper, the log-normality is to be expected, and indicates that the distribution of mutual funds remains perpetually out of equilibrium.
Spencer, Amy V; Cox, Angela; Lin, Wei-Yu; Easton, Douglas F; Michailidou, Kyriaki; Walters, Kevin
2015-05-01
Bayes factors (BFs) are becoming increasingly important tools in genetic association studies, partly because they provide a natural framework for including prior information. The Wakefield BF (WBF) approximation is easy to calculate and assumes a normal prior on the log odds ratio (logOR) with a mean of zero. However, the prior variance (W) must be specified. Because of the potentially high sensitivity of the WBF to the choice of W, we propose several new BF approximations with logOR ∼ N(0,W), but allow W to take a probability distribution rather than a fixed value. We provide several prior distributions for W which lead to BFs that can be calculated easily in freely available software packages. These priors allow a wide range of densities for W and provide considerable flexibility. We examine some properties of the priors and BFs and show how to determine the most appropriate prior based on elicited quantiles of the prior odds ratio (OR). We show by simulation that our novel BFs have superior true-positive rates at low false-positive rates compared to those from both P-value and WBF analyses across a range of sample sizes and ORs. We give an example of utilizing our BFs to fine-map the CASP8 region using genotype data on approximately 46,000 breast cancer case and 43,000 healthy control samples from the Collaborative Oncological Gene-environment Study (COGS) Consortium, and compare the single-nucleotide polymorphism ranks to those obtained using WBFs and P-values from univariate logistic regression. © 2015 The Authors. Genetic Epidemiology published by Wiley Periodicals, Inc.
Multiple imputation in the presence of non-normal data.
Lee, Katherine J; Carlin, John B
2017-02-20
Multiple imputation (MI) is becoming increasingly popular for handling missing data. Standard approaches for MI assume normality for continuous variables (conditionally on the other variables in the imputation model). However, it is unclear how to impute non-normally distributed continuous variables. Using simulation and a case study, we compared various transformations applied prior to imputation, including a novel non-parametric transformation, to imputation on the raw scale and using predictive mean matching (PMM) when imputing non-normal data. We generated data from a range of non-normal distributions, and set 50% to missing completely at random or missing at random. We then imputed missing values on the raw scale, following a zero-skewness log, Box-Cox or non-parametric transformation and using PMM with both type 1 and 2 matching. We compared inferences regarding the marginal mean of the incomplete variable and the association with a fully observed outcome. We also compared results from these approaches in the analysis of depression and anxiety symptoms in parents of very preterm compared with term-born infants. The results provide novel empirical evidence that the decision regarding how to impute a non-normal variable should be based on the nature of the relationship between the variables of interest. If the relationship is linear in the untransformed scale, transformation can introduce bias irrespective of the transformation used. However, if the relationship is non-linear, it may be important to transform the variable to accurately capture this relationship. A useful alternative is to impute the variable using PMM with type 1 matching. Copyright © 2016 John Wiley & Sons, Ltd.
Refinement of the timing-based estimator of pulsar magnetic fields
NASA Astrophysics Data System (ADS)
Biryukov, Anton; Astashenok, Artyom; Beskin, Gregory
2017-04-01
Numerical simulations of realistic non-vacuum magnetospheres of isolated neutron stars have shown that pulsar spin-down luminosities depend weakly on the magnetic obliquity α. In particular, L ∝ B²(1 + sin²α), where B is the magnetic field strength at the star surface. Being the most accurate expression to date, this result provides the opportunity to estimate B for a given radio pulsar with quite high accuracy. In the current work, we present a refinement of the classical 'magneto-dipolar' formula for pulsar magnetic fields, B_md = (3.2 × 10^19 G)√(PṖ), where P is the neutron star spin period. The new, robust timing-based estimator is introduced as log B = log B_md + ΔB(M, α), where the correction ΔB depends on the equation of state (EOS) of dense matter, the individual pulsar obliquity α and the mass M. Adopting state-of-the-art statistics for M and α, we calculate the distributions of ΔB for a representative subset of 22 EOSs that do not contradict observations. It has been found that ΔB is distributed nearly normally, with the average in the range -0.5 to -0.25 dex and standard deviation σ[ΔB] ≈ 0.06 to 0.09 dex, depending on the adopted EOS. The latter quantity represents a formal uncertainty of the corrected estimation of log B because ΔB is weakly correlated with log B_md. At the same time, if it is assumed that every considered EOS has the same chance of occurring in nature, then another, more generalized, estimator B* ≈ 3B_md/7 can be introduced, providing an unbiased value of the pulsar surface magnetic field with ~30 per cent uncertainty at 68 per cent confidence. Finally, we discuss the possible impact of pulsar timing irregularities on the timing-based estimation of B and review the astrophysical applications of the obtained results.
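As a worked illustration of the two estimators quoted above, the sketch below evaluates the classical magneto-dipolar formula and the generalized correction B* ≈ 3B_md/7; the example period and period derivative are illustrative, not from the paper.

```python
# Timing-based magnetic field estimates for an isolated pulsar (sketch).
import numpy as np

def b_md(P, Pdot):
    """Classical magneto-dipolar estimate, B_md = 3.2e19 G * sqrt(P * Pdot)."""
    return 3.2e19 * np.sqrt(P * Pdot)

def b_star(P, Pdot):
    """EOS-averaged estimator B* ~ (3/7) B_md, unbiased to ~30% per the paper."""
    return (3.0 / 7.0) * b_md(P, Pdot)

print(b_md(0.033, 4.2e-13))    # Crab-like pulsar: ~3.8e12 G
print(b_star(0.033, 4.2e-13))  # corrected estimate: ~1.6e12 G
```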
Imanaka, Tetsuji; Fukutani, Satoshi; Yamamoto, Masayoshi; Sakaguchi, Aya; Hoshi, Masaharu
2006-02-01
Dolon village, located about 60 km from the border of the Semipalatinsk Nuclear Test Site, is known to be heavily contaminated by local fallout from the first USSR atomic bomb test in 1949. External radiation in Dolon was evaluated based on recent 137Cs data in soil and calculation of temporal change in the fission product composition. After fitting a log-normal distribution to the soil data, a 137Cs deposition of 32 kBq m-2, which corresponds to the 90th percentile of the distribution, was tentatively chosen as a value to evaluate the radiation situation in 1949. Our calculation indicated that more than 95% of the cumulative dose for 50 y had been delivered within 1 y after the deposition. The resulting cumulative dose for 1 y after the deposition, normalized to an initial contamination containing 1 kBq m-2 of 137Cs, was 15.6 mGy, assuming a fallout arrival time of 3 h and a medium level of fractionation. Finally, 0.50 Gy of absorbed dose in air was derived as our tentative estimate of the 1-year cumulative external dose in Dolon due to local fallout from the first USSR test in 1949.
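For readers who want to reproduce the percentile step, the following sketch fits a log-normal to deposition samples and extracts the 90th percentile; the sample values are invented for illustration and are not the Dolon data.

```python
# Fit a log-normal to 137Cs deposition samples and take the 90th percentile.
import numpy as np
from scipy import stats

deposition = np.array([5.1, 8.3, 12.0, 19.5, 25.2, 33.0, 47.8])  # kBq/m^2 (illustrative)
log_d = np.log(deposition)
mu, sigma = log_d.mean(), log_d.std(ddof=1)
p90 = np.exp(mu + sigma * stats.norm.ppf(0.90))   # 90th percentile of the fitted LN
```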
NASA Astrophysics Data System (ADS)
Boyd, O. S.; Cramer, C. H.
2013-12-01
We develop an intensity prediction equation (IPE) for the Central and Eastern United States, explore differences between modified Mercalli intensities (MMI) and community internet intensities (CII) and the propensity for reporting, and estimate the moment magnitudes of the 1811-1812 New Madrid, MO, and 1886 Charleston, SC, earthquakes. We constrain the study with North American census data, the National Oceanic and Atmospheric Administration MMI dataset (responses between 1924 and 1985), and the USGS 'Did You Feel It?' CII dataset (responses between June, 2000 and August, 2012). The combined intensity dataset has more than 500,000 felt reports for 517 earthquakes with magnitudes between 2.5 and 7.2. The IPE has the basic form MMI = c1 + c2M + c3exp(λ) + c4λ, where M is moment magnitude and λ is mean log hypocentral distance. Previous IPEs use a limited dataset of MMI, do not differentiate between MMI and CII data in the CEUS, nor account for spatial variations in population. These factors can have an impact at all magnitudes, especially the last factor at large magnitudes and small intensities where the population drops to zero in the Atlantic Ocean and Gulf of Mexico. We assume that the number of reports of a given intensity have hypocentral distances that are log-normally distributed, the distribution of which is modulated by population and the propensity for individuals to report their experience. We do not account for variations in stress drop, regional variations in Q, or distance-dependent geometrical spreading. We simulate the distribution of reports of a given intensity accounting for population and use a grid search method to solve for the fraction of population to report the intensity, the standard deviation of the log-normal distribution and the mean log hypocentral distance, which appears in the above equation. We find that lower intensities, both CII and MMI, are less likely to be reported than greater intensities. Further, there are strong spatial variations in the level of CII reporting. For example, large metropolitan areas appear to have a lower level of reporting relative to rural areas. In general, we find that intensities decrease with increasing distance and decreasing magnitude, as expected. Coefficients for the IPE are c1 = 1.98±0.13, c2 = 1.76±0.02, c3 = -0.0027±0.0004, and c4 = -1.26±0.03. We find significant differences in mean log hypocentral distance between MMI- and CII-based reporting, particularly at smaller mean log distance and higher intensity. Values of mean log distance for CII at high intensity tend to be smaller than for MMI at the same value of intensity. The new IPE leads to magnitude estimates for the 1811-1812 New Madrid earthquakes that are within the broad range of those determined previously. Using three MMI datasets for the New Madrid mainshocks, the new relation results in estimates for the moment magnitudes of the December 16th, 1811, January 23rd, 1812, and February 7th, 1812 mainshocks and December 16th dawn aftershock of 7.1-7.4, 7.2, 7.5-7.7, and 6.7-7.2, respectively, with a magnitude uncertainty of about ±0.4 units. We estimate a magnitude of 7.0±0.3 for the 1886 Charleston, SC earthquake.
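A minimal sketch of evaluating the fitted IPE follows; the abstract does not state the base of the logarithm in λ, so the natural log of hypocentral distance in km is assumed here.

```python
# Evaluate MMI = c1 + c2*M + c3*exp(lam) + c4*lam with the fitted coefficients.
import numpy as np

def ipe(M, dist_km):
    c1, c2, c3, c4 = 1.98, 1.76, -0.0027, -1.26
    lam = np.log(dist_km)          # assumed: natural log hypocentral distance
    return c1 + c2 * M + c3 * np.exp(lam) + c4 * lam

print(ipe(7.5, 100.0))   # ~9.1 for an M7.5 event felt at 100 km
```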
Logistic Approximation to the Normal: The KL Rationale
ERIC Educational Resources Information Center
Savalei, Victoria
2006-01-01
A rationale is proposed for approximating the normal distribution with a logistic distribution using a scaling constant based on minimizing the Kullback-Leibler (KL) information, that is, the expected amount of information available in a sample to distinguish between two competing distributions using a likelihood ratio (LR) test, assuming one of…
UV missile-plume signature model
NASA Astrophysics Data System (ADS)
Roblin, Antoine; Baudoux, Pierre E.; Chervet, Patrick
2002-08-01
A new 3D radiative code is used to solve the radiative transfer equation in the UV spectral domain for a nonequilibrium, axisymmetric medium such as a rocket plume composed of hot reactive gases and metallic oxide particles like alumina. Calculations take into account the dominant chemiluminescence radiation mechanism and multiple-scattering effects produced by alumina particles. Plume radiative properties are studied using a simple cylindrical medium of finite length, deduced from the afterburning zones of different aerothermochemical real rocket plumes. Assuming a log-normal size distribution of alumina particles, optical properties are calculated using Mie theory. Because of large uncertainties in particle properties, systematic tests have been performed to evaluate the influence of the different input data (refractive index, particle mean geometric radius) on the radiance field. These computations will help us define the set of parameters that need to be known accurately in order to compare computations with radiance measurements obtained during field experiments.
Possible Statistics of Two Coupled Random Fields: Application to Passive Scalar
NASA Technical Reports Server (NTRS)
Dubrulle, B.; He, Guo-Wei; Bushnell, Dennis M. (Technical Monitor)
2000-01-01
We use the relativity postulate of scale invariance to derive the similarity transformations between two coupled scale-invariant random fields at different scales. We find the equations leading to the scaling exponents. This formulation is applied to the case of passive scalars advected (i) by a random Gaussian velocity field and (ii) by a turbulent velocity field. In the Gaussian case, we show that the passive scalar increments follow a log-Lévy distribution generalizing Kraichnan's solution and, in an appropriate limit, a log-normal distribution. In the turbulent case, we show that when the velocity increments follow log-Poisson statistics, the passive scalar increments follow statistics close to log-Poisson. This result explains the experimental observations of Ruiz et al. about temperature increments.
Simulations of large acoustic scintillations in the straits of Florida.
Tang, Xin; Tappert, F D; Creamer, Dennis B
2006-12-01
Using a full-wave acoustic model, Monte Carlo numerical studies of intensity fluctuations in a realistic shallow water environment that simulates the Straits of Florida, including internal wave fluctuations and bottom roughness, have been performed. Results show that the sound intensity at distant receivers scintillates dramatically. The acoustic scintillation index SI increases rapidly with propagation range and is significantly greater than unity at ranges beyond about 10 km. This result supports a theoretical prediction by one of the authors. Statistical analyses show that the distribution of intensity of the random wave field saturates to the expected Rayleigh distribution with SI = 1 at short range due to multipath interference effects, and then SI continues to increase to large values. This effect, which is denoted supersaturation, is universal at long ranges in waveguides having lossy boundaries (where there is differential mode attenuation). The intensity distribution approaches a log-normal distribution to an excellent approximation; it may not be a universal distribution, and comparison is also made to a K distribution. The long tails of the log-normal distribution cause "acoustic intermittency" in which very high, but rare, intensities occur.
An Empirical Bayes Approach to Mantel-Haenszel DIF Analysis.
ERIC Educational Resources Information Center
Zwick, Rebecca; Thayer, Dorothy T.; Lewis, Charles
1999-01-01
Developed an empirical Bayes enhancement to Mantel-Haenszel (MH) analysis of differential item functioning (DIF) in which it is assumed that the MH statistics are normally distributed and that the prior distribution of underlying DIF parameters is also normal. (Author/SLD)
Neti, Prasad V.S.V.; Howell, Roger W.
2010-01-01
Recently, the distribution of radioactivity among a population of cells labeled with 210Po was shown to be well described by a log-normal (LN) distribution function (J Nucl Med. 2006;47:1049-1058) with the aid of autoradiography. To ascertain the influence of Poisson statistics on the interpretation of the autoradiographic data, the present work reports on a detailed statistical analysis of these earlier data. Methods: The measured distributions of α-particle tracks per cell were subjected to statistical tests with Poisson, LN, and Poisson-lognormal (P-LN) models. Results: The LN distribution function best describes the distribution of radioactivity among cell populations exposed to 0.52 and 3.8 kBq/mL of 210Po-citrate. When cells were exposed to 67 kBq/mL, the P-LN distribution function gave a better fit; however, the underlying activity distribution remained log-normal. Conclusion: The present analysis generally provides further support for the use of LN distributions to describe the cellular uptake of radioactivity. Care should be exercised when analyzing autoradiographic data on activity distributions to ensure that Poisson processes do not distort the underlying LN distribution. PMID:18483086
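The sampling model being tested can be sketched in a few lines: cellular activity is log-normal and the observed track count is Poisson given each cell's activity. Parameters below are illustrative, not those of the paper.

```python
# Poisson-lognormal (P-LN) simulation of alpha-particle tracks per cell.
import numpy as np

rng = np.random.default_rng(1)
n_cells, mu, sigma = 10_000, 1.0, 0.8
activity = rng.lognormal(mu, sigma, n_cells)   # latent LN activity per cell
tracks = rng.poisson(activity)                 # observed counts, Poisson-blurred
```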
A mathematical model relating response durations to amount of subclinical resistant disease.
Gregory, W M; Richards, M A; Slevin, M L; Souhami, R L
1991-02-15
A mathematical model is presented which seeks to determine, from examination of the response durations of a group of patients with malignant disease, the mean and distribution of the resistant tumor volume. The mean tumor-doubling time and distribution of doubling times are also estimated. The model assumes that in a group of patients there is a log-normal distribution both of resistant disease and of tumor-doubling times and implies that the shapes of certain parts of an actuarial response-duration curve are related to these two factors. The model has been applied to data from two reported acute leukemia trials: (a) a recent acute myelogenous leukemia trial was examined. Close fits were obtained for both the first and second remission-duration curves. The model results suggested that patients with long first remissions had less resistant disease and had tumors with slower growth rates following second-line treatment; (b) an historical study of maintenance therapy for acute lymphoblastic leukemia was used to estimate the mean cell-kill (approximately 10^4 cells) achieved with a single agent, 6-mercaptopurine. Application of the model may have clinical relevance, for example, in identifying groups of patients likely to benefit from further intensification of treatment.
ERIC Educational Resources Information Center
Doerann-George, Judith
The Integrated Moving Average (IMA) model of time series, and the analysis of intervention effects based on it, assume random shocks which are normally distributed. To determine the robustness of the analysis to violations of this assumption, empirical sampling methods were employed. Samples were generated from three populations: normal,…
ERIC Educational Resources Information Center
Zu, Jiyun; Yuan, Ke-Hai
2012-01-01
In the nonequivalent groups with anchor test (NEAT) design, the standard error of linear observed-score equating is commonly estimated by an estimator derived assuming multivariate normality. However, real data are seldom normally distributed, causing this normal estimator to be inconsistent. A general estimator, which does not rely on the…
McGee, Monnie; Chen, Zhongxue
2006-01-01
There are many methods of correcting microarray data for non-biological sources of error. Authors routinely supply software or code so that interested analysts can implement their methods. Even with a thorough reading of associated references, it is not always clear how requisite parts of the method are calculated in the software packages. However, it is important to have an understanding of such details, as this understanding is necessary for proper use of the output, or for implementing extensions to the model. In this paper, the calculation of parameter estimates used in Robust Multichip Average (RMA), a popular preprocessing algorithm for Affymetrix GeneChip brand microarrays, is elucidated. The background correction method for RMA assumes that the perfect match (PM) intensities observed result from a convolution of the true signal, assumed to be exponentially distributed, and a background noise component, assumed to have a normal distribution. A conditional expectation is calculated to estimate signal. Estimates of the mean and variance of the normal distribution and the rate parameter of the exponential distribution are needed to calculate this expectation. Simulation studies show that the current estimates are flawed; therefore, new ones are suggested. We examine the performance of preprocessing under the exponential-normal convolution model using several different methods to estimate the parameters.
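As context for the convolution model, the conditional-expectation step can be sketched as below, using the closed form commonly quoted for the exponential-normal model (O = S + B, S ~ Exp(α), B ~ N(μ, σ²)); the improved parameter estimators proposed in the paper are not reproduced here.

```python
# Background adjustment E[S | O = o] for the exponential-normal convolution.
import numpy as np
from scipy.stats import norm

def bg_adjust(o, mu, sigma, alpha):
    a = o - mu - sigma**2 * alpha
    num = norm.pdf(a / sigma) - norm.pdf((o - a) / sigma)
    den = norm.cdf(a / sigma) + norm.cdf((o - a) / sigma) - 1.0
    return a + sigma * num / den
```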
Constraining the H2 column density distribution at z ˜ 3 from composite DLA spectra
NASA Astrophysics Data System (ADS)
Balashev, S. A.; Noterdaeme, P.
2018-07-01
We present the detection of the average H2 absorption signal in the overall population of neutral gas absorption systems at z ~ 3 using composite absorption spectra built from the Sloan Digital Sky Survey-III damped Lyman α catalogue. We present a new technique to directly measure the H2 column density distribution function f_H2(N) from the average H2 absorption signal. Assuming a power-law column density distribution, we obtain a slope β = -1.29 ± 0.06 (stat) ± 0.10 (sys) and an incidence rate of strong H2 absorptions [with N(H2) ≳ 10^18 cm^-2] of 4.0 ± 0.5 (stat) ± 1.0 (sys) per cent in H I absorption systems with N(H I) ≥ 10^20 cm^-2. Assuming the same inflexion point where f_H2(N) steepens as at z = 0, we estimate that the cosmological density of H2 in the column density range log N(H2) (cm^-2) = 18-22 is ~15 per cent of the total. We find a one order of magnitude higher H2 incidence rate in a sub-sample of extremely strong damped Lyman α absorption systems (DLAs) [log N(H I) (cm^-2) ≥ 21.7], which, together with the derived shape of f_H2(N), suggests that the typical H I-H2 transition column density in DLAs is log N(H) (cm^-2) ≳ 22.3, in agreement with theoretical expectations for the average (low) metallicity of DLAs at high-z.
Probability distribution functions for unit hydrographs with optimization using genetic algorithm
NASA Astrophysics Data System (ADS)
Ghorbani, Mohammad Ali; Singh, Vijay P.; Sivakumar, Bellie; H. Kashani, Mahsa; Atre, Atul Arvind; Asadi, Hakimeh
2017-05-01
A unit hydrograph (UH) of a watershed may be viewed as the unit pulse response function of a linear system. In recent years, the use of probability distribution functions (pdfs) for determining a UH has received much attention. In this study, a nonlinear optimization model is developed to transmute a UH into a pdf. The potential of six popular pdfs, namely the two-parameter gamma, two-parameter Gumbel, two-parameter log-normal, two-parameter normal, three-parameter Pearson, and two-parameter Weibull distributions, is tested on data from the Lighvan catchment in Iran. The probability distribution parameters are determined using the nonlinear least squares optimization method in two ways: (1) optimization by programming in Mathematica; and (2) optimization by applying a genetic algorithm. The results are compared with those obtained by the traditional linear least squares method. The results show comparable capability and performance of the two nonlinear methods. The gamma and Pearson distributions are the most successful models in preserving the rising and recession limbs of the unit hydrographs. The log-normal distribution has a high ability in predicting both the peak flow and time to peak of the unit hydrograph. The nonlinear optimization method does not outperform the linear least squares method in determining the UH (especially for excess rainfall of one pulse), but is comparable.
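The pdf-fitting idea can be illustrated with a least-squares fit of a two-parameter gamma pdf to UH ordinates; the data and starting values below are invented, and scipy's curve_fit stands in for the paper's Mathematica and genetic-algorithm optimizations.

```python
# Fit a gamma pdf to unit-hydrograph ordinates by nonlinear least squares.
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import gamma

t = np.array([1, 2, 3, 4, 5, 6, 8, 10, 12.0])                    # h (illustrative)
uh = np.array([0.05, 0.18, 0.25, 0.22, 0.14, 0.08, 0.05, 0.02, 0.01])

def gamma_uh(t, shape, scale):
    return gamma.pdf(t, a=shape, scale=scale)

(shape, scale), _ = curve_fit(gamma_uh, t, uh, p0=(2.0, 2.0))
```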
NASA Astrophysics Data System (ADS)
Usselman, Robert J.; Russek, Stephen E.; Klem, Michael T.; Allen, Mark A.; Douglas, Trevor; Young, Mark; Idzerda, Yves U.; Singel, David J.
2012-10-01
Electron magnetic resonance (EMR) spectroscopy was used to determine the magnetic properties of maghemite (γ-Fe2O3) nanoparticles formed within size-constraining Listeria innocua Dps (DNA-binding protein from starved cells) protein cages that have an inner diameter of 5 nm. Variable-temperature X-band EMR spectra exhibited broad asymmetric resonances with a superimposed narrow peak at a gyromagnetic factor of g ≈ 2. The resonance structure, which depends on both superparamagnetic fluctuations and inhomogeneous broadening, changes dramatically as a function of temperature, and the overall linewidth becomes narrower with increasing temperature. Here, we compare two different models to simulate temperature-dependent lineshape trends. The temperature dependence for both models is derived from a Langevin behavior of the linewidth resulting from "anisotropy melting." The first uses either a truncated log-normal distribution of particle sizes or a bimodal distribution, and then a Landau-Lifshitz lineshape to describe the nanoparticle resonances. The essential feature of this model is that small particles have narrow linewidths and account for the g ≈ 2 feature with a constant resonance field, whereas larger particles have broad linewidths and undergo a shift in resonance field. The second model assumes uniform particles with a diameter around 4 nm and a random distribution of uniaxial anisotropy axes. This model uses a more precise calculation of the linewidth due to superparamagnetic fluctuations and a random distribution of anisotropies. Sharp features in the spectrum near g ≈ 2 are qualitatively predicted at high temperatures. Both models can account for many features of the observed spectra, although each has deficiencies. The first model leads to a nonphysical increase in magnetic moment as the temperature is increased if a log-normal distribution of particle sizes is used; introducing a bimodal distribution of particle sizes resolves this nonphysical increase in moment with temperature. The second model predicts low-temperature spectra that differ significantly from the observed spectra. The anisotropy energy density K1, determined by fitting the temperature-dependent linewidths, was ~50 kJ/m3, which is considerably larger than that of bulk maghemite. The work presented here indicates that the magnetic properties of these size-constrained nanoparticles, and more generally of metal oxide nanoparticles with diameters d < 5 nm, are complex and that currently existing models are not sufficient for determining their magnetic resonance signatures.
A Bayesian Surrogate for Regional Skew in Flood Frequency Analysis
NASA Astrophysics Data System (ADS)
Kuczera, George
1983-06-01
The problem of how to best utilize site and regional flood data to infer the shape parameter of a flood distribution is considered. One approach to this problem is given in Bulletin 17B of the U.S. Water Resources Council (1981) for the log-Pearson distribution. Here a lesser known distribution is considered, namely, the power normal which fits flood data as well as the log-Pearson and has a shape parameter denoted by λ derived from a Box-Cox power transformation. The problem of regionalizing λ is considered from an empirical Bayes perspective where site and regional flood data are used to infer λ. The distortive effects of spatial correlation and heterogeneity of site sampling variance of λ are explicitly studied with spatial correlation being found to be of secondary importance. The end product of this analysis is the posterior distribution of the power normal parameters expressing, in probabilistic terms, what is known about the parameters given site flood data and regional information on λ. This distribution can be used to provide the designer with several types of information. The posterior distribution of the T-year flood is derived. The effect of nonlinearity in λ on inference is illustrated. Because uncertainty in λ is explicitly allowed for, the understatement in confidence limits due to fixing λ (analogous to fixing log skew) is avoided. Finally, it is shown how to obtain the marginal flood distribution which can be used to select a design flood with specified exceedance probability.
NASA Astrophysics Data System (ADS)
Alahmadi, F.; Rahman, N. A.; Abdulrazzak, M.
2014-09-01
Rainfall frequency analysis is an essential tool for the design of water-related infrastructure. It can be used to predict flood magnitudes for a given frequency of extreme rainfall events. This study analyses the application of rainfall partial duration series (PDS) in the fast-growing city of Madinah, located in the western part of Saudi Arabia. Different statistical distributions were applied (normal, log-normal, extreme value type I, generalized extreme value, Pearson type III, and log-Pearson type III) and their parameters were estimated using the method of L-moments. Several model selection criteria were also applied: the Akaike Information Criterion (AIC), the corrected Akaike Information Criterion (AICc), the Bayesian Information Criterion (BIC) and the Anderson-Darling Criterion (ADC). The analysis indicated that the generalized extreme value distribution is the best-fit statistical distribution for the Madinah partial duration daily rainfall series. The outcome of such an evaluation can contribute toward better design criteria for flood management, especially flood protection measures.
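A hedged sketch of the fit-and-rank workflow is given below; maximum likelihood stands in for the study's L-moment estimation, and the synthetic sample replaces the Madinah series.

```python
# Fit candidate distributions and rank them by AIC (MLE in place of L-moments).
import numpy as np
from scipy import stats

x = stats.genextreme.rvs(c=-0.1, loc=40, scale=12, size=60, random_state=0)
candidates = {"normal": stats.norm, "log-normal": stats.lognorm,
              "Gumbel (EV1)": stats.gumbel_r, "GEV": stats.genextreme,
              "Pearson III": stats.pearson3}
for name, dist in candidates.items():
    params = dist.fit(x)
    aic = 2 * len(params) - 2 * dist.logpdf(x, *params).sum()
    print(f"{name}: AIC = {aic:.1f}")
```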
Fujikawa, Hiroshi
2017-01-01
Microbial concentration in samples of a food product lot has been generally assumed to follow the log-normal distribution in food sampling, but this distribution cannot accommodate the concentration of zero. In the present study, first, a probabilistic study with the most probable number (MPN) technique was done for a target microbe present at a low (or zero) concentration in food products. Namely, based on the number of target pathogen-positive samples in the total samples of a product found by a qualitative, microbiological examination, the concentration of the pathogen in the product was estimated by means of the MPN technique. The effects of the sample size and the total sample number of a product were then examined. Second, operating characteristic (OC) curves for the concentration of a target microbe in a product lot were generated on the assumption that the concentration of a target microbe could be expressed with the Poisson distribution. OC curves for Salmonella and Cronobacter sakazakii in powdered formulae for infants and young children were successfully generated. The present study suggested that the MPN technique and the Poisson distribution would be useful for qualitative microbiological test data analysis for a target microbe whose concentration in a lot is expected to be low.
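Under the Poisson assumption, an OC curve has a simple closed form: a w-gram sample is negative only if it contains zero cells, and the lot is accepted when at most c of n samples test positive. The sketch below uses illustrative sampling-plan numbers, not those of the paper.

```python
# Operating-characteristic curve for a presence/absence sampling plan.
import numpy as np
from scipy.stats import binom

def oc_curve(conc_cfu_per_g, w=25.0, n=10, c=0):
    """P(lot accepted) vs. concentration, assuming Poisson-distributed cells."""
    p_pos = 1.0 - np.exp(-conc_cfu_per_g * w)   # P(sample contains >= 1 cell)
    return binom.cdf(c, n, p_pos)

print(oc_curve(np.array([0.001, 0.01, 0.1])))
```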
Bayes classification of terrain cover using normalized polarimetric data
NASA Technical Reports Server (NTRS)
Yueh, H. A.; Swartz, A. A.; Kong, J. A.; Shin, R. T.; Novak, L. M.
1988-01-01
The normalized polarimetric classifier (NPC) which uses only the relative magnitudes and phases of the polarimetric data is proposed for discrimination of terrain elements. The probability density functions (PDFs) of polarimetric data are assumed to have a complex Gaussian distribution, and the marginal PDF of the normalized polarimetric data is derived by adopting the Euclidean norm as the normalization function. The general form of the distance measure for the NPC is also obtained. It is demonstrated that for polarimetric data with an arbitrary PDF, the distance measure of NPC will be independent of the normalization function selected even when the classifier is mistrained. A complex Gaussian distribution is assumed for the polarimetric data consisting of grass and tree regions. The probability of error for the NPC is compared with those of several other single-feature classifiers. The classification error of NPCs is shown to be independent of the normalization function.
Dekkers, A L M; Slob, W
2012-10-01
In dietary exposure assessment, statistical methods exist for estimating the usual intake distribution from daily intake data. These methods transform the dietary intake data to normal observations, eliminate the within-person variance, and then back-transform the data to the original scale. We propose Gaussian Quadrature (GQ), a numerical integration method, as an efficient way of back-transformation. We compare GQ with six published methods. One method uses a log-transformation, while the other methods, including GQ, use a Box-Cox transformation. This study shows that, for various parameter choices, the methods with a Box-Cox transformation estimate the theoretical usual intake distributions quite well, although one method, a Taylor approximation, is less accurate. Two applications--on folate intake and fruit consumption--confirmed these results. In one extreme case, some methods, including GQ, could not be applied for low percentiles. We solved this problem by modifying GQ. One method is based on the assumption that the daily intakes are log-normally distributed. Even if this condition is not fulfilled, the log-transformation performs well as long as the within-individual variance is small compared to the mean. We conclude that the modified GQ is an efficient, fast and accurate method for estimating the usual intake distribution. Copyright © 2012 Elsevier Ltd. All rights reserved.
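The back-transformation via Gaussian quadrature can be sketched compactly: the usual intake of a person with between-person value m on the transformed scale is E[g_inv(m + e)] with e ~ N(0, s_w^2). Names and the log example below are illustrative.

```python
# Gaussian-quadrature back-transformation of a within-person error distribution.
import numpy as np

nodes, weights = np.polynomial.hermite_e.hermegauss(20)   # probabilists' Hermite

def usual_intake(m, s_w, g_inv):
    """E[g_inv(m + e)], e ~ N(0, s_w^2), by 20-node Gauss-Hermite quadrature."""
    return np.sum(weights * g_inv(m + s_w * nodes)) / np.sqrt(2.0 * np.pi)

print(usual_intake(1.0, 0.5, np.exp))   # log transform: exp(1 + 0.25/2) ~ 3.08
```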
Money-center structures in dynamic banking systems
NASA Astrophysics Data System (ADS)
Li, Shouwei; Zhang, Minghui
2016-10-01
In this paper, we propose a dynamic model for banking systems based on the description of balance sheets. It generates some features identified through empirical analysis. Through simulation analysis of the model, we find that banking systems have the feature of money-center structures, that bank asset distributions are power-law distributions, and that contract size distributions are log-normal distributions.
Multivariate stochastic simulation with subjective multivariate normal distributions
P. J. Ince; J. Buongiorno
1991-01-01
In many applications of Monte Carlo simulation in forestry or forest products, it may be known that some variables are correlated. However, for simplicity, in most simulations it has been assumed that random variables are independently distributed. This report describes an alternative Monte Carlo simulation technique for subjectively assessed multivariate normal…
Motakis, E S; Nason, G P; Fryzlewicz, P; Rutter, G A
2006-10-15
Many standard statistical techniques are effective on data that are normally distributed with constant variance. Microarray data typically violate these assumptions since they come from non-Gaussian distributions with a non-trivial mean-variance relationship. Several methods have been proposed that transform microarray data to stabilize variance and draw its distribution towards the Gaussian. Some methods, such as log or generalized log, rely on an underlying model for the data. Others, such as the spread-versus-level plot, do not. We propose an alternative data-driven multiscale approach, called the Data-Driven Haar-Fisz for microarrays (DDHFm) with replicates. DDHFm has the advantage of being 'distribution-free' in the sense that no parametric model for the underlying microarray data is required to be specified or estimated; hence, DDHFm can be applied very generally, not just to microarray data. DDHFm achieves very good variance stabilization of microarray data with replicates and produces transformed intensities that are approximately normally distributed. Simulation studies show that it performs better than other existing methods. Application of DDHFm to real one-color cDNA data validates these results. The R package of the Data-Driven Haar-Fisz transform (DDHFm) for microarrays is available in Bioconductor and CRAN.
Single-trial log transformation is optimal in frequency analysis of resting EEG alpha.
Smulders, Fren T Y; Ten Oever, Sanne; Donkers, Franc C L; Quaedflieg, Conny W E M; van de Ven, Vincent
2018-02-01
The appropriate definition and scaling of the magnitude of electroencephalogram (EEG) oscillations is an underdeveloped area. The aim of this study was to optimize the analysis of resting EEG alpha magnitude, focusing on alpha peak frequency and nonlinear transformation of alpha power. A family of nonlinear transforms, Box-Cox transforms, were applied to find the transform that (a) maximized a non-disputed effect: the increase in alpha magnitude when the eyes are closed (Berger effect), and (b) made the distribution of alpha magnitude closest to normal across epochs within each participant, or across participants. The transformations were performed either at the single epoch level or at the epoch-average level. Alpha peak frequency showed large individual differences, yet good correspondence between various ways to estimate it in 2 min of eyes-closed and 2 min of eyes-open resting EEG data. Both alpha magnitude and the Berger effect were larger for individual alpha than for a generic (8-12 Hz) alpha band. The log-transform on single epochs (a) maximized the t-value of the contrast between the eyes-open and eyes-closed conditions when tested within each participant, and (b) rendered near-normally distributed alpha power across epochs and participants, thereby making further transformation of epoch averages superfluous. The results suggest that the log-normal distribution is a fundamental property of variations in alpha power across time in the order of seconds. Moreover, effects on alpha power appear to be multiplicative rather than additive. These findings support the use of the log-transform on single epochs to achieve appropriate scaling of alpha magnitude. © 2018 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
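The recommended procedure reduces to a few lines: log-transform alpha power epoch by epoch, then test the eyes-closed versus eyes-open contrast. The simulated powers below are illustrative.

```python
# Single-trial log transform of alpha power and a Berger-effect t test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha_closed = rng.lognormal(mean=2.0, sigma=0.6, size=60)   # per-epoch power
alpha_open = rng.lognormal(mean=1.4, sigma=0.6, size=60)

t, p = stats.ttest_ind(np.log(alpha_closed), np.log(alpha_open))
```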
Shape of growth-rate distribution determines the type of Non-Gibrat’s Property
NASA Astrophysics Data System (ADS)
Ishikawa, Atushi; Fujimoto, Shouji; Mizuno, Takayuki
2011-11-01
In this study, the authors examine exhaustive business data on Japanese firms, which cover nearly all companies in the mid- and large-scale ranges in terms of firm size, to reach several key findings on profits/sales distribution and business growth trends. Here, profits denote net profits. First, detailed balance is observed not only in profits data but also in sales data. Furthermore, the growth-rate distribution of sales has wider tails than the linear growth-rate distribution of profits in log-log scale. On the one hand, in the mid-scale range of profits, the probability of positive growth decreases and the probability of negative growth increases symmetrically as the initial value increases. This is called Non-Gibrat’s First Property. On the other hand, in the mid-scale range of sales, the probability of positive growth decreases as the initial value increases, while the probability of negative growth hardly changes. This is called Non-Gibrat’s Second Property. Under detailed balance, Non-Gibrat’s First and Second Properties are analytically derived from the linear and quadratic growth-rate distributions in log-log scale, respectively. In both cases, the log-normal distribution is inferred from Non-Gibrat’s Properties and detailed balance. These analytic results are verified by empirical data. Consequently, this clarifies the notion that the difference in shapes between growth-rate distributions of sales and profits is closely related to the difference between the two Non-Gibrat’s Properties in the mid-scale range.
A spatial scan statistic for survival data based on Weibull distribution.
Bhatt, Vijaya; Tiwari, Neeraj
2014-05-20
The spatial scan statistic has been developed as a geographical cluster detection analysis tool for different types of data sets such as Bernoulli, Poisson, ordinal, normal and exponential. We propose a scan statistic for survival data based on Weibull distribution. It may also be used for other survival distributions, such as exponential, gamma, and log normal. The proposed method is applied on the survival data of tuberculosis patients for the years 2004-2005 in Nainital district of Uttarakhand, India. Simulation studies reveal that the proposed method performs well for different survival distribution functions. Copyright © 2013 John Wiley & Sons, Ltd.
NMR measurement of bitumen at different temperatures.
Yang, Zheng; Hirasaki, George J
2008-06-01
Heavy oil (bitumen) is characterized by its high viscosity and density, which are a major obstacle to both well logging and recovery. Because information from T2 relaxation times shorter than the echo spacing (TE) is lost and the water signal interferes, estimation of heavy oil properties from NMR T2 measurements is usually problematic. In this work, a new method has been developed to overcome the echo spacing restriction of the NMR spectrometer in its application to heavy oil (bitumen). A free induction decay (FID) measurement supplemented the start of the CPMG sequence. Constrained by the initial magnetization (M0) estimated from the FID, and assuming a log-normal T2 distribution for bitumen, the corrected T2 relaxation time of the bitumen sample can be obtained from interpretation of the CPMG data. This new method successfully overcomes the TE restriction of the NMR spectrometer and is nearly independent of the TE applied in the measurement. The method was applied to measurements at elevated temperatures (8-90 degrees C). Because of the significant signal loss within the dead time of the FID, the directly extrapolated M0 of bitumen at relatively low temperatures (<60 degrees C) was found to be underestimated. However, owing to the markedly lowered viscosity, the extrapolated M0 of bitumen above 60 degrees C can reasonably be assumed to be the true value. On this basis, using the extrapolation at higher temperatures (>= 60 degrees C), the M0 value of bitumen at lower temperatures (<60 degrees C) can be corrected by Curie's law. Consequently, some important petrophysical properties of bitumen, such as hydrogen index (HI), fluid content and viscosity, were evaluated using the corrected T2.
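The Curie-law step amounts to scaling the equilibrium magnetization by the ratio of absolute temperatures; a minimal sketch, with the reference temperature chosen per the paper's >= 60 degrees C criterion:

```python
# Curie-law correction: M0 scales as 1/T (absolute temperature).
def m0_corrected(m0_ref, t_ref_c, t_c):
    return m0_ref * (t_ref_c + 273.15) / (t_c + 273.15)

print(m0_corrected(1.00, 60.0, 25.0))   # M0 at 25 C from a 60 C reference, ~1.12
```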
NASA Astrophysics Data System (ADS)
Jumelet, Julien; David, Christine; Bekki, Slimane; Keckhut, Philippe
2009-01-01
The determination of stratospheric particle microphysical properties from multiwavelength lidar, including Rayleigh and/or Raman detection, has been widely investigated. However, most lidar systems are uniwavelength, operating at 532 nm. Although the information content of such lidar data is too limited to allow retrieval of the full size distribution, the coupling of two or more uniwavelength lidar measurements probing the same moving air parcel may provide some meaningful size information. Within the ORACLE-O3 IPY project, the coordination of several ground-based lidars and the CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation) space-borne lidar is planned during measurement campaigns called MATCH-PSC (Polar Stratospheric Clouds). While probing the same moving air masses, the evolution of the measured backscatter coefficient (BC) should reflect the variation of particle microphysical properties. A sensitivity study of the 532 nm lidar particle backscatter to variations of particle size distribution parameters is carried out. For simplicity, the particles are assumed to be spherical (liquid) particles and the size distribution is represented by a unimodal log-normal distribution. Each of the four microphysical parameters (the three log-normal size distribution parameters and the refractive index) is analysed separately, while the other three remain set to constant reference values. Overall, the BC behaviour is not affected by the initial values taken as references. The total concentration (N0) is the parameter to which BC is least sensitive, whereas it is most sensitive to the refractive index (m). A 2% variation of m induces a 15% variation of the lidar BC, while the uncertainty on the BC retrieval can also reach 15%. This result underlines the importance of having both an accurate lidar inversion method and a good knowledge of the temperature for size distribution retrieval techniques. The standard deviation (σ) is the second parameter to which BC is most sensitive. Yet, the impact of m and σ on BC variations is limited by the realistic range of their variations. The mean radius (rm) of the size distribution is thus the key parameter for BC, as it can vary several-fold. BC is most sensitive to the presence of large particles. The sensitivity of BC to rm and σ variations increases when the initial size distributions are characterized by low rm and large σ. This makes lidar more suitable for detecting particles growing on background aerosols than on volcanic aerosols.
Bidisperse and polydisperse suspension rheology at large solid fraction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pednekar, Sidhant; Chun, Jaehun; Morris, Jeffrey F.
At the same solid volume fraction, bidisperse and polydisperse suspensions display lower viscosities, and weaker normal stress response, compared to monodisperse suspensions. The reduction of viscosity associated with size distribution can be explained by an increase of the maximum flowable, or jamming, solid fraction. In this work, concentrated or "dense" suspensions are simulated under strong shearing, where thermal motion and repulsive forces are negligible, but we allow for particle contact with a mild frictional interaction with an interparticle friction coefficient of 0.2. Aspects of bidisperse suspension rheology are first revisited to establish that the approach reproduces established trends; the study of bidisperse suspensions at size ratios of large to small particle radii of 2 to 4 shows that a minimum in the viscosity occurs for ζ slightly above 0.5, where ζ = φ_large/φ is the fraction of the total solid volume occupied by the large particles. The simple shear flows of polydisperse suspensions with truncated normal and log-normal size distributions, and of bidisperse suspensions which are statistically equivalent to these polydisperse cases up to the third moment of the size distribution, are simulated and the rheologies are extracted. Prior work shows that such distributions with equivalent low-order moments have similar φ_m, and the rheological behaviors of the normal, log-normal and bidisperse cases are shown to be in close agreement for a wide range of standard deviations in particle size, with standard correlations which are functionally dependent on φ/φ_m providing excellent agreement with the rheology found in simulation. The close agreement of both viscosity and normal stress response between bi- and polydisperse suspensions demonstrates the controlling influence of the maximum packing fraction in noncolloidal suspensions. Microstructural investigations and the stress distribution according to particle size are also presented.
A theory for modeling ground-water flow in heterogeneous media
Cooley, Richard L.
2004-01-01
Construction of a ground-water model for a field area is not a straightforward process. Data are virtually never complete or detailed enough to allow substitution into the model equations and direct computation of the results of interest. Formal model calibration through optimization, statistical, and geostatistical methods is being applied to an increasing extent to deal with this problem and provide for quantitative evaluation and uncertainty analysis of the model. However, these approaches are hampered by two pervasive problems: 1) nonlinearity of the solution of the model equations with respect to some of the model (or hydrogeologic) input variables (termed in this report system characteristics) and 2) detailed and generally unknown spatial variability (heterogeneity) of some of the system characteristics such as log hydraulic conductivity, specific storage, recharge and discharge, and boundary conditions. A theory is developed in this report to address these problems. The theory allows construction and analysis of a ground-water model of flow (and, by extension, transport) in heterogeneous media using a small number of lumped or smoothed system characteristics (termed parameters). The theory fully addresses both nonlinearity and heterogeneity in such a way that the parameters are not assumed to be effective values. The ground-water flow system is assumed to be adequately characterized by a set of spatially and temporally distributed discrete values, β, of the system characteristics. This set contains both small-scale variability that cannot be described in a model and large-scale variability that can. The spatial and temporal variability in β are accounted for by imagining β to be generated by a stochastic process wherein β is normally distributed, although normality is not essential. Because β has too large a dimension to be estimated using the data normally available, for modeling purposes β is replaced by a smoothed or lumped approximation yβ*, where y is a spatial and temporal interpolation matrix. Set yβ* has the same form as the expected value of β, yβ̄, where β̄ is the set of drift parameters of the stochastic process; β* is a best-fit vector to β. A model function f(β), such as a computed hydraulic head or flux, is assumed to accurately represent an actual field quantity, but the same function written using yβ*, f(yβ*), contains error from lumping or smoothing of β using yβ*. Thus, the replacement of β by yβ* yields nonzero mean model errors of the form E(f(β) - f(yβ*)) throughout the model and covariances between model errors at points throughout the model. These nonzero means and covariances are evaluated through third- and fifth-order accuracy, respectively, using Taylor series expansions. They can have a significant effect on construction and interpretation of a model that is calibrated by estimating β*. Vector β* is estimated as β̂ using weighted nonlinear least squares techniques to fit a set of model functions f(yβ̂) to a corresponding set of observations of f(β), Y. These observations are assumed to be corrupted by zero-mean, normally distributed observation errors, although, as for β, normality is not essential. An analytical approximation of the nonlinear least squares solution is obtained using Taylor series expansions and perturbation techniques that assume model and observation errors to be small. This solution is used to evaluate biases and other results to second-order accuracy in the errors.
The correct weight matrix to use in the analysis is shown to be the inverse of the second-moment matrix E[(Y - f(yβ*))(Y - f(yβ*))'], but the weight matrix is assumed to be arbitrary in most developments. The best diagonal approximation is the inverse of the matrix of diagonal elements of E[(Y - f(yβ*))(Y - f(yβ*))'], and a method of estimating this diagonal matrix when it is unknown is developed using a special objective function to compute β̂. When considered to be an estimate of f…
NASA Astrophysics Data System (ADS)
Gross, Lutz; Tyson, Stephen
2015-04-01
Fracture density and orientation are key parameters controlling the productivity of coal seam gas reservoirs. Seismic anisotropy can help to identify and quantify fracture characteristics. In particular, wide-offset land seismic recordings with dense azimuthal coverage offer the opportunity to recover anisotropy parameters. In many coal seam gas reservoirs (e.g., the Walloon Subgroup in the Surat Basin, Queensland, Australia; Esterle et al. 2013) the thicknesses of coal beds and interbeds (e.g., mudstone) are well below the seismic wavelength (0.3-1 m versus 5-15 m). In these situations, the observed seismic anisotropy parameters represent effective elastic properties of the composite medium formed of fractured, anisotropic coal and isotropic interbed. As a consequence, observed seismic anisotropy cannot be linked directly to fracture characteristics but requires a more careful interpretation. In this paper we discuss techniques to estimate effective seismic anisotropy parameters from well log data, with the objective of improving the interpretation for the case of thin layered coal beds. In the first step we use sonic log data to reconstruct the elasticity parameters as a function of depth (at the resolution of the sonic log). It is assumed that within a sample fractures are sparse, of the same size and orientation, penny-shaped and equally spaced. Following the classical fracture model, this can be modeled as an elastic horizontally transversely isotropic (HTI) medium (Schoenberg & Sayers 1995). Under the additional assumption of dry fractures, normal and tangential fracture weaknesses are estimated from the slow and fast shear wave velocities of the sonic log. In the second step we apply Backus-style upscaling to construct effective anisotropy parameters on an appropriate length scale. In order to honor the HTI anisotropy present in each layer we have developed a new extension of the classical Backus averaging for layered isotropic media (Backus 1962). Our new method assumes layered HTI media with constant anisotropy orientation as recovered in the first step. It leads to an effective horizontal orthorhombic elastic model. From this model, Thomsen-style anisotropy parameters are calculated to derive azimuth-dependent normal moveout (NMO) velocities (see Grechka & Tsvankin 1998). In our presentation we will show results of our approach from sonic well logs in the Surat Basin to investigate the potential of reconstructing S-wave velocity anisotropy and fracture density from azimuth-dependent NMO velocity profiles.
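For orientation, the classical isotropic-layer case that the new method extends can be written down directly from Backus (1962); lam and mu are per-layer Lame parameters and h the layer thicknesses used as averaging weights. This is the textbook average, not the authors' HTI extension.

```python
# Backus average of thin isotropic layers: effective VTI stiffnesses.
import numpy as np

def backus_isotropic(lam, mu, h):
    w = h / h.sum()
    avg = lambda q: np.sum(w * q)
    c33 = 1.0 / avg(1.0 / (lam + 2.0 * mu))
    c13 = c33 * avg(lam / (lam + 2.0 * mu))
    c11 = avg(4.0 * mu * (lam + mu) / (lam + 2.0 * mu)) \
          + c33 * avg(lam / (lam + 2.0 * mu)) ** 2
    c44 = 1.0 / avg(1.0 / mu)
    c66 = avg(mu)
    return c11, c13, c33, c44, c66
```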
NASA Astrophysics Data System (ADS)
Alimi, Isiaka; Shahpari, Ali; Ribeiro, Vítor; Sousa, Artur; Monteiro, Paulo; Teixeira, António
2017-05-01
In this paper, we present experimental results on the channel characterization of a single-input single-output (SISO) free-space optical (FSO) communication link, based on channel measurements. The histograms of the FSO channel samples and the log-normal distribution fittings are presented along with the measured scintillation index. Furthermore, we extend our studies to diversity schemes and propose a closed-form expression for determining the ergodic channel capacity of multiple-input multiple-output (MIMO) FSO communication systems over atmospheric turbulence fading channels. The proposed empirical model is based on the SISO FSO channel characterization. Also, the scintillation effects on the system performance are analyzed and results for different turbulence conditions are presented. Moreover, we observed that the histograms of the FSO channel samples that we collected from a 1548.51 nm link have good fits with log-normal distributions, and the proposed model for MIMO FSO channel capacity is in conformity with the simulation results in terms of normalized mean-square error (NMSE).
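A minimal sketch of the basic quantities used in such a characterization, assuming synthetic log-normal intensity samples in place of measured ones: the scintillation index follows from the first two moments, and for log-normal irradiance it should match exp(sigma^2) - 1 computed from the fitted log-scale sigma.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical received-intensity samples from a turbulent SISO FSO link.
I = rng.lognormal(mean=0.0, sigma=0.3, size=20000)

si = I.var() / I.mean()**2            # scintillation index <I^2>/<I>^2 - 1
shape, loc, scale = stats.lognorm.fit(I, floc=0)
print(f"scintillation index = {si:.3f}, fitted log-sigma = {shape:.3f}")
# For log-normal irradiance, SI = exp(sigma^2) - 1, so the two should agree:
print("SI implied by fit:", np.exp(shape**2) - 1)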
On the scaling of the distribution of daily price fluctuations in the Mexican financial market index
NASA Astrophysics Data System (ADS)
Alfonso, Léster; Mansilla, Ricardo; Terrero-Escalante, César A.
2012-05-01
In this paper, a statistical analysis of log-return fluctuations of the IPC, the Mexican Stock Market Index, is presented. A sample of daily data covering the period 04/09/2000-04/09/2010 was analyzed and fitted to different distributions. Tests of the goodness of fit were performed in order to quantitatively assess the quality of the estimation. Special attention was paid to the impact of the size of the sample on the estimated decay of the distribution's tail. In this study a forceful rejection of normality was obtained. On the other hand, the null hypothesis that the log-fluctuations are fitted to an α-stable Lévy distribution cannot be rejected at the 5% significance level.
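The two tests reported above can be sketched as follows, with synthetic α-stable draws standing in for the IPC log-returns and the stable parameters assumed known rather than estimated (full maximum-likelihood fitting of stable laws is computationally heavy):

import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Hypothetical stand-in for several years of daily index log-returns.
r = stats.levy_stable.rvs(alpha=1.7, beta=0.0, scale=0.006, size=1000,
                          random_state=rng)

# Normality is forcefully rejected for heavy-tailed returns.
stat, p = stats.normaltest(r)
print(f"D'Agostino K^2 p-value: {p:.2e}")

# Goodness of fit of the alpha-stable law via Kolmogorov-Smirnov at the 5%
# level (stable cdf evaluation is numerical and can be slow).
ks = stats.kstest(r, stats.levy_stable(alpha=1.7, beta=0.0, scale=0.006).cdf)
print(f"KS p-value vs stable law: {ks.pvalue:.3f}")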
New approach application of data transformation in mean centering of ratio spectra method
NASA Astrophysics Data System (ADS)
Issa, Mahmoud M.; Nejem, R.'afat M.; Van Staden, Raluca Ioana Stefan; Aboul-Enein, Hassan Y.
2015-05-01
Most mean centering of ratio spectra (MCR) methods are designed to be used with data sets whose values have a normal or nearly normal distribution. The errors associated with the values are also assumed to be independent and random. If the data are skewed, the results obtained may be doubtful. Most of the time, a normal distribution was assumed, and if a confidence interval included a negative value, it was cut off at zero. However, it is possible to transform the data so that at least an approximately normal distribution is attained. Taking the logarithm of each data point is one frequently used transformation. As a result, the geometric mean is considered a better measure of central tendency than the arithmetic mean. The developed MCR method using the geometric mean has been successfully applied to the analysis of a ternary mixture of aspirin (ASP), atorvastatin (ATOR) and clopidogrel (CLOP) as a model. The results obtained were statistically compared with a reported HPLC method.
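A minimal sketch of the log transformation underlying the method, with hypothetical signal values: averaging on the log scale yields the geometric mean, and intervals built there back-transform to strictly positive limits, avoiding the truncation at zero mentioned above.

import numpy as np

x = np.array([0.8, 1.1, 1.3, 2.0, 3.5, 6.2])   # hypothetical skewed values

logs = np.log(x)
geo_mean = np.exp(logs.mean())                 # geometric mean via logs
lo = np.exp(logs.mean() - 2 * logs.std(ddof=1))
hi = np.exp(logs.mean() + 2 * logs.std(ddof=1))
print(geo_mean, (lo, hi))                      # limits are always positive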
Tomitaka, Shinichiro; Kawasaki, Yohei; Ide, Kazuki; Akutagawa, Maiko; Yamada, Hiroshi; Furukawa, Toshiaki A; Ono, Yutaka
2016-01-01
Previously, we proposed a model for ordinal scale scoring in which individual thresholds for each item constitute a distribution for each item. This led us to hypothesize that the boundary curves of each depressive symptom score in the distribution of total depressive symptom scores follow a common mathematical model, which is expressed as the product of the frequency of the total depressive symptom scores and the probability of the cumulative distribution function of each item threshold. To verify this hypothesis, we investigated the boundary curves of the distribution of total depressive symptom scores in a general population. Data collected from 21,040 subjects who had completed the Center for Epidemiologic Studies Depression Scale (CES-D) questionnaire as part of a national Japanese survey were analyzed. The CES-D consists of 20 items (16 negative items and four positive items). The boundary curves of adjacent item scores in the distribution of total depressive symptom scores for the 16 negative items were analyzed using log-normal scales and curve fitting. The boundary curves of adjacent item scores for a given symptom approximated a common linear pattern on a log-normal scale. Curve fitting showed that an exponential fit had a markedly higher coefficient of determination than either linear or quadratic fits. With negative affect items, the gap between the total score curve and the boundary curve continuously increased with increasing total depressive symptom scores on a log-normal scale, whereas the boundary curves of positive affect items, which are not considered manifest variables of the latent trait, did not exhibit such increases in this gap. The results of the present study support the hypothesis that the boundary curves of each depressive symptom score in the distribution of total depressive symptom scores commonly follow the predicted mathematical model, which was verified to approximate an exponential mathematical pattern.
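The curve-fitting comparison reported above (exponential versus linear and quadratic fits, judged by the coefficient of determination) can be sketched as follows on synthetic boundary-curve points; the data are illustrative, not the CES-D counts:

import numpy as np
from scipy.optimize import curve_fit

# Hypothetical boundary-curve points: total score (x) vs. boundary frequency (y).
x = np.arange(1, 21, dtype=float)
y = 1200.0 * np.exp(-0.28 * x) * np.random.default_rng(3).normal(1.0, 0.05, 20)

def r2(y_obs, y_fit):
    ss_res = np.sum((y_obs - y_fit) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

pars, _ = curve_fit(lambda t, a, b: a * np.exp(b * t), x, y, p0=(1000.0, -0.3))
fits = {
    "exponential": pars[0] * np.exp(pars[1] * x),
    "linear": np.polyval(np.polyfit(x, y, 1), x),
    "quadratic": np.polyval(np.polyfit(x, y, 2), x),
}
for name, yf in fits.items():
    print(f"{name:12s} R^2 = {r2(y, yf):.4f}")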
Growth models and the expected distribution of fluctuating asymmetry
Graham, John H.; Shimizu, Kunio; Emlen, John M.; Freeman, D. Carl; Merkel, John
2003-01-01
Multiplicative error accounts for much of the size-scaling and leptokurtosis in fluctuating asymmetry. It arises when growth involves the addition of tissue to that which is already present. Such errors are lognormally distributed. The distribution of the difference between two lognormal variates is leptokurtic. If those two variates are correlated, then the asymmetry variance will scale with size. Inert tissues typically exhibit additive error and have a gamma distribution. Although their asymmetry variance does not exhibit size-scaling, the distribution of the difference between two gamma variates is nevertheless leptokurtic. Measurement error is also additive, but has a normal distribution. Thus, the measurement of fluctuating asymmetry may involve the mixing of additive and multiplicative error. When errors are multiplicative, we recommend computing log E(l) − log E(r), the difference between the logarithms of the expected values of left and right sides, even when size-scaling is not obvious. If l and r are lognormally distributed, and measurement error is nil, the resulting distribution will be normal, and multiplicative error will not confound size-related changes in asymmetry. When errors are additive, such a transformation to remove size-scaling is unnecessary. Nevertheless, the distribution of l − r may still be leptokurtic.
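A short simulation, under assumed parameter values, illustrates the central claims: the difference of two correlated log-normal variates is leptokurtic, while taking logarithms first removes the multiplicative effect.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Correlated log-normal left and right sides (multiplicative growth error).
z = rng.multivariate_normal([0.0, 0.0], [[0.25, 0.20], [0.20, 0.25]], size=100000)
l, r = np.exp(3.0 + z[:, 0]), np.exp(3.0 + z[:, 1])

d = l - r
print("excess kurtosis of l - r:", stats.kurtosis(d))   # > 0: leptokurtic
# The log transformation removes the multiplicative scale effect:
print("excess kurtosis of log l - log r:", stats.kurtosis(np.log(l) - np.log(r)))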
2015-01-01
Among co-occurring species, values for functionally important plant traits span orders of magnitude, are uni-modal, and generally positively skewed. Such data are usually log-transformed “for normality” but no convincing mechanistic explanation for a log-normal expectation exists. Here we propose a hypothesis for the distribution of seed masses based on generalised extreme value distributions (GEVs), a class of probability distributions used in climatology to characterise the impact of event magnitudes and frequencies; events that impose strong directional selection on biological traits. In tests involving datasets from 34 locations across the globe, GEVs described log10 seed mass distributions as well or better than conventional normalising statistics in 79% of cases, and revealed a systematic tendency for an overabundance of small seed sizes associated with low latitudes. GEVs characterise disturbance events experienced in a location to which individual species’ life histories could respond, providing a natural, biological explanation for trait expression that is lacking from all previous hypotheses attempting to describe trait distributions in multispecies assemblages. We suggest that GEVs could provide a mechanistic explanation for plant trait distributions and potentially link biology and climatology under a single paradigm. PMID:25830773
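A sketch of the model comparison described above, assuming synthetic log10 seed masses: both a GEV and a normal distribution are fitted by maximum likelihood and compared by AIC (scipy's genextreme parameterizes the GEV shape as c).

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Hypothetical log10 seed masses (mg) for one site.
log_mass = stats.genextreme.rvs(c=-0.2, loc=-0.5, scale=0.8, size=400,
                                random_state=rng)

gev = stats.genextreme.fit(log_mass)
gauss = stats.norm.fit(log_mass)
aic = lambda logl, k: 2 * k - 2 * logl
print("GEV  AIC:", aic(stats.genextreme.logpdf(log_mass, *gev).sum(), 3))
print("Norm AIC:", aic(stats.norm.logpdf(log_mass, *gauss).sum(), 2))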
Statistical analysis of variability properties of the Kepler blazar W2R 1926+42
NASA Astrophysics Data System (ADS)
Li, Yutong; Hu, Shaoming; Wiita, Paul J.; Gupta, Alok C.
2018-04-01
We analyzed Kepler light curves of the blazar W2R 1926+42 that provided nearly continuous coverage from quarter 11 through quarter 17 (589 days between 2011 and 2013) and examined some of their flux variability properties. We investigate the possibility that the light curve is dominated by a large number of individual flares and adopt exponential rise and decay models to investigate the symmetry properties of flares. We found that the variations of W2R 1926+42 are predominantly asymmetric with weak tendencies toward positive asymmetry (rapid rise and slow decay). The durations (D) and the amplitudes (F0) of flares can be fit with log-normal distributions. The energy (E) of each flare is also estimated for the first time. There are positive correlations between log D and log E with a slope of 1.36, and between log F0 and log E with a slope of 1.12. Lomb-Scargle periodograms are used to estimate the power spectral density (PSD) shape. It is well described by a power law with an index ranging between -1.1 and -1.5. The sizes of the emission regions, R, are estimated to be in the range 1.1 × 10^15 cm to 6.6 × 10^16 cm. The flare asymmetry is difficult to explain by a light travel time effect but may be caused by differences between the timescales for acceleration and dissipation of high-energy particles in the relativistic jet. A jet-in-jet model also could produce the observed log-normal distributions.
On the null distribution of Bayes factors in linear regression
USDA-ARS?s Scientific Manuscript database
We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...
Mechanism-based model for tumor drug resistance.
Kuczek, T; Chan, T C
1992-01-01
The development of tumor resistance to cytotoxic agents has important implications in the treatment of cancer. If supported by experimental data, mathematical models of resistance can provide useful information on the underlying mechanisms and aid in the design of therapeutic regimens. We report on the development of a model of tumor-growth kinetics based on the assumption that the rates of cell growth in a tumor are normally distributed. We further assumed that the growth rate of each cell is proportional to its rate of total pyrimidine synthesis (de novo plus salvage). Using an ovarian carcinoma cell line (2008) and resistant variants selected by chronic exposure to a pyrimidine antimetabolite, N-phosphonacetyl-L-aspartate (PALA), we derived a simple and specific analytical form describing the growth curves generated in 72 h growth assays. The model assumes that the rate of de novo pyrimidine synthesis, denoted α, is shifted down by an amount proportional to the log10 PALA concentration and that cells whose rate of pyrimidine synthesis falls below a critical level, denoted α0, can no longer grow. This is described by the equation: Probability(growth) = Probability(α0 < α − constant × log10[PALA]). This model predicts that when growth curves are plotted on probit paper, they will produce straight lines. This prediction is in agreement with the data we obtained for the 2008 cells. Another prediction of this model is that the same probit plots for the resistant variants should shift to the right in a parallel fashion. Probit plots of the dose-response data obtained for each resistant 2008 line following chronic exposure to PALA again confirmed this prediction. Correlation of the rightward shift of dose responses to uridine transport (r = 0.99) also suggests that salvage metabolism plays a key role in tumor-cell resistance to PALA. Furthermore, the slope of the regression lines enables the detection of synergy such as that observed between dipyridamole and PALA. Although the rate-normal model was used to study the rate of salvage metabolism in PALA resistance in the present study, it may be widely applicable to the modeling of other resistance mechanisms such as gene amplification of target enzymes.
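The probit-plot prediction is easy to reproduce in outline. With hypothetical dose-response fractions, the probit transform of the fraction of growing cells should be linear in log10 dose, and resistance should shift the line rightward in parallel:

import numpy as np
from scipy import stats

# Hypothetical 72 h growth-assay data: fraction of control growth vs PALA dose.
log_dose = np.log10(np.array([3.0, 10.0, 30.0, 100.0, 300.0]))
frac_growing = np.array([0.93, 0.78, 0.50, 0.22, 0.06])

# Probit transform: if P(growth) = Phi(a - b*log10 dose), probits fall on a line.
probits = stats.norm.ppf(frac_growing)
slope, intercept, r, _, _ = stats.linregress(log_dose, probits)
print(f"slope = {slope:.2f}, r = {r:.3f}")
# Under this model, a resistant line shifts the intercept (a parallel
# rightward shift of the probit line) without changing the slope.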
Persiani, Anna Maria; Maggi, Oriana
2013-01-01
Experimental fires, of both low and high intensity, were lit during summer 2000 and the following 2 y in the Castel Volturno Nature Reserve, southern Italy. Soil samples were collected Jul 2000-Jul 2002 to analyze the soil fungal community dynamics. Species abundance distribution patterns (geometric, logarithmic, log normal, broken-stick) were compared. We plotted datasets with information both on species richness and abundance for total, xerotolerant and heat-stimulated soil microfungi. The xerotolerant fungi conformed to a broken-stick model for both the low- and high intensity fires at 7 and 84 d after the fire; their distribution subsequently followed logarithmic models in the 2 y following the fire. The distribution of the heat-stimulated fungi changed from broken-stick to logarithmic models and eventually to a log-normal model during the post-fire recovery. Xerotolerant and, to a far greater extent, heat-stimulated soil fungi acquire an important functional role following soil water stress and/or fire disturbance; these disturbances let them occupy unsaturated habitats and become increasingly abundant over time.
Bridging stylized facts in finance and data non-stationarities
NASA Astrophysics Data System (ADS)
Camargo, Sabrina; Duarte Queirós, Sílvio M.; Anteneodo, Celia
2013-04-01
Employing a recent technique which allows the representation of nonstationary data by means of a juxtaposition of locally stationary patches of different length, we introduce a comprehensive analysis of the key observables in a financial market: the trading volume and the price fluctuations. From the segmentation procedure we are able to introduce a quantitative description of statistical features of these two quantities, which are often named stylized facts, namely the tails of the distributions of trading volume and price fluctuations, and a dynamics compatible with the U-shaped profile of the volume in a trading session and the slow decay of the autocorrelation function. The segmentation of the trading volume series provides evidence of slow evolution of the fluctuating parameters of each patch, pointing to the mixing scenario. Assuming that long-term features are the outcome of a statistical mixture of simple local forms, we test and compare different probability density functions to provide the long-term distribution of the trading volume, concluding that the log-normal gives the best agreement with the empirical distribution. Moreover, the segmentation results for the magnitude of price fluctuations are quite different from those for the trading volume, indicating that changes in the statistics of price fluctuations occur on a faster scale than in the case of trading volume.
Franco-Pedroso, Javier; Ramos, Daniel; Gonzalez-Rodriguez, Joaquin
2016-01-01
In forensic science, trace evidence found at a crime scene and on a suspect has to be evaluated from the measurements performed on it, usually in the form of multivariate data (for example, several chemical compounds or physical characteristics). In order to assess the strength of that evidence, the likelihood ratio framework is being increasingly adopted. Several methods have been derived in order to obtain likelihood ratios directly from univariate or multivariate data by modelling both the variation appearing between observations (or features) coming from the same source (within-source variation) and that appearing between observations coming from different sources (between-source variation). In the widely used multivariate kernel likelihood ratio, the within-source distribution is assumed to be normally distributed and constant among different sources, and the between-source variation is modelled through a kernel density function (KDF). In order to better fit the observed distribution of the between-source variation, this paper presents a different approach in which a Gaussian mixture model (GMM) is used instead of a KDF. As will be shown, this approach provides better-calibrated likelihood ratios as measured by the log-likelihood-ratio cost (Cllr) in experiments performed on freely available forensic datasets involving different types of trace evidence: inks, glass fragments and car paints. PMID:26901680
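A minimal sketch of the between-source modelling step, with hypothetical one-dimensional features: a Gaussian mixture is fitted to source means and supplies the between-source log density used in the likelihood-ratio computation (the within-source term is omitted here).

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
# Hypothetical between-source feature means (e.g., one chemical measurement
# per ink source); the true population here is bimodal.
means = np.concatenate([rng.normal(-2.0, 0.5, 150), rng.normal(1.5, 0.8, 150)])
X = means.reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
# Log density of the between-source distribution at a new trace measurement;
# this replaces the kernel density term in the likelihood-ratio formula.
print(gmm.score_samples(np.array([[0.3]])))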
NASA Astrophysics Data System (ADS)
Barbarino, M.; Warrens, M.; Bonasera, A.; Lattuada, D.; Bang, W.; Quevedo, H. J.; Consoli, F.; de Angelis, R.; Andreoli, P.; Kimura, S.; Dyer, G.; Bernstein, A. C.; Hagel, K.; Barbui, M.; Schmidt, K.; Gaul, E.; Donovan, M. E.; Natowitz, J. B.; Ditmire, T.
2016-08-01
In this work, we explore the possibility that the motion of the deuterium ions emitted from Coulomb cluster explosions is sufficiently disordered to resemble thermalization. We analyze the process of nuclear fusion reactions driven by laser-cluster interactions in experiments conducted at the Texas Petawatt laser facility using a mixture of D2+3He and CD4+3He cluster targets. When clusters explode by Coulomb repulsion, the emission of the energetic ions is “nearly” isotropic. In the framework of cluster Coulomb explosions, we analyze the energy distributions of the ions using a Maxwell-Boltzmann (MB) distribution, a shifted MB distribution (sMB), and the energy distribution derived from a log-normal (LN) size distribution of clusters. We show that the first two distributions reproduce well the experimentally measured ion energy distributions and the number of fusions from d-d and d-3He reactions. The LN distribution is a good representation of the ion kinetic energy distribution up to high momenta, where the noise becomes dominant, but overestimates both the neutron and the proton yields. If the parameters of the LN distribution are chosen to reproduce the fusion yields correctly, the experimentally measured high-energy ion spectrum is not well represented. We conclude that the ion kinetic energy distribution is highly disordered and practically indistinguishable from a thermalized one.
A comparative study of mixture cure models with covariate
NASA Astrophysics Data System (ADS)
Leng, Oh Yit; Khalid, Zarina Mohd
2017-05-01
In survival analysis, the survival time is assumed to follow a non-negative distribution, such as the exponential, Weibull, or log-normal distribution. In some cases, the survival time is influenced by observed factors. The absence of these observed factors may cause inaccurate estimation of the survival function. Therefore, a survival model which incorporates the influences of observed factors is more appropriate in such cases. These observed factors are included in the survival model as covariates. Besides that, there are cases where a group of individuals is cured, that is, never experiences the event of interest. Ignoring the cure fraction may lead to overestimation when estimating the survival function. Thus, a mixture cure model is more suitable for modelling survival data with the presence of a cure fraction. In this study, three mixture cure survival models are used to analyse survival data with a covariate and a cure fraction. The first model includes the covariate in the parameterization of the susceptible individuals' survival function, the second model allows the cure fraction to depend on the covariate, and the third model incorporates the covariate in both the cure fraction and the survival function of susceptible individuals. This study aims to compare the performance of these models via a simulation approach. Therefore, in this study, survival data with varying sample sizes and cure fractions are simulated, and the survival time is assumed to follow the Weibull distribution. The simulated data are then modelled using the three mixture cure survival models. The results show that the three mixture cure models are more appropriate for modelling survival data with the presence of a cure fraction and an observed factor.
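The standard mixture cure formulation has survival function S(t) = π + (1 − π)Su(t), where π is the cure fraction and Su the survival function of the susceptible group. A minimal simulation sketch in that spirit, with Weibull susceptible times and assumed parameter values:

import numpy as np

rng = np.random.default_rng(11)

def simulate_cure_data(n, cure_frac, shape=1.5, scale=2.0, censor_time=10.0):
    # Simulate survival data with a cured subgroup: S(t) = pi + (1 - pi)*Su(t),
    # where susceptible times are Weibull, as in the simulation study above.
    cured = rng.random(n) < cure_frac
    t = np.where(cured, np.inf, scale * rng.weibull(shape, n))
    observed = t <= censor_time            # cured subjects are always censored
    return np.minimum(t, censor_time), observed.astype(int)

times, events = simulate_cure_data(500, cure_frac=0.3)
print("event rate:", events.mean())        # bounded above by 1 - cure_frac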
Separate-channel analysis of two-channel microarrays: recovering inter-spot information.
Smyth, Gordon K; Altman, Naomi S
2013-05-26
Two-channel (or two-color) microarrays are cost-effective platforms for comparative analysis of gene expression. They are traditionally analysed in terms of the log-ratios (M-values) of the two channel intensities at each spot, but this analysis does not use all the information available in the separate channel observations. Mixed models have been proposed to analyse intensities from the two channels as separate observations, but such models can be complex to use and the gain in efficiency over the log-ratio analysis is difficult to quantify. Mixed models yield test statistics for which the null distributions can be specified only approximately, and some approaches do not borrow strength between genes. This article reformulates the mixed model to clarify the relationship with the traditional log-ratio analysis, to facilitate information borrowing between genes, and to obtain an exact distributional theory for the resulting test statistics. The mixed model is transformed to operate on the M-values and A-values (average log-expression for each spot) instead of on the log-expression values. The log-ratio analysis is shown to ignore information contained in the A-values. The relative efficiency of the log-ratio analysis is shown to depend on the size of the intra-spot correlation. A new separate channel analysis method is proposed that assumes a constant intra-spot correlation coefficient across all genes. This approach permits the mixed model to be transformed into an ordinary linear model, allowing the data analysis to use a well-understood empirical Bayes analysis pipeline for linear modeling of microarray data. This yields statistically powerful test statistics that have an exact distributional theory. The log-ratio, mixed model and common correlation methods are compared using three case studies. The results show that separate channel analyses that borrow strength between genes are more powerful than log-ratio analyses. The common correlation analysis is the most powerful of all. The common correlation method proposed in this article for separate-channel analysis of two-channel microarray data is no more difficult to apply in practice than the traditional log-ratio analysis. It provides an intuitive and powerful means to conduct analyses and make comparisons that might otherwise not be possible.
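The M-value/A-value transformation itself is a one-liner; a sketch with hypothetical two-channel intensities:

import numpy as np

# Hypothetical red/green channel intensities for five spots.
R = np.array([1200.0, 450.0, 8000.0, 300.0, 1500.0])
G = np.array([1000.0, 500.0, 4000.0, 320.0, 1450.0])

M = np.log2(R) - np.log2(G)           # within-spot log-ratio
A = 0.5 * (np.log2(R) + np.log2(G))   # average log-expression per spot
print(np.column_stack([M, A]))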
NASA Technical Reports Server (NTRS)
Podwysocki, M. H.
1976-01-01
A study was made of the field size distributions for LACIE test sites 5029, 5033, and 5039, People's Republic of China. Field lengths and widths were measured from LANDSAT imagery, and field area was statistically modeled. Field size parameters have log-normal or Poisson frequency distributions. These were normalized to the Gaussian distribution and theoretical population curves were made. When compared to fields in other areas of the same country measured in the previous study, field lengths and widths in the three LACIE test sites were 2 to 3 times smaller and areas were smaller by an order of magnitude.
Limpert, Eckhard; Stahel, Werner A.
2011-01-01
Background The Gaussian or normal distribution is the most established model to characterize quantitative variation of original data. Accordingly, data are summarized using the arithmetic mean and the standard deviation, by mean ± SD, or with the standard error of the mean, ± SEM. This, together with corresponding bars in graphical displays, has become the standard to characterize variation. Methodology/Principal Findings Here we question the adequacy of this characterization, and of the model. The published literature provides numerous examples for which such descriptions appear inappropriate because, based on the “95% range check”, their distributions are obviously skewed. In these cases, the symmetric characterization is a poor description and may trigger wrong conclusions. To solve the problem, it is enlightening to regard causes of variation. Multiplicative causes are in general far more important than additive ones, and benefit from a multiplicative (or log-) normal approach. Fortunately, quite similar to the normal, the log-normal distribution can now be handled easily and characterized at the level of the original data with the help of a new sign, ×/ (times-divide), and corresponding notation. Analogous to mean ± SD, it connects the multiplicative (or geometric) mean x̄* and the multiplicative standard deviation s* in the form x̄* ×/ s*, which is advantageous and recommended. Conclusions/Significance The corresponding shift from the symmetric to the asymmetric view will substantially increase both recognition of data distributions and interpretation quality. It will allow for savings in sample size that can be considerable. Moreover, this is in line with ethical responsibility. Adequate models will improve concepts and theories, and provide deeper insight into science and life. PMID:21779325
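A minimal sketch of the recommended summary, with hypothetical skewed data: the geometric mean x̄* and multiplicative standard deviation s* are computed on the log scale, and the interval analogous to mean ± SD is [x̄*/s*, x̄*·s*].

import numpy as np

x = np.array([1.2, 2.3, 3.1, 4.8, 7.5, 12.0])  # hypothetical skewed data

logs = np.log(x)
gm = np.exp(logs.mean())             # multiplicative (geometric) mean x-bar*
s_star = np.exp(logs.std(ddof=1))    # multiplicative standard deviation s*
print(f"x-bar* = {gm:.2f}, s* = {s_star:.2f}, "
      f"68% range = [{gm / s_star:.2f}, {gm * s_star:.2f}]")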
Quantifying lead-time bias in risk factor studies of cancer through simulation.
Jansen, Rick J; Alexander, Bruce H; Anderson, Kristin E; Church, Timothy R
2013-11-01
Lead-time is inherent in early detection and creates bias in observational studies of screening efficacy, but its potential to bias effect estimates in risk factor studies is not always recognized. We describe a form of this bias that conventional analyses cannot address and develop a model to quantify it. Surveillance Epidemiology and End Results (SEER) data form the basis for estimates of age-specific preclinical incidence, and log-normal distributions describe the preclinical duration distribution. Simulations assume a joint null hypothesis of no effect of either the risk factor or screening on the preclinical incidence of cancer, and then quantify the bias as the risk-factor odds ratio (OR) from this null study. This bias can be used as a factor to adjust observed OR in the actual study. For this particular study design, as average preclinical duration increased, the bias in the total-physical activity OR monotonically increased from 1% to 22% above the null, but the smoking OR monotonically decreased from 1% above the null to 5% below the null. The finding of nontrivial bias in fixed risk-factor effect estimates demonstrates the importance of quantitatively evaluating it in susceptible studies. Copyright © 2013 Elsevier Inc. All rights reserved.
Stochastic approach to the derivation of emission limits for wastewater treatment plants.
Stransky, D; Kabelkova, I; Bares, V
2009-01-01
A stochastic approach to the derivation of WWTP emission limits meeting probabilistically defined environmental quality standards (EQS) is presented. The stochastic model is based on the mixing equation, with input data defined by probability density distributions and solved by Monte Carlo simulations. The approach was tested on a study catchment for total phosphorus (P(tot)). The model assumes independence of the input variables, which was verified for the dry-weather situation. Discharges and P(tot) concentrations both in the study creek and in the WWTP effluent follow log-normal probability distributions. Variation coefficients of P(tot) concentrations differ considerably along the stream (c(v)=0.415-0.884). The selected value of the variation coefficient (c(v)=0.420) affects the derived mean value (C(mean)=0.13 mg/l) of the P(tot) EQS (C(90)=0.2 mg/l). Even after the supposed improvement of water quality upstream of the WWTP to the level of the P(tot) EQS, the WWTP emission limits calculated would be lower than the values achievable by the best available technology (BAT). Thus, minimum dilution ratios for the meaningful application of the combined approach to the derivation of P(tot) emission limits for Czech streams are discussed.
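A sketch of the Monte Carlo mixing-equation approach, with all flow and concentration parameters invented for illustration (only the P(tot) EQS value of 0.2 mg/l is taken from the abstract):

import numpy as np

rng = np.random.default_rng(4)
n = 100000

def lognormal_from_mean_cv(mean, cv, size):
    # Parameterize a log-normal by its arithmetic mean and coefficient of variation.
    sigma2 = np.log(1.0 + cv**2)
    mu = np.log(mean) - sigma2 / 2.0
    return rng.lognormal(mu, np.sqrt(sigma2), size)

# Hypothetical creek and effluent inputs for P_tot.
Q_r = lognormal_from_mean_cv(0.50, 0.60, n)   # m3/s, creek discharge
C_r = lognormal_from_mean_cv(0.10, 0.42, n)   # mg/l, upstream P_tot
Q_e = lognormal_from_mean_cv(0.05, 0.30, n)   # m3/s, effluent discharge
C_e = 1.0                                     # mg/l, candidate emission limit

# Mixing equation solved by Monte Carlo; test the 90th percentile against EQS.
C_mix = (Q_r * C_r + Q_e * C_e) / (Q_r + Q_e)
print("C90 downstream:", np.quantile(C_mix, 0.90), "mg/l (EQS: 0.2 mg/l)")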
NASA Astrophysics Data System (ADS)
Gernez, Pierre; Stramski, Dariusz; Darecki, Miroslaw
2011-07-01
Time series measurements of fluctuations in underwater downward irradiance, Ed, within the green spectral band (532 nm) show that the probability distribution of instantaneous irradiance varies greatly as a function of depth within the near-surface ocean under sunny conditions. Because of intense light flashes caused by surface wave focusing, the near-surface probability distributions are highly skewed to the right and are heavy tailed. The coefficients of skewness and excess kurtosis at depths smaller than 1 m can exceed 3 and 20, respectively. We tested several probability models, such as the lognormal, Gumbel, Fréchet, log-logistic, and Pareto, which are potentially suited to describe highly skewed heavy-tailed distributions. We found that the models cannot approximate with consistently good accuracy the high irradiance values within the right tail of the experimental distribution, where the probability of these values is less than 10%. This portion of the distribution corresponds approximately to light flashes with Ed > 1.5 Ēd, where Ēd is the time-averaged downward irradiance. However, the remaining part of the probability distribution, covering all irradiance values smaller than the 90th percentile, can be described with reasonable accuracy (i.e., within 20%) with a lognormal model for all 86 measurements from the top 10 m of the ocean included in this analysis. As the intensity of irradiance fluctuations decreases with depth, the probability distribution tends toward a function symmetrical around the mean, like the normal distribution. For the examined data set, the skewness and excess kurtosis assumed values very close to zero at a depth of about 10 m.
Lima, Robson B DE; Bufalino, Lina; Alves, Francisco T; Silva, José A A DA; Ferreira, Rinaldo L C
2017-01-01
Currently, there is a lack of studies on the correct utilization of continuous distributions for dry tropical forests. Therefore, this work aims to investigate the diameter structure of a Brazilian tropical dry forest and to select suitable continuous distributions by means of statistical tools for the stand and the main species. Two subsets were randomly selected from 40 plots. Diameter at base height was obtained. The following functions were tested: log-normal, gamma, Weibull 2P and Burr. The best fits were selected by Akaike's information criterion. Overall, the diameter distribution of the dry tropical forest was better described by negative exponential curves and positive skewness. The forest studied showed diameter distributions with decreasing probability for larger trees. This behavior was observed for both the main species and the stand. The generalization of the function fitted for the main species shows that the development of individual models is needed. The Burr function showed good flexibility to describe the diameter structure of the stand and the behavior of the Mimosa ophthalmocentra and Bauhinia cheilantha species. For Poincianella bracteosa, Aspidosperma pyrifolium and Myracrodum urundeuva, better fitting was obtained with the log-normal function.
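The fit-and-select procedure can be sketched with scipy, using synthetic diameters and the Burr XII form (burr12) for the Burr function; whether the authors used Burr III or XII is not stated, so that choice is an assumption:

import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
# Hypothetical diameters at base height (cm): many small trees, few large
# ones (the reverse-J shape typical of uneven-aged stands).
d = stats.lognorm.rvs(s=0.7, scale=8.0, size=600, random_state=rng)

candidates = {
    "log-normal": stats.lognorm,
    "gamma": stats.gamma,
    "Weibull 2P": stats.weibull_min,
    "Burr": stats.burr12,
}
for name, dist in candidates.items():
    params = dist.fit(d, floc=0)          # fix the location at zero
    logl = dist.logpdf(d, *params).sum()
    k = len(params) - 1                   # loc fixed, not estimated
    print(f"{name:11s} AIC = {2 * k - 2 * logl:.1f}")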
NASA Astrophysics Data System (ADS)
Maccone, C.
This paper provides the statistical generalization of the Fermi paradox. The statistics of habitable planets may be based on a set of ten (and possibly more) astrobiological requirements first pointed out by Stephen H. Dole in his book Habitable planets for man (1964). The statistical generalization of the original and by now too simplistic Dole equation is provided by replacing a product of ten positive numbers by the product of ten positive random variables. This is denoted the SEH, an acronym standing for “Statistical Equation for Habitables”. The proof in this paper is based on the Central Limit Theorem (CLT) of statistics, stating that the sum of any number of independent random variables, each of which may be ARBITRARILY distributed, approaches a Gaussian (i.e. normal) random variable (Lyapunov form of the CLT). It is then shown that: 1. The new random variable NHab, yielding the number of habitables (i.e. habitable planets) in the Galaxy, follows the log-normal distribution. By construction, the mean value of this log-normal distribution is the total number of habitable planets as given by the statistical Dole equation. 2. The ten (or more) astrobiological factors are now positive random variables. The probability distribution of each random variable may be arbitrary. The CLT in the so-called Lyapunov or Lindeberg forms (neither of which assumes the factors to be identically distributed) allows for that. In other words, the CLT "translates" into the SEH by allowing an arbitrary probability distribution for each factor. This is both astrobiologically realistic and useful for any further investigations. 3. By applying the SEH it is shown that the (average) distance between any two nearby habitable planets in the Galaxy is inversely proportional to the cubic root of NHab. This distance is denoted by the new random variable D. The relevant probability density function is derived, which was named the "Maccone distribution" by Paul Davies in 2008. 4. A practical example is then given of how the SEH works numerically. Each of the ten random variables is uniformly distributed around its own mean value as given by Dole (1964), and a standard deviation of 10% is assumed. The conclusion is that the average number of habitable planets in the Galaxy should be around 100 million ±200 million, and the average distance between any two nearby habitable planets should be about 88 light years ±40 light years. 5. The SEH results are matched against the results of the Statistical Drake Equation from reference 4. As expected, the number of currently communicating ET civilizations in the Galaxy turns out to be much smaller than the number of habitable planets (about 10,000 against 100 million, i.e. one ET civilization out of 10,000 habitable planets). The average distance between any two nearby habitable planets is much smaller than the average distance between any two neighbouring ET civilizations: 88 light years vs. 2000 light years, respectively. This means an ET average distance about 20 times larger than the average distance between any pair of adjacent habitable planets. 6. Finally, a statistical model of the Fermi paradox is derived by applying the above results to the coral expansion model of Galactic colonization. The symbolic manipulator "Macsyma" is used to solve these difficult equations.
A new random variable Tcol, representing the time needed to colonize a new planet, is introduced; it follows the lognormal distribution. The quotient random variable Tcol/D is then studied and its probability density function is derived by Macsyma. Finally, a linear transformation of random variables yields the overall time TGalaxy needed to colonize the whole Galaxy. We believe that our mathematical work in deriving this STATISTICAL Fermi paradox is highly innovative and fruitful for the future.
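The core mathematical point, that a product of many independent positive factors is approximately log-normal by the CLT applied to its logarithm, can be checked with a short simulation; the factor means below are invented for illustration:

import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
n_draws, n_factors = 100000, 10

# Ten positive factors, each uniform around a hypothetical mean value.
means = np.array([0.8, 0.5, 0.9, 0.3, 0.6, 0.7, 0.4, 0.9, 0.5, 0.6])
factors = rng.uniform(means * 0.7, means * 1.3, size=(n_draws, n_factors))

n_hab = factors.prod(axis=1)
logs = np.log(n_hab)   # sum of ten independent logs, ~Gaussian by the CLT
print("log-product: skew %.3f, excess kurtosis %.3f"
      % (stats.skew(logs), stats.kurtosis(logs)))     # both near zero
print("product:     skew %.3f, excess kurtosis %.3f"
      % (stats.skew(n_hab), stats.kurtosis(n_hab)))   # clearly right-skewed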
Beyond the power law: Uncovering stylized facts in interbank networks
NASA Astrophysics Data System (ADS)
Vandermarliere, Benjamin; Karas, Alexei; Ryckebusch, Jan; Schoors, Koen
2015-06-01
We use daily data on bilateral interbank exposures and monthly bank balance sheets to study network characteristics of the Russian interbank market over August 1998-October 2004. Specifically, we examine the distributions of (un)directed (un)weighted degree, nodal attributes (bank assets, capital and capital-to-assets ratio) and edge weights (loan size and counterparty exposure). We search for the theoretical distribution that fits the data best and report the "best" fit parameters. We observe that all studied distributions are heavy tailed. The fat tail typically contains 20% of the data and can be mostly described well by a truncated power law. Also the power law, stretched exponential and log-normal provide reasonably good fits to the tails of the data. In most cases, however, separating the bulk and tail parts of the data is hard, so we proceed to study the full range of the events. We find that the stretched exponential and the log-normal distributions fit the full range of the data best. These conclusions are robust to (1) whether we aggregate the data over a week, month, quarter or year; (2) whether we look at the "growth" versus "maturity" phases of interbank market development; and (3) with minor exceptions, whether we look at the "normal" versus "crisis" operation periods. In line with prior research, we find that the network topology changes greatly as the interbank market moves from a "normal" to a "crisis" operation period.
Mixed effect Poisson log-linear models for clinical and epidemiological sleep hypnogram data
Swihart, Bruce J.; Caffo, Brian S.; Crainiceanu, Ciprian; Punjabi, Naresh M.
2013-01-01
Bayesian Poisson log-linear multilevel models scalable to epidemiological studies are proposed to investigate population variability in sleep state transition rates. Hierarchical random effects are used to account for pairings of subjects and repeated measures within those subjects, as comparing diseased to non-diseased subjects while minimizing bias is of importance. Essentially, non-parametric piecewise constant hazards are estimated and smoothed, allowing for time-varying covariates and segment-of-the-night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence between Poisson regression with a log(time) offset and survival regression assuming exponentially distributed survival times. This re-derivation allows the synthesis of two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear models with GEE for transition counts. An example data set from the Sleep Heart Health Study is analyzed. Supplementary material includes the analyzed data set as well as the code for a reproducible analysis. PMID:22241689
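The algebraic equivalence invoked above can be verified numerically: a Poisson GLM for transition counts with a log(time-at-risk) offset recovers the exponential-survival maximum likelihood estimate, the number of events over total time at risk. A sketch with simulated data:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 300
t = rng.exponential(2.0, n)              # time at risk in a sleep state
lam_true = 0.7
d = rng.poisson(lam_true * t)            # transition counts

# Poisson log-linear model with a log(time) offset...
glm = sm.GLM(d, np.ones((n, 1)), family=sm.families.Poisson(),
             offset=np.log(t)).fit()
# ...recovers the exponential-survival MLE lambda-hat = sum(events)/sum(time):
print(np.exp(glm.params[0]), d.sum() / t.sum())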
NASA Astrophysics Data System (ADS)
Yamada, Yuhei; Yamazaki, Yoshihiro
2018-04-01
This study considered a stochastic model for cluster growth in a Markov process with cluster-size-dependent additive noise. According to this model, the probability distribution of the cluster size transiently becomes an exponential or a log-normal distribution depending on the initial condition of the growth. In this letter, a master equation is obtained for this model, and the derivation of the distributions is discussed.
ERIC Educational Resources Information Center
Kelava, Augustin; Nagengast, Benjamin
2012-01-01
Structural equation models with interaction and quadratic effects have become a standard tool for testing nonlinear hypotheses in the social sciences. Most of the current approaches assume normally distributed latent predictor variables. In this article, we present a Bayesian model for the estimation of latent nonlinear effects when the latent…
Including operational data in QMRA model: development and impact of model inputs.
Jaidi, Kenza; Barbeau, Benoit; Carrière, Annie; Desjardins, Raymond; Prévost, Michèle
2009-03-01
A Monte Carlo model, based on the Quantitative Microbial Risk Analysis approach (QMRA), has been developed to assess the relative risks of infection associated with the presence of Cryptosporidium and Giardia in drinking water. The impact of various approaches for modelling the initial parameters of the model on the final risk assessments is evaluated. The Monte Carlo simulations that we performed showed that the occurrence of parasites in raw water was best described by a mixed distribution: log-normal for concentrations > detection limit (DL), and a uniform distribution for concentrations < DL. The selection of process performance distributions for modelling the performance of treatment (filtration and ozonation) influences the estimated risks significantly. The mean annual risks for conventional treatment are: 1.97E-03 (removal credit adjusted by log parasite = log spores), 1.58E-05 (log parasite = 1.7 x log spores) or 9.33E-03 (regulatory credits based on the turbidity measurement in filtered water). Using full-scale validated SCADA data, the simplified calculation of CT performed at the plant was shown to largely underestimate the risk relative to a more detailed CT calculation, which takes into consideration the downtime and system failure events identified at the plant (1.46E-03 vs. 3.93E-02 for the mean risk).
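A skeletal version of such a QMRA Monte Carlo chain, with every numerical input invented for illustration: the source-water concentration follows the mixed form described above (log-normal above the DL, uniform below), and an exponential dose-response model is assumed.

import numpy as np

rng = np.random.default_rng(12)
n = 100000
DL = 0.1                                 # detection limit, oocysts/L (assumed)
frac_detect = 0.4                        # fraction of samples above DL (assumed)

# Mixed source-water distribution; with these log-normal parameters,
# essentially all of that component's mass lies above the DL.
above = rng.lognormal(np.log(0.5), 0.8, n)
below = rng.uniform(0.0, DL, n)
conc = np.where(rng.random(n) < frac_detect, above, below)

log_removal = rng.normal(3.0, 0.5, n)    # treatment performance distribution
dose = conc * 10 ** (-log_removal)       # per-litre daily exposure
risk = 1.0 - np.exp(-0.018 * dose)       # exponential dose-response (r assumed)
annual = 1.0 - (1.0 - risk) ** 365
print("mean annual risk:", annual.mean())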
Performance of statistical models to predict mental health and substance abuse cost.
Montez-Rath, Maria; Christiansen, Cindy L; Ettner, Susan L; Loveland, Susan; Rosen, Amy K
2006-10-26
Providers use risk-adjustment systems to help manage healthcare costs. Typically, ordinary least squares (OLS) models on either untransformed or log-transformed cost are used. We examine the predictive ability of several statistical models, demonstrate how model choice depends on the goal for the predictive model, and examine whether building models on samples of the data affects model choice. Our sample consisted of 525,620 Veterans Health Administration patients with mental health (MH) or substance abuse (SA) diagnoses who incurred costs during fiscal year 1999. We tested two models on a transformation of cost: a Log Normal model and a Square-root Normal model, and three generalized linear models on untransformed cost, defined by distributional assumption and link function: Normal with identity link (OLS); Gamma with log link; and Gamma with square-root link. Risk-adjusters included age, sex, and 12 MH/SA categories. To determine the best model among the entire dataset, predictive ability was evaluated using root mean square error (RMSE), mean absolute prediction error (MAPE), and predictive ratios of predicted to observed cost (PR) among deciles of predicted cost, by comparing point estimates and 95% bias-corrected bootstrap confidence intervals. To study the effect of analyzing a random sample of the population on model choice, we re-computed these statistics using random samples beginning with 5,000 patients and ending with the entire sample. The Square-root Normal model had the lowest estimates of the RMSE and MAPE, with bootstrap confidence intervals that were always lower than those for the other models. The Gamma with square-root link was best as measured by the PRs. The choice of best model could vary if smaller samples were used and the Gamma with square-root link model had convergence problems with small samples. Models with square-root transformation or link fit the data best. This function (whether used as transformation or as a link) seems to help deal with the high comorbidity of this population by introducing a form of interaction. The Gamma distribution helps with the long tail of the distribution. However, the Normal distribution is suitable if the correct transformation of the outcome is used.
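Two of the compared specifications can be sketched with statsmodels: a Gamma GLM with a square-root link (via a Power(0.5) link) against OLS on square-root cost with predictions squared back, ignoring the retransformation bias adjustment. All data are simulated:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
n = 2000
age = rng.uniform(20, 80, n)
X = sm.add_constant((age - 50) / 10)
mu = (5.0 + 1.2 * X[:, 1]) ** 2                 # mean cost on the sqrt-link scale
cost = rng.gamma(shape=2.0, scale=mu / 2.0)     # skewed, heavy-tailed costs

# Gamma GLM with a square-root link (one of the better-performing models above).
gamma_sqrt = sm.GLM(cost, X, family=sm.families.Gamma(
    link=sm.families.links.Power(power=0.5))).fit()

# Square-root Normal comparison: OLS on sqrt(cost), predictions squared back.
ols_sqrt = sm.OLS(np.sqrt(cost), X).fit()
pred_glm = gamma_sqrt.predict(X)
pred_ols = ols_sqrt.predict(X) ** 2
for name, p in [("Gamma sqrt-link", pred_glm), ("Sqrt-Normal", pred_ols)]:
    print(name, "RMSE:", np.sqrt(np.mean((cost - p) ** 2)))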
Bayman, Emine O; Chaloner, Kathryn M; Hindman, Bradley J; Todd, Michael M
2013-01-16
To quantify the variability among centers, to identify centers whose performance is potentially outside of normal variability in the primary outcome, and to propose a guideline for declaring them outliers. Novel statistical methodology using a Bayesian hierarchical model is used. Bayesian methods for estimation and outlier detection are applied assuming an additive random center effect on the log odds of response: centers are similar but different (exchangeable). The Intraoperative Hypothermia for Aneurysm Surgery Trial (IHAST) is used as an example. Analyses were adjusted for treatment, age, gender, aneurysm location, World Federation of Neurological Surgeons scale, Fisher score and baseline NIH stroke scale scores. Adjustments for differences in center characteristics were also examined. Graphical and numerical summaries of the between-center standard deviation (sd) and variability, as well as the identification of potential outliers, are implemented. In the IHAST, the center-to-center variation in the log odds of favorable outcome at each center is consistent with a normal distribution with posterior sd of 0.538 (95% credible interval: 0.397 to 0.726) after adjusting for the effects of important covariates. Outcome differences among centers show no outlying centers. Four potential outlying centers were identified but did not meet the proposed guideline for declaring them outliers. Center characteristics (number of subjects enrolled from the center, geographical location, learning over time, nitrous oxide, and temporary clipping use) did not predict outcome, but subject and disease characteristics did. Bayesian hierarchical methods allow for determination of whether outcomes from a specific center differ from others and whether specific clinical practices predict outcome, even when some centers/subgroups have relatively small sample sizes. In the IHAST no outlying centers were found. The estimated variability between centers was moderately large.
The broad-band SEDs of four `hypervariable' AGN
NASA Astrophysics Data System (ADS)
Collinson, James S.; Ward, Martin J.; Lawrence, Andy; Bruce, Alastair; MacLeod, Chelsea L.; Elvis, Martin; Gezari, Suvi; Marshall, Philip J.; Done, Chris
2018-03-01
We present an optical-to-X-ray spectral analysis of four `hypervariable' AGN (HVAs) discovered by comparing Pan-STARRS data to that from the Sloan Digital Sky Survey over a 10 yr baseline (Lawrence et al.). There is some evidence that these objects are X-ray loud for their corresponding UV luminosities, but given that we measured them in a historic high state, it is not clear whether to take the high state or low state as typical of the properties of these HVAs. We estimate black hole masses based on Mg II and H α emission line profiles, and either the high- or low-state luminosities, finding mass ranges log (MBH/M⊙) = 8.2-8.8 and log (MBH/M⊙) = 7.9-8.3, respectively. We then fit energy-conserving models to the spectral energy distributions (SEDs), obtaining strong constraints on the bolometric luminosity and αOX. We compare the SED properties with a larger, X-ray selected AGN sample for both of these scenarios, and observe distinct groupings in spectral shape versus luminosity parameter space. In general, the SED properties are closer to normal if we assume that the low state is representative. This supports the idea that the large slow outbursts may be due to extrinsic effects (for example microlensing) as opposed to accretion rate changes, but a larger sample of HVAs is needed to be confident of this conclusion.
Income distribution dependence of poverty measure: A theoretical analysis
NASA Astrophysics Data System (ADS)
Chattopadhyay, Amit K.; Mallick, Sushanta K.
2007-04-01
Using a modified deprivation (or poverty) function, in this paper we theoretically study the changes in poverty with respect to the ‘global’ mean and variance of the income distribution using Indian survey data. We show that when the income obeys a log-normal distribution, a rising mean income generally indicates a reduction in poverty while an increase in the variance of the income distribution increases poverty. This altruistic view for a developing economy, however, is no longer tenable once the poverty index is found to follow a Pareto distribution. Here, although a rising mean income indicates a reduction in poverty, due to the presence of an inflexion point in the poverty function there is a critical value of the variance below which poverty decreases with increasing variance, while beyond this value poverty undergoes a steep increase followed by a decrease with respect to higher variance. Identifying this inflexion point as the poverty line, we show that the Pareto poverty function satisfies all three standard axioms of a poverty index [N.C. Kakwani, Econometrica 43 (1980) 437; A.K. Sen, Econometrica 44 (1976) 219] whereas the log-normal distribution falls short of this requisite. Following these results, we make quantitative predictions to correlate a developing with a developed economy.
Tahir, M Ramzan; Tran, Quang X; Nikulin, Mikhail S
2017-05-30
We studied the problem of testing a hypothesized distribution in survival regression models when the data are right censored and survival times are influenced by covariates. A modified chi-squared type test, known as the Nikulin-Rao-Robson statistic, is applied for the comparison of accelerated failure time models. This statistic is used to test the goodness-of-fit of the hypertabastic survival model and four other unimodal hazard rate functions. The results of the simulation study showed that the hypertabastic distribution can be used as an alternative to the log-logistic and log-normal distributions. In statistical modeling, because of its flexible hazard function shapes, this distribution can also be used as a competitor of the Birnbaum-Saunders and inverse Gaussian distributions. The results for the real data application are shown. Copyright © 2017 John Wiley & Sons, Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Muñoz-Jaramillo, Andrés; Windmueller, John C.; Amouzou, Ernest C.
2015-02-10
In this work, we take advantage of 11 different sunspot group, sunspot, and active region databases to characterize the area and flux distributions of photospheric magnetic structures. We find that, when taken separately, different databases are better fitted by different distributions (as has been reported previously in the literature). However, we find that all our databases can be reconciled by the simple application of a proportionality constant, and that, in reality, different databases are sampling different parts of a composite distribution. This composite distribution is made up by a linear combination of Weibull and log-normal distributions, where a pure Weibull (log-normal) characterizes the distribution of structures with fluxes below (above) 10^21 Mx (10^22 Mx). Additionally, we demonstrate that the Weibull distribution shows the expected linear behavior of a power-law distribution (when extended to smaller fluxes), making our results compatible with the results of Parnell et al. We propose that this is evidence of two separate mechanisms giving rise to visible structures on the photosphere: one directly connected to the global component of the dynamo (and the generation of bipolar active regions), and the other with the small-scale component of the dynamo (and the fragmentation of magnetic structures due to their interaction with turbulent convection).
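A sketch of the composite distribution described above, as a two-component linear combination with invented weight and shape parameters (only the 10^21/10^22 Mx crossover scales are taken from the abstract):

import numpy as np
from scipy import stats

w = 0.6                                    # mixing weight (assumed)
weib = stats.weibull_min(c=0.8, scale=3e20)
logn = stats.lognorm(s=1.0, scale=1e22)

flux = np.logspace(19, 23.5, 200)          # Mx
pdf = w * weib.pdf(flux) + (1 - w) * logn.pdf(flux)
# The Weibull term dominates the small-flux end, the log-normal the large end:
print(pdf[0], w * weib.pdf(flux[0]))       # nearly equal at 1e19 Mx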
Probabilistic structural analysis of a truss typical for space station
NASA Technical Reports Server (NTRS)
Pai, Shantaram S.
1990-01-01
A three-bay space cantilever truss is probabilistically evaluated using the computer code NESSUS (Numerical Evaluation of Stochastic Structures Under Stress) to identify and quantify the uncertainties, and the respective sensitivities associated with corresponding uncertainties in the primitive variables (structural, material, and loads parameters) that define the truss. The distribution of each of these primitive variables is described in terms of one of several available distributions, such as the Weibull, exponential, normal, log-normal, etc. The cumulative distribution functions (CDFs) for the response functions considered, and the sensitivities associated with the primitive variables for a given response, are investigated. These sensitivities help in determining the dominating primitive variables for that response.
Probabilistic properties of wavelets in kinetic surface roughening
NASA Astrophysics Data System (ADS)
Bershadskii, A.
2001-08-01
Using the data of a recent numerical simulation [M. Ahr and M. Biehl, Phys. Rev. E 62, 1773 (2000)] of homoepitaxial growth it is shown that the observed probability distribution of a wavelet based measure of the growing surface roughness is consistent with a stretched log-normal distribution and the corresponding branching dimension depends on the level of particle desorption.
NASA Astrophysics Data System (ADS)
Biteau, J.; Giebels, B.
2012-12-01
Very high energy gamma-ray variability of blazar emission remains of puzzling origin. Fast flux variations down to the minute time scale, as observed with H.E.S.S. during flares of the blazar PKS 2155-304, suggest that the variability originates from the jet, where Doppler boosting can be invoked to relax causal constraints on the size of the emission region. The observation of log-normality in the flux distributions should rule out additive processes, such as those resulting from uncorrelated multiple-zone emission models, and favour an origin of the variability in multiplicative processes not unlike those observed in a broad class of accreting systems. We show, using a simple kinematic model, that Doppler boosting of randomly oriented emitting regions generates flux distributions following a Pareto law, that the linear flux-r.m.s. relation found for a single zone holds for a large number of emitting regions, and that the skewed distribution of the total flux is close to a log-normal, despite arising from an additive process.
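The kinematic mechanism can be illustrated in a few lines: for emitting regions with isotropically distributed orientations, the Doppler factor delta = 1/[Gamma(1 - beta cos theta)] has density proportional to delta^-2 between its kinematic limits, so a boosted flux F proportional to delta^p inherits a Pareto-type power-law distribution. A simulation sketch with an assumed Lorentz factor:

import numpy as np

rng = np.random.default_rng(13)
gamma = 10.0                                 # bulk Lorentz factor (assumed)
beta = np.sqrt(1.0 - 1.0 / gamma**2)

# Randomly oriented emitting regions: cos(theta) uniform over the sphere.
cos_t = rng.uniform(-1.0, 1.0, 1000000)
delta = 1.0 / (gamma * (1.0 - beta * cos_t))  # Doppler factor per region

# Check the power-law density of delta on a log-log histogram.
hist, edges = np.histogram(delta, bins=np.logspace(-1, 1, 30), density=True)
centers = np.sqrt(edges[:-1] * edges[1:])
slope = np.polyfit(np.log(centers), np.log(hist), 1)[0]
print("log-log slope of p(delta):", slope)    # close to -2 (Pareto-type)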
Coverage dependent molecular assembly of anthraquinone on Au(111)
NASA Astrophysics Data System (ADS)
DeLoach, Andrew S.; Conrad, Brad R.; Einstein, T. L.; Dougherty, Daniel B.
2017-11-01
A scanning tunneling microscopy study of anthraquinone (AQ) on the Au(111) surface shows that the molecules self-assemble into several structures depending on the local surface coverage. At high coverages, a close-packed saturated monolayer is observed, while at low coverages, mobile surface molecules coexist with stable chiral hexamer clusters. At intermediate coverages, a disordered 2D porous network interlinking close-packed islands is observed in contrast to the giant honeycomb networks observed for the same molecule on Cu(111). This difference verifies the predicted extreme sensitivity [J. Wyrick et al., Nano Lett. 11, 2944 (2011)] of the pore network to small changes in the surface electronic structure. Quantitative analysis of the 2D pore network reveals that the areas of the vacancy islands are distributed log-normally. Log-normal distributions are typically associated with the product of random variables (multiplicative noise), and we propose that the distribution of pore sizes for AQ on Au(111) originates from random linear rate constants for molecules to either desorb from the surface or detach from the region of a nucleated pore.
Comparison of Two Methods Used to Model Shape Parameters of Pareto Distributions
Liu, C.; Charpentier, R.R.; Su, J.
2011-01-01
Two methods are compared for estimating the shape parameters of Pareto field-size (or pool-size) distributions for petroleum resource assessment. Both methods assume mature exploration in which most of the larger fields have been discovered. Both methods use the sizes of larger discovered fields to estimate the numbers and sizes of smaller fields: (1) the tail-truncated method uses a plot of field size versus size rank, and (2) the log-geometric method uses data binned in field-size classes and the ratios of adjacent bin counts. Simulation experiments were conducted using discovered oil and gas pool-size distributions from four petroleum systems in Alberta, Canada, and using Pareto distributions generated by Monte Carlo simulation. The estimates of the shape parameters of the Pareto distributions, calculated by both the tail-truncated and log-geometric methods, generally stabilize where discovered pool numbers are greater than 100. However, with fewer than 100 discoveries, these estimates can vary greatly with each new discovery. The estimated shape parameters of the tail-truncated method are more stable and larger than those of the log-geometric method where the number of discovered pools is more than 100. Both methods, however, tend to underestimate the shape parameter. Monte Carlo simulation was also used to create sequences of discovered pool sizes by sampling from a Pareto distribution with a discovery process model using a defined exploration efficiency, in order to show how biased the sampling was in favor of larger fields being discovered first. A higher (more biased) exploration efficiency gives better estimates of the Pareto shape parameters. © 2011 International Association for Mathematical Geosciences.
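For intuition, the sketch below implements a simplified rank-size estimate of the Pareto shape parameter from the largest discovered fields, in the spirit of the tail-truncated method; the published tail-truncated and log-geometric methods differ in detail, and all numbers are synthetic.

```python
import numpy as np

rng = np.random.default_rng(7)
alpha_true = 1.2                              # Pareto shape parameter
sizes = rng.pareto(alpha_true, 5000) + 1.0    # full population of field sizes

# Suppose only the largest 150 fields have been "discovered"
discovered = np.sort(sizes)[::-1][:150]
rank = np.arange(1, discovered.size + 1)

# Rank-size relation: log(rank) is roughly linear in log(size) with slope -alpha
slope, _ = np.polyfit(np.log(discovered), np.log(rank), 1)
print(f"estimated shape: {-slope:.2f} (true {alpha_true})")
```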
Predicting surface vibration from underground railways through inhomogeneous soil
NASA Astrophysics Data System (ADS)
Jones, Simon; Hunt, Hugh
2012-04-01
Noise and vibration from underground railways is a major source of disturbance to inhabitants near subways. To help designers meet noise and vibration limits, numerical models are used to understand vibration propagation from these underground railways. However, the models commonly assume the ground is homogeneous and neglect local variability in the soil properties. Such simplifying assumptions add a level of uncertainty to the predictions which is not well understood. The goal of the current paper is to quantify the effect of soil inhomogeneity on surface vibration. The thin-layer method (TLM) is suggested as an efficient and accurate means of simulating vibration from underground railways in arbitrarily layered half-spaces. Stochastic variability of the soil's elastic modulus is introduced using a Karhunen-Loève (K-L) expansion; the modulus is assumed to have a log-normal distribution and a modified exponential covariance kernel. The effect of horizontal soil variability is investigated by comparing the stochastic results for soils varied only in the vertical direction to soils with 2D variability. Results suggest that local soil inhomogeneity can significantly affect surface velocity predictions; the computed 90 percent confidence intervals average 8 dB in width, with peak values up to 12 dB. This is a significant source of uncertainty and should be considered when using predictions from models assuming homogeneous soil properties. Furthermore, the effect of horizontal variability of the elastic modulus on the confidence interval appears to be negligible. This suggests that only vertical variation needs to be taken into account when modelling ground vibration from underground railways.
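A discrete K-L expansion is just an eigendecomposition of the covariance matrix of the log-modulus. The sketch below samples a 1D log-normal modulus profile using a plain exponential kernel (the paper uses a modified exponential kernel); the correlation length and log-modulus statistics are assumed for illustration.

```python
import numpy as np

# Discretize a soil column and build an exponential covariance for the log-modulus
z = np.linspace(0.0, 20.0, 200)             # depth, m
corr_len = 3.0                              # assumed correlation length, m
sigma_logE, mu_logE = 0.3, np.log(50e6)     # assumed log-modulus statistics
C = sigma_logE**2 * np.exp(-np.abs(z[:, None] - z[None, :]) / corr_len)

# K-L expansion = eigendecomposition of the covariance, truncated to m modes
w, v = np.linalg.eigh(C)
w, v = w[::-1], v[:, ::-1]                  # sort eigenpairs descending
m = 25                                      # number of retained modes

rng = np.random.default_rng(3)
xi = rng.standard_normal(m)
log_E = mu_logE + v[:, :m] @ (np.sqrt(w[:m]) * xi)
E = np.exp(log_E)                           # log-normally distributed modulus field
print("modulus range along the column: %.1f-%.1f MPa" % (E.min() / 1e6, E.max() / 1e6))
```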
Methane Leaks from Natural Gas Systems Follow Extreme Distributions.
Brandt, Adam R; Heath, Garvin A; Cooley, Daniel
2016-11-15
Future energy systems may rely on natural gas as a low-cost fuel to support variable renewable power. However, leaking natural gas causes climate damage because methane (CH4) has a high global warming potential. In this study, we use extreme-value theory to explore the distribution of natural gas leak sizes. By analyzing ∼15,000 measurements from 18 prior studies, we show that all available natural gas leakage data sets are statistically heavy-tailed, and that gas leaks are more extremely distributed than other natural and social phenomena. A unifying result is that the largest 5% of leaks typically contribute over 50% of the total leakage volume. While prior studies used log-normal model distributions, we show that log-normal functions poorly represent tail behavior. Our results suggest that published uncertainty ranges of CH4 emissions are too narrow, and that larger sample sizes are required in future studies to achieve targeted confidence intervals. Additionally, we find that cross-study aggregation of data sets to increase sample size is not recommended due to apparent deviation between sampled populations. Understanding the nature of leak distributions can improve emission estimates, better illustrate their uncertainty, allow prioritization of source categories, and improve sampling design. Also, these data can be used for more effective design of leak detection technologies.
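The headline statistic is easy to compute for any sample: sort, take the top 5%, and compare their sum to the total. The sketch below contrasts a log-normal population with a heavier-tailed Pareto population; all parameters are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 15_000

# Two candidate leak-size populations (illustrative parameter values)
lognormal = rng.lognormal(mean=0.0, sigma=1.5, size=n)
pareto_tail = rng.pareto(1.5, n) + 1.0      # heavier tail than the log-normal

def top_share(x, frac=0.05):
    """Fraction of total volume contributed by the largest `frac` of leaks."""
    x = np.sort(x)[::-1]
    k = int(frac * x.size)
    return x[:k].sum() / x.sum()

print("top-5%% share, log-normal : %.2f" % top_share(lognormal))
print("top-5%% share, heavy tail : %.2f" % top_share(pareto_tail))
```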
Predicting clicks of PubMed articles.
Mao, Yuqing; Lu, Zhiyong
2013-01-01
Predicting the popularity or access usage of an article has the potential to improve the quality of PubMed searches. We can model the click trend of each article as its access changes over time by mining the PubMed query logs, which contain the previous access history for all articles. In this article, we examine the access patterns produced by PubMed users over two years (July 2009 to July 2011). We explore the time series of accesses for each article in the query logs, model the trends with regression approaches, and subsequently use the models for prediction. We show that the click trends of PubMed articles are best fitted with a log-normal regression model. This model allows the number of accesses an article receives and the time since it first becomes available in PubMed to be related via quadratic and logistic functions, with the model parameters estimated via maximum likelihood. Our experiments predicting the number of accesses for an article based on its past usage demonstrate that the mean absolute error and mean absolute percentage error of our model are 4.0% and 8.1% lower, respectively, than those of the power-law regression model. The log-normal distribution is also shown to perform significantly better than a previous prediction method based on a human memory theory in cognitive science. This work warrants further investigation on the utility of such a log-normal regression approach towards improving information access in PubMed.
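A minimal version of such a log-normal regression is ordinary least squares on log-accesses against functions of log-time, since Gaussian maximum likelihood on the log scale coincides with OLS. The sketch below keeps only a quadratic log-time trend and omits the logistic component of the published model; data and coefficients are synthetic.

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic access counts: log-normal around a quadratic trend in log-time
t = np.arange(1, 105)                      # weeks since the article appeared
true_log = 4.0 - 0.3 * np.log(t) - 0.002 * np.log(t)**2
y = np.exp(true_log + 0.2 * rng.standard_normal(t.size))

# Log-normal regression: Gaussian ML on log(y) = OLS on a quadratic in log(t)
X = np.column_stack([np.ones_like(t, dtype=float), np.log(t), np.log(t)**2])
coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)

# Predict accesses one step ahead (median of the fitted log-normal)
t_new = 105.0
x_new = np.array([1.0, np.log(t_new), np.log(t_new)**2])
print("predicted accesses at t=105:", np.exp(x_new @ coef))
```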
A Bayesian model for time-to-event data with informative censoring
Kaciroti, Niko A.; Raghunathan, Trivellore E.; Taylor, Jeremy M. G.; Julius, Stevo
2012-01-01
Randomized trials with dropouts or censored data and discrete time-to-event type outcomes are frequently analyzed using the Kaplan-Meier or product limit (PL) estimation method. However, the PL method assumes that the censoring mechanism is noninformative, and when this assumption is violated, the inferences may not be valid. We propose an expanded PL method using a Bayesian framework to incorporate an informative censoring mechanism and perform sensitivity analysis on estimates of the cumulative incidence curves. The expanded method uses a model, which can be viewed as a pattern mixture model, where the odds of having an event during the follow-up interval (t_{k-1}, t_k], conditional on being at risk at t_{k-1}, differ across the patterns of missing data. The sensitivity parameters relate the odds of an event between subjects from a missing-data pattern and the observed subjects for each interval. The large number of sensitivity parameters is reduced by treating them as random, assumed to follow a log-normal distribution with prespecified mean and variance. We then vary the mean and variance to explore the sensitivity of inferences. The missing at random (MAR) mechanism is a special case of the expanded model, thus allowing exploration of the sensitivity of inferences to departures from the MAR assumption. The proposed approach is applied to data from the TRial Of Preventing HYpertension. PMID:22223746
Statistical distribution of building lot frontage: application for Tokyo downtown districts
NASA Astrophysics Data System (ADS)
Usui, Hiroyuki
2018-03-01
The frontage of a building lot is a determinant factor of the residential environment. The statistical distribution of building lot frontages shows how the perimeters of urban blocks are shared by building lots for a given density of buildings and roads. For practitioners in urban planning, this is indispensable for identifying potential districts that comprise a high percentage of building lots with narrow frontage after subdivision and for reconsidering the appropriate criteria for the density of buildings and roads as residential environment indices. In the literature, however, the statistical distribution of building lot frontages and the density of buildings and roads have not been fully researched. In this paper, based on an empirical study in the downtown districts of Tokyo, it is found that (1) a log-normal distribution fits the observed distribution of building lot frontages better than a gamma distribution, which is the model of the size distribution of Poisson Voronoi cells on closed curves; (2) the distribution of building lot frontages follows a log-normal distribution whose parameters are the gross building density, road density, average road width, the coefficient of variation of building lot frontage, and the ratio of the number of building lot frontages to the number of buildings; and (3) the coefficient of variation of building lot frontages and the ratio of the number of building lot frontages to the number of buildings are approximately 0.60 and 1.19, respectively.
Fatigue shifts and scatters heart rate variability in elite endurance athletes.
Schmitt, Laurent; Regnard, Jacques; Desmarets, Maxime; Mauny, Fréderic; Mourot, Laurent; Fouillot, Jean-Pierre; Coulmy, Nicolas; Millet, Grégoire
2013-01-01
This longitudinal study aimed at comparing heart rate variability (HRV) in elite athletes identified either in a 'fatigue' or a 'no-fatigue' state in 'real life' conditions. 57 elite Nordic skiers were surveyed over 4 years. R-R intervals were recorded supine (SU) and standing (ST). A fatigue state was identified with a validated questionnaire. A multilevel linear regression model was used to analyze relationships between heart rate (HR) and HRV descriptors [total spectral power (TP), power in the low (LF) and high frequency (HF) ranges expressed in ms^2 and normalized units (nu)] and the status without and with fatigue. Variables not distributed normally were transformed by taking their common logarithm (log10). 172 trials were identified as in a 'fatigue' and 891 as in a 'no-fatigue' state. All supine HR and HRV parameters (Beta±SE) were significantly different (P<0.0001) between 'fatigue' and 'no-fatigue': HRSU (+6.27±0.61 bpm), logTPSU (-0.36±0.04), logLFSU (-0.27±0.04), logHFSU (-0.46±0.05), logLF/HFSU (+0.19±0.03), HFSU(nu) (-9.55±1.33). Differences were also significant (P<0.0001) in standing: HRST (+8.83±0.89), logTPST (-0.28±0.03), logLFST (-0.29±0.03), logHFST (-0.32±0.04). Also, the intra-individual variance of HRV parameters was larger (P<0.05) in the 'fatigue' state (logTPSU: 0.26 vs. 0.07, logLFSU: 0.28 vs. 0.11, logHFSU: 0.32 vs. 0.08, logTPST: 0.13 vs. 0.07, logLFST: 0.16 vs. 0.07, logHFST: 0.25 vs. 0.14). HRV was significantly lower in 'fatigue' vs. 'no-fatigue' but accompanied by larger intra-individual variance of HRV parameters in 'fatigue'. The broader intra-individual variance of HRV parameters might encompass different changes from the no-fatigue state, possibly reflecting different fatigue-induced alterations of the HRV pattern.
Log-normal spray drop distribution...analyzed by two new computer programs
Gerald S. Walton
1968-01-01
Results of U.S. Forest Service research on chemical insecticides suggest that large drops are not as effective as small drops in carrying insecticides to target insects. Two new computer programs have been written to analyze size distribution properties of drops from spray nozzles. Coded in Fortran IV, the programs have been tested on both the CDC 6400 and the IBM 7094...
Radium-226 content of beverages
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kiefer, J.
Radium contents of commercially obtained beer, wine, milk and mineral waters were measured. All distributions were log-normal with the following geometric mean values: beer: 2.1 × 10^-2 Bq L^-1; wine: 3.4 × 10^-2 Bq L^-1; milk: 3 × 10^-3 Bq L^-1; normal mineral water: 4.3 × 10^-2 Bq L^-1; medical mineral water: 9.4 × 10^-2 Bq L^-1.
Investigation into the performance of different models for predicting stutter.
Bright, Jo-Anne; Curran, James M; Buckleton, John S
2013-07-01
In this paper we have examined five possible models for the behaviour of the stutter ratio, SR. These were two log-normal models, two gamma models, and a two-component normal mixture model. The two-component normal mixture model was chosen to allow different behaviours of the variance: at each locus SR was described with two distributions, both with the same mean. The distributions have different variances: one for the majority of the observations and a second for the less well-behaved ones. We apply each model to a set of known single-source Identifiler™, NGM SElect™ and PowerPlex® 21 DNA profiles to show the applicability of our findings to different data sets. SR values determined from the single-source profiles were compared to the calculated SR after application of the models. Model performance was tested by calculating the log-likelihoods and comparing the difference in the Akaike information criterion (AIC). The two-component normal mixture model systematically outperformed all others, despite the increase in the number of parameters. This model, as well as performing well statistically, has intuitive appeal for forensic biologists and could be implemented in an expert system with a continuous method for DNA interpretation. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
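A two-component normal mixture with a shared mean can be fitted with a short EM loop, after which AIC follows from the maximized log-likelihood. The sketch below does this on synthetic stutter-ratio data; it illustrates the model class, not the authors' implementation.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
# Synthetic stutter ratios: most observations tight, a minority overdispersed
sr = np.concatenate([rng.normal(0.08, 0.01, 900), rng.normal(0.08, 0.04, 100)])

# EM for a two-component normal mixture with a shared mean
w, mu, s1, s2 = 0.5, sr.mean(), sr.std(), 2 * sr.std()
for _ in range(200):
    p1 = w * norm.pdf(sr, mu, s1)
    p2 = (1 - w) * norm.pdf(sr, mu, s2)
    r = p1 / (p1 + p2)                      # responsibility of component 1
    w = r.mean()
    # Shared-mean update: precision-weighted average over both components
    mu = ((r * sr / s1**2 + (1 - r) * sr / s2**2).sum()
          / (r / s1**2 + (1 - r) / s2**2).sum())
    s1 = np.sqrt((r * (sr - mu)**2).sum() / r.sum())
    s2 = np.sqrt(((1 - r) * (sr - mu)**2).sum() / (1 - r).sum())

loglik = np.log(w * norm.pdf(sr, mu, s1) + (1 - w) * norm.pdf(sr, mu, s2)).sum()
aic = 2 * 4 - 2 * loglik                    # 4 free parameters: w, mu, s1, s2
print(f"w={w:.2f} mu={mu:.3f} s1={s1:.3f} s2={s2:.3f} AIC={aic:.1f}")
```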
PERFLUORINATED COMPOUNDS IN ARCHIVED HOUSE-DUST SAMPLES
Archived house-dust samples were analyzed for 13 perfluorinated compounds (PFCs). Results show that PFCs are found in house-dust samples, and the data are log-normally distributed. PFOS/PFOA were present in 94.6% and 96.4% of the samples respectively. Concentrations ranged fro...
Robust Bayesian Factor Analysis
ERIC Educational Resources Information Center
Hayashi, Kentaro; Yuan, Ke-Hai
2003-01-01
Bayesian factor analysis (BFA) assumes the normal distribution of the current sample conditional on the parameters. Practical data in social and behavioral sciences typically have significant skewness and kurtosis. If the normality assumption is not attainable, the posterior analysis will be inaccurate, although the BFA depends less on the current…
NASA Technical Reports Server (NTRS)
Leybold, H. A.
1971-01-01
Random numbers were generated with the aid of a digital computer and transformed such that the probability density function of a discrete random load history composed of these random numbers had one of the following non-Gaussian distributions: Poisson, binomial, log-normal, Weibull, and exponential. The resulting random load histories were analyzed to determine their peak statistics and were compared with cumulative peak maneuver-load distributions for fighter and transport aircraft in flight.
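The transformation step can be illustrated by inverse-CDF sampling: a single stream of uniform random numbers is mapped through the target quantile functions. The sketch below does this for three of the five distributions and applies a crude peak count; thresholds and parameters are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
u = rng.uniform(size=10_000)               # uniform random numbers

# Transform the same uniforms into several non-Gaussian load histories
loads = {
    "weibull":     stats.weibull_min(c=1.5, scale=1.0).ppf(u),
    "log-normal":  stats.lognorm(s=0.5, scale=1.0).ppf(u),
    "exponential": stats.expon(scale=1.0).ppf(u),
}

# Simple peak statistic: count local maxima above a threshold
for name, x in loads.items():
    peaks = (x[1:-1] > x[:-2]) & (x[1:-1] > x[2:]) & (x[1:-1] > 2.0)
    print(f"{name:12s}: {peaks.sum()} peaks above 2.0")
```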
Statistical Considerations of Data Processing in Giovanni Online Tool
NASA Technical Reports Server (NTRS)
Suhung, Shen; Leptoukh, G.; Acker, J.; Berrick, S.
2005-01-01
The GES DISC Interactive Online Visualization and Analysis Infrastructure (Giovanni) is a web-based interface for the rapid visualization and analysis of gridded data from a number of remote sensing instruments. The GES DISC currently employs several Giovanni instances to analyze various products, such as Ocean-Giovanni for ocean products from SeaWiFS and MODIS-Aqua, TOMS & OMI Giovanni for atmospheric chemical trace gases from TOMS and OMI, and MOVAS for aerosols from MODIS (http://giovanni.gsfc.nasa.gov). Foremost among the Giovanni statistical functions is data averaging. Two aspects of this function are addressed here. The first deals with the accuracy of averaging gridded mapped products vs. averaging from the ungridded Level 2 data. Some mapped products contain mean values only; others contain additional statistics, such as the number of pixels (NP) for each grid cell, standard deviation, etc. Since NP varies spatially and temporally, averaging with or without weighting by NP will give different results. In this paper, we address differences between various weighting algorithms for some datasets utilized in Giovanni. The second aspect is related to how different averaging methods affect data quality and interpretation for data with a non-normal distribution. The present study demonstrates results of different spatial averaging methods using gridded SeaWiFS Level 3 mapped monthly chlorophyll a data. Spatial averages were calculated using three different methods: arithmetic mean (AVG), geometric mean (GEO), and maximum likelihood estimator (MLE). Biogeochemical data, such as chlorophyll a, are usually considered to have a log-normal distribution. The study determined that differences between methods tend to increase with the size of the selected coastal area, with no significant differences in most open oceans. The GEO method consistently produces values lower than AVG and MLE. The AVG method produces values larger than MLE in some cases, but smaller in others. Further studies indicated that significant differences between the AVG and MLE methods occurred in coastal areas where data have large spatial variations and a log-bimodal distribution instead of a log-normal distribution.
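The three spatial averages differ only in how the log scale is handled. A minimal sketch, assuming perfectly log-normal synthetic chlorophyll values:

```python
import numpy as np

rng = np.random.default_rng(9)
# Synthetic chlorophyll-a values, log-normal as commonly assumed
chl = rng.lognormal(mean=np.log(0.2), sigma=1.0, size=5_000)

avg = chl.mean()                                  # arithmetic mean (AVG)
geo = np.exp(np.log(chl).mean())                  # geometric mean (GEO)
mu, sig = np.log(chl).mean(), np.log(chl).std()   # log-normal ML fit
mle = np.exp(mu + 0.5 * sig**2)                   # ML estimate of the mean (MLE)

print(f"AVG={avg:.3f}  GEO={geo:.3f}  MLE={mle:.3f}")
# GEO estimates the median and is systematically lowest; AVG and MLE both
# target the mean and agree when the data are truly log-normal.
```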
Schlain, Brian; Amaravadi, Lakshmi; Donley, Jean; Wickramasekera, Ananda; Bennett, Donald; Subramanyam, Meena
2010-01-31
In recent years there has been growing recognition of the impact of anti-drug or anti-therapeutic antibodies (ADAs, ATAs) on the pharmacokinetic and pharmacodynamic behavior of the drug, which ultimately affects drug exposure and activity. These anti-drug antibodies can also impact the safety of the therapeutic by inducing a range of reactions from hypersensitivity to neutralization of the activity of an endogenous protein. Assessments of immunogenicity, therefore, are critically dependent on the bioanalytical method used to test samples, in which positive versus negative reactivity is determined by a statistically derived cut point based on the distribution of drug-naive samples. For non-normally distributed data, a novel gamma-fitting method for obtaining assay cut points is presented. Non-normal immunogenicity data distributions, which tend to be unimodal and positively skewed, can often be modeled by 3-parameter gamma fits. Under a gamma regime, gamma-based cut points were found to be more accurate (closer to their targeted false positive rates) compared with normal or log-normal methods, and more precise (smaller standard errors of cut point estimators) compared with the nonparametric percentile method. Under a gamma regime, normal-theory methods for estimating cut points targeting a 5% false positive rate were found in computer simulation experiments to have, on average, false positive rates ranging from 6.2 to 8.3% (or positive biases between +1.2 and +3.3%), with bias decreasing with the magnitude of the gamma shape parameter. The log-normal fits tended, on average, to underestimate false positive rates, with negative biases as large as -2.3%, with absolute bias decreasing with the shape parameter. These results are consistent with the well-known fact that gamma distributions become less skewed and closer to a normal distribution as their shape parameters increase. Inflated false positive rates, especially in a screening assay, shift the emphasis to confirming test results in a subsequent test (confirmatory assay). On the other hand, deflated false positive rates in screening immunogenicity assays will not meet the minimum 5% false positive target proposed in the immunogenicity assay guidance white papers. Copyright 2009 Elsevier B.V. All rights reserved.
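Under the gamma approach, the cut point is simply the 95th percentile of a 3-parameter gamma fit. The sketch below compares it with the normal-theory cut point (mean + 1.645 SD) on synthetic drug-naive data; all parameter values are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
# Synthetic drug-naive screening responses: unimodal, positively skewed
signal = stats.gamma(a=4.0, loc=0.05, scale=0.02).rvs(500, random_state=rng)

# 3-parameter gamma fit (shape, location, scale) and 5%-false-positive cut point
a, loc, scale = stats.gamma.fit(signal)
cut_gamma = stats.gamma(a, loc, scale).ppf(0.95)

# Normal-theory cut point for comparison: mean + 1.645 * SD
cut_normal = signal.mean() + 1.645 * signal.std(ddof=1)

print(f"gamma cut point : {cut_gamma:.4f}")
print(f"normal cut point: {cut_normal:.4f}")
print("empirical FPR at gamma cut :", (signal > cut_gamma).mean())
print("empirical FPR at normal cut:", (signal > cut_normal).mean())
```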
Speech Enhancement Using Gaussian Scale Mixture Models
Hao, Jiucang; Lee, Te-Won; Sejnowski, Terrence J.
2011-01-01
This paper presents a novel probabilistic approach to speech enhancement. Instead of a deterministic logarithmic relationship, we assume a probabilistic relationship between the frequency coefficients and the log-spectra. The speech model in the log-spectral domain is a Gaussian mixture model (GMM). The frequency coefficients obey a zero-mean Gaussian whose covariance equals the exponential of the log-spectra. This results in a Gaussian scale mixture model (GSMM) for the speech signal in the frequency domain, since the log-spectra can be regarded as scaling factors. The probabilistic relation between frequency coefficients and log-spectra allows these to be treated as two random variables, both to be estimated from the noisy signals. Expectation-maximization (EM) was used to train the GSMM and Bayesian inference was used to compute the posterior signal distribution. Because exact inference of this full probabilistic model is computationally intractable, we developed two approaches to enhance the efficiency: the Laplace method and a variational approximation. The proposed methods were applied to enhance speech corrupted by Gaussian noise and speech-shaped noise (SSN). For both approximations, signals reconstructed from the estimated frequency coefficients provided a higher signal-to-noise ratio (SNR), and those reconstructed from the estimated log-spectra produced a lower word recognition error rate because the log-spectra fit the inputs to the recognizer better. Our algorithms effectively reduced the SSN, which algorithms based on spectral analysis were not able to suppress. PMID:21359139
Statistical study of the reliability of oxide-defined stripe cw lasers of (AlGa)As
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ettenberg, M.
1979-03-01
In this report, we describe a statistical study of the reliability of oxide-defined stripe-contact cw injection lasers of (AlGa)As. These devices have one facet coated with Al2O3 and one facet coated with an Al2O3/Si dichroic reflector; the lasers are optimized for cw low-threshold currents at room temperature, with values typically about 50 mA. Lifetests were carried out at 70 °C ambient, in the cw mode of operation with about 5 mW output. Previous lifetests showed that the degradation rate followed a 0.95-eV activation energy, so the 70 °C environment provides a degradation acceleration factor of 190 over that at room temperature. We have found that the device failures follow a log-normal distribution, characterized by a mean time before failure of 4200 h and a standard deviation of 1.3. This corresponds to a mean time to failure (MTTF) of 10^6 h at room temperature. Failure is defined here as the inability of the device to emit 1 mW of stimulated cw output at 70 °C, and assumes that optical feedback will be employed to adjust the laser current during operation. If a constant-current drive is envisioned, the failures for a 3-dB drop in light output also follow a log-normal distribution with a similar slope (standard deviation = 1.1) and a MTTF of 2000 h at 70 °C (500,000 h at room temperature). The failures were found to be mainly due to bulk gradual degradation and not facet or contact failure. Careful study of lasers before and after lifetest showed a significant increase in contact thermal resistance. However, this increase accounts for only a small portion of the nearly 70% increase in room-temperature cw threshold after failure at 70 °C. After failure at 70 °C, we also noted a degradation in the near-field and associated far-field pattern of the laser.
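The quoted acceleration factor and room-temperature MTTF follow from the Arrhenius relation. A minimal sketch, assuming a "room temperature" of about 22 °C (the report does not state the exact value used):

```python
import numpy as np

k_B = 8.617e-5                 # Boltzmann constant, eV/K
E_a = 0.95                     # activation energy from the lifetests, eV
T_stress = 70.0 + 273.15       # lifetest ambient, K
T_use = 22.0 + 273.15          # assumed room temperature, K

# Arrhenius acceleration factor between stress and use temperatures
af = np.exp(E_a / k_B * (1.0 / T_use - 1.0 / T_stress))
print(f"acceleration factor ~ {af:.0f}")    # close to the reported factor of ~190

# Scaling the stressed log-normal MTTF back to use conditions
mttf_stress_h = 4200.0
print(f"use-condition MTTF ~ {af * mttf_stress_h:.1e} h")   # on the order of the reported 1e6 h
```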
Percent area coverage through image analysis
NASA Astrophysics Data System (ADS)
Wong, Chung M.; Hong, Sung M.; Liu, De-Ling
2016-09-01
The notion of percent area coverage (PAC) has been used to characterize surface cleanliness levels in the spacecraft contamination control community. Due to the lack of detailed particle data, PAC has conventionally been calculated by multiplying the particle surface density in predetermined particle size bins by a set of coefficients per MIL-STD-1246C. In deriving the set of coefficients, the surface particle size distribution is assumed to follow a log-normal relation between particle density and particle size, while the cross-sectional area function is given as a combination of regular geometric shapes. For particles with irregular shapes, the cross-sectional area function cannot describe the true particle area and, therefore, may introduce error into the PAC calculation. Other errors may also be introduced by using the log-normal surface particle size distribution function, which depends strongly on the environmental cleanliness and cleaning process. In this paper, we present PAC measurements from silicon witness wafers that collected fallout from a fabric material after vibration testing. PAC calculations were performed through analysis of microscope images and compared to values derived through the MIL-STD-1246C method. Our results showed that the MIL-STD-1246C method does provide a reasonable upper bound to the PAC values determined through image analysis, in particular for PAC values below 0.1.
Code of Federal Regulations, 2010 CFR
2010-01-01
... STANDARDS: NORMAL CATEGORY ROTORCRAFT Strength Requirements Flight Loads § 27.321 General. (a) The flight load factor must be assumed to act normal to the longitudinal axis of the rotorcraft, and to be equal... from the design minimum weight to the design maximum weight; and (2) With any practical distribution of...
Kilian, Reinhold; Matschinger, Herbert; Löeffler, Walter; Roick, Christiane; Angermeyer, Matthias C
2002-03-01
Transformation of the dependent cost variable is often used to solve the problems of heteroscedasticity and skewness in linear ordinary least squares (OLS) regression of health service cost data. However, transformation may cause difficulties in the interpretation of regression coefficients and the retransformation of predicted values. The study compares the advantages and disadvantages of different methods to estimate regression-based cost functions using data on the annual costs of schizophrenia treatment. Annual costs of psychiatric service use and clinical and socio-demographic characteristics of the patients were assessed for a sample of 254 patients with a diagnosis of schizophrenia (ICD-10 F 20.0) living in Leipzig. The clinical characteristics of the participants were assessed by means of the BPRS 4.0, the GAF, and the CAN for service needs. Quality of life was measured by the WHOQOL-BREF. A linear OLS regression model with non-parametric standard errors, a log-transformed OLS model, and a generalized linear model (GLM) with a log link and a gamma distribution were used to estimate service costs. For the estimation of robust non-parametric standard errors, the variance estimator by White and a bootstrap estimator based on 2000 replications were employed. Models were evaluated by comparing the R2 and the root mean squared error (RMSE). The RMSE of the log-transformed OLS model was computed with three different methods of bias correction. The 95% confidence intervals for the differences between the RMSE were computed by means of bootstrapping. A split-sample cross-validation procedure was used to forecast the costs for one half of the sample on the basis of a regression equation computed for the other half of the sample. All three methods showed significant positive influences of psychiatric symptoms and met psychiatric service needs on service costs. Only the log-transformed OLS model showed a significant negative impact of age, and only the GLM showed significant negative influences of employment status and partnership on costs. All three models provided an R2 of about 0.31. The residuals of the linear OLS model revealed significant deviations from normality and homoscedasticity. The residuals of the log-transformed model were normally distributed but still heteroscedastic. The linear OLS model provided the lowest prediction error and the best forecast of the dependent cost variable. The log-transformed model provided the lowest RMSE if the heteroscedastic bias correction was used. The RMSE of the GLM with a log link and a gamma distribution was higher than those of the linear OLS model and the log-transformed OLS model. The difference between the RMSE of the linear OLS model and that of the log-transformed OLS model without bias correction was significant at the 95% level. In the cross-validation procedure, the linear OLS model provided the lowest RMSE, followed by the log-transformed OLS model with a heteroscedastic bias correction. The GLM again showed the weakest model fit. None of the differences between the RMSE resulting from the cross-validation procedure were found to be significant. The comparison of the fit indices of the different regression models revealed that the linear OLS model provided a better fit than the log-transformed model and the GLM, but the differences between the models' RMSE were not significant.
Due to the small number of cases in the study, the lack of significance does not sufficiently prove that the differences between the RMSE for the different models are zero, and the superiority of the linear OLS model cannot be generalized. The lack of significant differences among the alternative estimators may reflect a sample size inadequate to detect important differences among the estimators employed; further studies with larger case numbers are necessary to confirm the results. Specification of an adequate regression model requires a careful examination of the characteristics of the data. Estimation of standard errors and confidence intervals by nonparametric methods, which are robust against deviations from normality and from homoscedasticity of the residuals, is a suitable alternative to transformation of the skewed dependent cost variable.
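The three estimators compared in the study can be reproduced with standard tools; the sketch below uses synthetic skewed cost data, statsmodels, and Duan's smearing factor for retransforming the log-OLS predictions. Variable names and data are illustrative, not the Leipzig data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 254
symptoms = rng.normal(size=n)
# Synthetic annual costs: skewed, increasing with symptom severity
costs = np.exp(8.0 + 0.5 * symptoms + rng.normal(scale=0.9, size=n))

X = sm.add_constant(symptoms)

# 1) Linear OLS on raw costs
ols_raw = sm.OLS(costs, X).fit()

# 2) OLS on log(costs) with Duan's smearing factor for retransformation
ols_log = sm.OLS(np.log(costs), X).fit()
smear = np.exp(ols_log.resid).mean()
pred_log = np.exp(ols_log.fittedvalues) * smear

# 3) Gamma GLM with a log link
glm = sm.GLM(costs, X, family=sm.families.Gamma(link=sm.families.links.Log())).fit()

for name, pred in [("OLS", ols_raw.fittedvalues),
                   ("log-OLS+smearing", pred_log),
                   ("Gamma GLM", glm.fittedvalues)]:
    rmse = np.sqrt(np.mean((costs - pred) ** 2))
    print(f"{name:18s} RMSE = {rmse:,.0f}")
```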
Notes on power of normality tests of error terms in regression models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Střelec, Luboš
2015-03-10
Normality is one of the basic assumptions in applying statistical procedures. For example, in linear regression most of the inferential procedures are based on the assumption of normality, i.e., the disturbance vector is assumed to be normally distributed. Failure to assess non-normality of the error terms may lead to incorrect results from the usual statistical inference techniques such as the t-test or F-test. Thus, error terms should be normally distributed in order to allow us to make exact inferences. As a consequence, normally distributed stochastic errors are necessary in order to make inferences that are not misleading, which explains the necessity and importance of robust tests of normality. Therefore, the aim of this contribution is to discuss normality testing of error terms in regression models. In this contribution, we introduce the general RT class of robust tests for normality, and present and discuss the trade-off between power and robustness of selected classical and robust normality tests of error terms in regression models.
Analysis of Mount St. Helens ash from optical photoelectric photometry
NASA Technical Reports Server (NTRS)
Cardelli, J. A.; Ackerman, T. P.
1983-01-01
The optical properties of suspended dust particles from the eruption of Mt. St. Helens on July 23, 1980 are investigated using photoelectric observations of standard stars obtained on the 0.76-m telescope at the University of Washington 48 hours after the eruption. Measurements were made with five broad-band filters centered at 3910, 5085, 5480, 6330, and 8050 A on stars of varying color and over a wide range of air masses. Anomalous extinction effects due to the volcanic ash were detected, and a significant change in the wavelength-dependent extinction parameter during the course of the observations was established by statistical analysis. Mean particle size (a) and column density (N) are estimated using Mie theory, assuming a log-normal particle-size distribution: a = 0.18 micron throughout; N = 1.02 × 10^9 cm^-2 before 7:00 UT and 2.33 × 10^9 cm^-2 after 8:30 UT on July 25, 1980. The extinction is attributed to low-level, slowly migrating ash, possibly combined with products of gas-to-particle conversion and coagulation.
Cold Milky Way HI Gas in Filaments
NASA Astrophysics Data System (ADS)
Kalberla, P. M. W.; Kerp, J.; Haud, U.; Winkel, B.; Ben Bekhti, N.; Flöer, L.; Lenz, D.
2016-04-01
We investigate data from the Galactic Effelsberg-Bonn H I Survey, supplemented with data from the third release of the Galactic All Sky Survey (GASS III) observed at Parkes. We explore the all-sky distribution of the local Galactic H I gas with |v_LSR| < 25 km s^-1 on angular scales of 11'-16'. Unsharp masking is applied to extract small-scale features. We find cold filaments that are aligned with polarized dust emission and conclude that the cold neutral medium (CNM) is mostly organized in sheets that are, because of projection effects, observed as filaments. These filaments are associated with dust ridges, aligned with the magnetic field measured on the structures by Planck at 353 GHz. The CNM above latitudes |b| > 20° is described by a log-normal distribution, with a median Doppler temperature T_D = 223 K, derived from observed line widths that include turbulent contributions. The median neutral hydrogen (H I) column density is N_HI ≃ 10^19.1 cm^-2. These CNM structures are embedded within a warm neutral medium with N_HI ≃ 10^20 cm^-2. Assuming an average distance of 100 pc, we derive for the CNM sheets a thickness of ≲0.3 pc. Adopting a magnetic field strength of B_tot = (6.0 ± 1.8) μG, proposed by Heiles & Troland, and assuming that the CNM filaments are confined by magnetic pressure, we estimate a thickness of 0.09 pc. Correspondingly, the median volume density is in the range 14 ≲ n ≲ 47 cm^-3. The authors thank the Deutsche Forschungsgemeinschaft (DFG) for support under grant numbers KE757/11-1, KE757/7-3, KE757/7-2, KE757/7-1, and BE4823/1-1.
Scale-dependency of effective hydraulic conductivity on fire-affected hillslopes
NASA Astrophysics Data System (ADS)
Langhans, Christoph; Lane, Patrick N. J.; Nyman, Petter; Noske, Philip J.; Cawson, Jane G.; Oono, Akiko; Sheridan, Gary J.
2016-07-01
Effective hydraulic conductivity (Ke) for Hortonian overland flow modeling has been defined as a function of rainfall intensity and runon infiltration, assuming a distribution of saturated hydraulic conductivities (Ks). But the surface boundary condition during infiltration and its interactions with the distribution of Ks are not well represented in models. As a result, the mean value of the Ks distribution, which is the central parameter for Ke, varies between scales. Here we quantify this discrepancy with a large infiltration data set comprising four different methods and scales from fire-affected hillslopes in SE Australia, using a relatively simple yet widely used conceptual model of Ke. Ponded disk (0.002 m^2) and ring infiltrometers (0.07 m^2) were used at the small scales, and rainfall simulations (3 m^2) and small catchments (ca. 3000 m^2) at the larger scales. We compared the mean Ks between methods measured at the same time and place. Disk and ring infiltrometer measurements had on average 4.8 times higher values of mean Ks than rainfall simulations and catchment-scale estimates. Furthermore, the distribution of Ks was not clearly log-normal and scale-independent, as supposed in the conceptual model. In our interpretation, water repellency and preferential flow paths increase the variance of the measured distribution of Ks and bias ponding toward areas of very low Ks during rainfall simulations and small catchment runoff events, while areas with high preferential flow capacity remain water supply-limited more than the conceptual model of Ke predicts. The study highlights problems in the current theory of scaling runoff generation.
Virtual Volatility, an Elementary New Concept with Surprising Stock Market Consequences
NASA Astrophysics Data System (ADS)
Prange, Richard; Silva, A. Christian
2006-03-01
Textbook investors start by predicting the future price distribution, PDF, of a candidate stock (or portfolio) at horizon T, e.g. a year hence. A (log)normal PDF with center (= drift = expected return) μT and width (= volatility) σT is often assumed on Central Limit Theorem grounds, i.e. by a random walk of daily (log)price increments δs. The standard deviation, stdev, of historical (ex post) δs's is usually a fair predictor of the coming year's (ex ante) stdev(δs) = σ_daily, but the historical mean E(δs) at best roughly limits the true, to-be-predicted, drift by μ_true T ≈ μ_hist T ± σ_hist T. Textbooks take a PDF with σ ≈ σ_daily and μ as somehow known, as if accurate predictions of μ were possible. It is elementary and presumably new to argue that an average of PDFs over a range of μ values should be taken, e.g. an average over forecasts by different analysts. We estimate that this leads to a PDF with a 'virtual' volatility σ ≈ 1.3 σ_daily. It is indeed clear that uncertainty in the value of the expected gain parameter increases the risk of investment in that security by most measures; e.g., Sharpe's ratio μT/σT will be 30% smaller because of this effect. It is significant and surprising that there are investments which benefit from this 30% virtual increase in the volatility.
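The 1.3 factor is a plain variance decomposition: mixing normal PDFs over an uncertain drift adds the drift variance to the price variance. A minimal numeric check, with an assumed spread of drift forecasts of about 0.8 sigma:

```python
import numpy as np

rng = np.random.default_rng(12)
sigma = 0.20                   # assumed known annual volatility
drift_spread = 0.17            # assumed stdev of analyst drift forecasts

# Mixture over uncertain drift: draw mu per scenario, then a return given mu
mu = rng.normal(0.05, drift_spread, size=200_000)
returns = rng.normal(mu, sigma)

print(f"per-scenario volatility : {sigma:.3f}")
print(f"mixture volatility      : {returns.std():.3f}")
print(f"inflation factor        : {returns.std() / sigma:.2f}")
# Total variance = sigma^2 + drift_spread^2; a drift spread of ~0.8*sigma
# inflates the effective ("virtual") volatility by roughly 1.3x.
```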
Zheng, Xiliang; Wang, Jin
2015-01-01
We uncovered the universal statistical laws for the biomolecular recognition/binding process. We quantified the statistical energy landscapes for binding, from which we can characterize the distributions of the binding free energy (affinity), the equilibrium constants, the kinetics and the specificity by exploring different ligands binding with a particular receptor. The results of the analytical studies are confirmed by the microscopic flexible docking simulations. The distribution of binding affinity is Gaussian around the mean and becomes exponential near the tail. The equilibrium constants of the binding follow a log-normal distribution around the mean and a power-law distribution in the tail. The intrinsic specificity for biomolecular recognition measures the degree of discrimination of native versus non-native binding; its optimization amounts to maximizing the ratio of the free energy gap between the native state and the average of non-native states to the roughness, measured by the variance of the free energy landscape around its mean. The intrinsic specificity obeys a Gaussian distribution near the mean and an exponential distribution near the tail. Furthermore, the kinetics of binding follows a log-normal distribution near the mean and a power-law distribution at the tail. Our study provides new insights into the statistical nature of thermodynamics, kinetics and function from different ligands binding with a specific receptor or, equivalently, a specific ligand binding with different receptors. The elucidation of the distributions of the kinetics and free energy has guiding roles in studying biomolecular recognition and function through small-molecule evolution and chemical genetics. PMID:25885453
The Dependence of Prestellar Core Mass Distributions on the Structure of the Parental Cloud
NASA Astrophysics Data System (ADS)
Parravano, Antonio; Sánchez, Néstor; Alfaro, Emilio J.
2012-08-01
The mass distribution of prestellar cores is obtained for clouds with arbitrary internal mass distributions using a selection criterion based on the thermal and turbulent Jeans mass and applied hierarchically from small to large scales. We have checked this methodology by comparing our results for a log-normal density probability distribution function with the theoretical core mass function (CMF) derived by Hennebelle & Chabrier, namely a power law at large scales and a log-normal cutoff at low scales, but our method can be applied to any mass distributions representing a star-forming cloud. This methodology enables us to connect the parental cloud structure with the mass distribution of the cores and their spatial distribution, providing an efficient tool for investigating the physical properties of the molecular clouds that give rise to the prestellar core distributions observed. Simulated fractional Brownian motion (fBm) clouds with the Hurst exponent close to the value H = 1/3 give the best agreement with the theoretical CMF derived by Hennebelle & Chabrier and Chabrier's system initial mass function. Likewise, the spatial distribution of the cores derived from our methodology shows a surface density of companions compatible with those observed in the Trapezium and Ophiuchus star-forming regions. This method also allows us to analyze the properties of the mass distribution of cores for different realizations. We found that the variations in the number of cores formed in different realizations of fBm clouds (with the same Hurst exponent) are much larger than the expected root-N statistical fluctuations, increasing with H.
Karulin, Alexey Y.; Karacsony, Kinga; Zhang, Wenji; Targoni, Oleg S.; Moldovan, Ioana; Dittrich, Marcus; Sundararaman, Srividya; Lehmann, Paul V.
2015-01-01
Each positive well in ELISPOT assays contains spots of variable sizes that can range from tens of micrometers up to a millimeter in diameter. Therefore, when it comes to counting these spots, the decision on setting the lower and upper spot size thresholds to discriminate between non-specific background noise, spots produced by individual T cells, and spots formed by T cell clusters is critical. If the spot sizes follow a known statistical distribution, precise predictions of the minimal and maximal spot sizes belonging to a given T cell population can be made. We studied the size distributional properties of IFN-γ, IL-2, IL-4, IL-5 and IL-17 spots elicited in ELISPOT assays with PBMC from 172 healthy donors, upon stimulation with 32 individual viral peptides representing defined HLA Class I-restricted epitopes for CD8 cells, and with protein antigens of CMV and EBV activating CD4 cells. A total of 334 CD8 and 80 CD4 positive T cell responses were analyzed. In 99.7% of the test cases, the spot size distributions followed a log-normal function. These data formally demonstrate that it is possible to establish objective, statistically validated parameters for counting T cell ELISPOTs. PMID:25612115
Kay, Robert T.; Mills, Patrick C.; Dunning, Charles P.; Yeskis, Douglas J.; Ursic, James R.; Vendl, Mark
2004-01-01
The effectiveness of 28 methods used to characterize the fractured Galena-Platteville aquifer at eight sites in northern Illinois and Wisconsin is evaluated. Analysis of government databases, previous investigations, topographic maps, aerial photographs, and outcrops was essential to understanding the hydrogeology in the area to be investigated. The effectiveness of surface-geophysical methods depended on site geology. Lithologic logging provided essential information for site characterization. Cores were used for stratigraphy and geotechnical analysis. Natural-gamma logging helped identify the effect of lithology on the location of secondary- permeability features. Caliper logging identified large secondary-permeability features. Neutron logs identified trends in matrix porosity. Acoustic-televiewer logs identified numerous secondary-permeability features and their orientation. Borehole-camera logs also identified a number of secondary-permeability features. Borehole ground-penetrating radar identified lithologic and secondary-permeability features. However, the accuracy and completeness of this method is uncertain. Single-point-resistance, density, and normal resistivity logs were of limited use. Water-level and water-quality data identified flow directions and indicated the horizontal and vertical distribution of aquifer permeability and the depth of the permeable features. Temperature, spontaneous potential, and fluid-resistivity logging identified few secondary-permeability features at some sites and several features at others. Flowmeter logging was the most effective geophysical method for characterizing secondary-permeability features. Aquifer tests provided insight into the permeability distribution, identified hydraulically interconnected features, the presence of heterogeneity and anisotropy, and determined effective porosity. Aquifer heterogeneity prevented calculation of accurate hydraulic properties from some tests. Different methods, such as flowmeter logging and slug testing, occasionally produced different interpretations. Aquifer characterization improved with an increase in the number of data points, the period of data collection, and the number of methods used.
Universal noise and Efimov physics
NASA Astrophysics Data System (ADS)
Nicholson, Amy N.
2016-03-01
Probability distributions for correlation functions of particles interacting via random-valued fields are discussed as a novel tool for determining the spectrum of a theory. In particular, this method is used to determine the energies of universal N-body clusters tied to Efimov trimers, for even N, by investigating the distribution of a correlation function of two particles at unitarity. Using numerical evidence that this distribution is log-normal, an analytical prediction for the N-dependence of the N-body binding energies is made.
2015-06-17
progress, Eq. (4) is evaluated in terms of the differential entropy h. The integrals can be identified as differential entropy terms by expanding the log... all random vectors p with a given covariance matrix, the entropy of p is maximized when p is ZMCSCG since a normal distribution maximizes the... entropy over all distributions with the same covariance [9, 18], implying that this is the optimal distribution on s as well. In addition, of all the
Far-ultraviolet energy distributions of the metal-poor A stars HD 109995 and HD 161817
NASA Technical Reports Server (NTRS)
Boehm-Vitense, E.
1981-01-01
Low-resolution IUE spectra at wavelengths between 1300 and 3400 A of the metal-poor stars HD 109995 (A1p) and HD 161817 (A4p) have been compared with model-atmosphere energy distributions computed by Kurucz (1979). Good overall agreement is found. Effective temperatures, metal abundances, and angular diameters could be determined. Assuming an absolute visual magnitude of 0.7, the previously determined gravity log g = 3 yields masses of 0.5 solar masses for both stars. It is found that the theoretical UBV colors calculated earlier agree reasonably well with the ones observed for these stars.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blandino, Rémi; Etesse, Jean; Grangier, Philippe
2014-12-04
We show that the maximum transmission distance of continuous-variable quantum key distribution in the presence of a Gaussian noisy lossy channel can be arbitrarily increased using a heralded noiseless linear amplifier. We explicitly consider a protocol using amplitude- and phase-modulated coherent states with reverse reconciliation. Assuming that the secret key rate drops to zero for a line transmittance T_lim, we find that a noiseless amplifier with amplitude gain g can improve this value to T_lim/g^2, corresponding to an increase in distance proportional to log g. We also show that the tolerance against noise is increased.
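For a fiber-like channel with loss alpha dB/km, the transmittance improvement T_lim/g^2 translates into extra distance 20*log10(g)/alpha, i.e., proportional to log g. A minimal sketch with assumed illustrative numbers:

```python
import numpy as np

alpha_db_per_km = 0.2          # assumed fiber loss, dB/km
g = 2.0                        # amplitude gain of the noiseless amplifier

def distance_km(T):
    """Fiber length giving transmittance T at alpha dB/km."""
    return -10.0 * np.log10(T) / alpha_db_per_km

T_lim = 0.05                   # illustrative zero-key-rate transmittance
d0 = distance_km(T_lim)
d1 = distance_km(T_lim / g**2) # improved limit with the amplifier

print(f"without amplifier: {d0:.0f} km")
print(f"with amplifier   : {d1:.0f} km (+{d1 - d0:.0f} km)")
# The gain d1 - d0 = 20*log10(g)/alpha is proportional to log g.
```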
Assessment of Methane Emissions from Oil and Gas Production Pads using Mobile Measurements
Journal Article Abstract --- "A mobile source inspection approach called OTM 33A was used to quantify short-term methane emission rates from 218 oil and gas production pads in Texas, Colorado, and Wyoming from 2010 to 2013. The emission rates were log-normally distributed with ...
Evaluation of waste mushroom logs as a potential biomass resource for the production of bioethanol.
Lee, Jae-Won; Koo, Bon-Wook; Choi, Joon-Weon; Choi, Don-Ha; Choi, In-Gyu
2008-05-01
In order to investigate the possibility of using waste mushroom logs as a biomass resource for alternative energy production, the chemical and physical characteristics of normal wood and waste mushroom logs were examined. Size reduction of normal wood (145 kWh/tonne) required significantly higher energy consumption than that of waste mushroom logs (70 kWh/tonne). The crystallinity value of waste mushroom logs was dramatically lower (33%) than that of normal wood (49%) after cultivation with Lentinus edodes as spawn. Lignin, an enzymatic hydrolysis inhibitor in sugar production, decreased from 21.07% to 18.78% after inoculation with L. edodes. Total sugar yields obtained by enzyme and acid hydrolysis were higher in waste mushroom logs than in normal wood. After 24 h of fermentation, 12 g/L ethanol was produced from waste mushroom logs, while normal wood produced 8 g/L ethanol. These results indicate that waste mushroom logs are an economically suitable lignocellulosic material for the production of fermentable sugars related to bioethanol production.
Modelling of PM10 concentration for industrialized area in Malaysia: A case study in Shah Alam
NASA Astrophysics Data System (ADS)
N, Norazian Mohamed; Abdullah, M. M. A.; Tan, Cheng-yau; Ramli, N. A.; Yahaya, A. S.; Fitri, N. F. M. Y.
In Malaysia, the predominant air pollutants are suspended particulate matter (SPM) and nitrogen dioxide (NO2). This research focuses on PM10, which may harm human health as well as the environment. Six distributions, namely Weibull, log-normal, gamma, Rayleigh, Gumbel and Frechet, were chosen to model the PM10 observations at the chosen industrial area, i.e., Shah Alam. Hourly average data over one-year periods for 2006 and 2007 were used for this research. For parameter estimation, the method of maximum likelihood estimation (MLE) was selected. Four performance indicators, namely mean absolute error (MAE), root mean squared error (RMSE), coefficient of determination (R2) and prediction accuracy (PA), were applied to determine the goodness-of-fit of the distributions. The best distribution fitting the PM10 observations in Shah Alam was found to be the log-normal distribution. The probabilities of exceedance concentrations were calculated, and the return period for the coming year was predicted from the cumulative distribution function (cdf) of the best-fit distribution. For the 2006 data, Shah Alam was predicted to exceed 150 μg/m3 for 5.9 days in 2007, with a return period of one occurrence per 62 days. For 2007, the studied area was not predicted to exceed the MAAQG of 150 μg/m3.
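The exceedance probability and return period follow directly from the fitted cdf. A minimal sketch with synthetic hourly PM10 data standing in for the Shah Alam observations:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
# Synthetic hourly PM10 observations (ug/m3), stand-in for monitoring data
pm10 = rng.lognormal(mean=np.log(45.0), sigma=0.45, size=8760)

# Fit a log-normal by maximum likelihood (location fixed at zero)
s, loc, scale = stats.lognorm.fit(pm10, floc=0)

# Probability of exceeding the 150 ug/m3 guideline and its return period
p_exceed = stats.lognorm.sf(150.0, s, loc, scale)
print(f"P(PM10 > 150) per hour = {p_exceed:.2e}")
print(f"return period ~ one exceedance every {1.0 / (p_exceed * 24):.0f} days")
```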
Recent Upgrades to the NASA Ames Mars General Circulation Model: Applications to Mars' Water Cycle
NASA Astrophysics Data System (ADS)
Hollingsworth, Jeffery L.; Kahre, M. A.; Haberle, R. M.; Montmessin, F.; Wilson, R. J.; Schaeffer, J.
2008-09-01
We report on recent improvements to the NASA Ames Mars general circulation model (GCM), a robust 3D climate-modeling tool that is state-of-the-art in terms of its physics parameterizations and subgrid-scale processes, and which can be applied to investigate physical and dynamical processes of the present (and past) Mars climate system. The most recent version (gcm2.1, v.24) of the Ames Mars GCM utilizes a more generalized radiation code (based on a two-stream approximation with correlated k's); an updated transport scheme (van Leer formulation); a cloud microphysics scheme that assumes a log-normal particle size distribution whose first two moments are treated as atmospheric tracers, and which includes the nucleation, growth and sedimentation of ice crystals. Atmospheric aerosols (e.g., dust and water-ice) can either be radiatively active or inactive. We apply this version of the Ames GCM to investigate key aspects of the present water cycle on Mars. Atmospheric dust is partially interactive in our simulations; namely, the radiation code "sees" a prescribed distribution that follows the MGS thermal emission spectrometer (TES) year-one measurements with a self-consistent vertical depth scale that varies with season. The cloud microphysics code interacts with a transported dust tracer column whose surface source is adjusted to maintain the TES distribution. The model is run from an initially dry state with a better representation of the north residual cap (NRC) which accounts for both surface-ice and bare-soil components. A seasonally repeatable water cycle is obtained within five Mars years. Our sub-grid scale representation of the NRC provides for a more realistic flux of moisture to the atmosphere and a much drier water cycle consistent with recent spacecraft observations (e.g., Mars Express PFS, corrected MGS/TES) compared to models that assume a spatially uniform and homogeneous north residual polar cap.
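The two-moment log-normal microphysics mentioned here can be illustrated with a small calculation (a sketch under assumed values, not the Ames GCM code): given the number and mass moments carried as tracers and a fixed geometric standard deviation, the median crystal radius follows from the standard log-normal moment relation.

```python
import numpy as np

RHO_ICE = 917.0  # kg m-3

def median_radius(n_tot, mass, sigma_g=1.5, rho=RHO_ICE):
    """Median (geometric mean) radius of a log-normal size distribution,
    given its number moment (m-3) and mass moment (kg m-3) and a fixed
    geometric standard deviation, via E[r**3] = r0**3 * exp(4.5*ln(sigma_g)**2)."""
    ln2 = np.log(sigma_g) ** 2
    r3_mean = mass / (4.0 / 3.0 * np.pi * rho * n_tot)  # E[r**3]
    return r3_mean ** (1.0 / 3.0) * np.exp(-1.5 * ln2)

# e.g. 5e5 ice crystals per m3 carrying 2e-6 kg m-3 of ice -> radius ~8 um
print(median_radius(5e5, 2e-6))
```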
A hierarchical Bayesian GEV model for improving local and regional flood quantile estimates
NASA Astrophysics Data System (ADS)
Lima, Carlos H. R.; Lall, Upmanu; Troy, Tara; Devineni, Naresh
2016-10-01
We estimate local and regional Generalized Extreme Value (GEV) distribution parameters for flood frequency analysis in a multilevel, hierarchical Bayesian framework, to explicitly model and reduce uncertainties. As prior information for the model, we assume that the GEV location and scale parameters for each site come from independent log-normal distributions, whose mean parameter scales with the drainage area. From empirical and theoretical arguments, the shape parameter for each site is shrunk towards a common mean. Non-informative prior distributions are assumed for the hyperparameters, and the MCMC method is used to sample from the joint posterior distribution. The model is tested using annual maximum series from 20 streamflow gauges located in an 83,000 km2 flood-prone basin in Southeast Brazil. The results show a significant reduction in the uncertainty of flood quantile estimates relative to the traditional GEV model, particularly for sites with shorter records. For return periods within the range of the data (around 50 years), the Bayesian credible intervals for the flood quantiles tend to be narrower than the classical confidence limits based on the delta method. As the return period increases beyond the range of the data, the confidence limits from the delta method become unreliable and the Bayesian credible intervals provide a way to estimate satisfactory confidence bands for the flood quantiles considering parameter uncertainties and regional information. In order to evaluate the applicability of the proposed hierarchical Bayesian model for regional flood frequency analysis, we estimate flood quantiles for three randomly chosen out-of-sample sites and compare them with classical estimates using the index flood method. The posterior distributions of the scaling law coefficients are used to define the predictive distributions of the GEV location and scale parameters for the out-of-sample sites given only their drainage areas, and the posterior distribution of the average shape parameter is taken as the regional predictive distribution for this parameter. While the index flood method does not provide a straightforward way to consider the uncertainties in the index flood and in the regional parameters, the results obtained here show that the proposed Bayesian method is able to produce adequate credible intervals for flood quantiles that are in accordance with empirical estimates.
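A compact sketch of the quantile machinery involved (synthetic annual maxima; the log-normal prior medians and their area scaling are invented for illustration, not the paper's fitted coefficients):

```python
import numpy as np
from scipy import stats

# Annual maximum series for one gauge (m3/s); hypothetical values.
amax = np.array([812., 945., 700., 1210., 880., 1020., 760., 1390., 905., 990.])

# Classical at-site fit; scipy's genextreme shape c corresponds to -xi.
c, loc, scale = stats.genextreme.fit(amax)
q100_classical = stats.genextreme.ppf(1 - 1/100, c, loc, scale)

# Bayesian-style propagation sketch: draw location/scale from log-normal
# distributions whose medians scale with drainage area (coefficients assumed),
# then form a predictive distribution of the 100-year flood quantile.
rng = np.random.default_rng(1)
area = 5000.0                                      # km2, hypothetical
loc_draws = rng.lognormal(np.log(0.15 * area), 0.2, 2000)
scale_draws = rng.lognormal(np.log(0.05 * area), 0.2, 2000)
q100_draws = stats.genextreme.ppf(1 - 1/100, c, loc_draws, scale_draws)
print(q100_classical, np.percentile(q100_draws, [2.5, 97.5]))
```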
Preliminary analysis of hot spot factors in an advanced reactor for space electric power systems
NASA Technical Reports Server (NTRS)
Lustig, P. H.; Holms, A. G.; Davison, H. W.
1973-01-01
The maximum fuel pin temperature for nominal operation in an advanced power reactor is 1370 K. Because of possible nitrogen embrittlement of the clad, the fuel temperature was limited to 1622 K. Assuming simultaneous occurrence of the most adverse conditions, a deterministic analysis gave a maximum fuel temperature of 1610 K. A statistical analysis, using a synthesized estimate of the standard deviation for the highest fuel pin temperature, showed a probability of 0.015 of that pin exceeding the temperature limit by the distribution-free Chebyshev inequality, and a virtually nil probability assuming a normal distribution. The latter assumption gives a 1463 K maximum temperature at 3 standard deviations, the usually assumed cutoff. Further, the distribution and standard deviation of the fuel-clad gap are the most significant contributors to the uncertainty in the fuel temperature.
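The reported numbers can be reproduced with a few lines (a check, not the original analysis): the 3-sigma maximum of 1463 K fixes the standard deviation, and the limit then sits about 8 standard deviations out.

```python
from scipy import stats

t_nominal, t_limit = 1370.0, 1622.0
# The 3-sigma maximum of 1463 K implies sigma = (1463 - 1370) / 3 = 31 K.
sigma = (1463.0 - t_nominal) / 3.0
k = (t_limit - t_nominal) / sigma        # about 8.1 standard deviations

p_chebyshev = 1.0 / k**2                 # distribution-free bound, ~0.015
p_normal = stats.norm.sf(k)              # ~2e-16, "virtually nil"
print(sigma, k, p_chebyshev, p_normal)
```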
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws.
Xiao, Xiao; White, Ethan P; Hooten, Mevin B; Durham, Susan L
2011-10-01
Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain.
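A minimal Monte Carlo sketch of the comparison described here (one synthetic dataset per error structure; the parameter values are arbitrary): NLR via nonlinear least squares for additive normal error, LR on log-transformed data for multiplicative log-normal error.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(2)
x = np.linspace(1.0, 100.0, 200)
a_true, b_true = 2.0, 0.75
power = lambda x, a, b: a * x**b

# Additive, homoscedastic normal error: NLR is the better-matched estimator.
y_add = power(x, a_true, b_true) + rng.normal(0.0, 2.0, x.size)
(a_nlr, b_nlr), _ = curve_fit(power, x, y_add, p0=(1.0, 1.0))

# Multiplicative, heteroscedastic log-normal error: LR on logs is better matched.
y_mult = power(x, a_true, b_true) * rng.lognormal(0.0, 0.3, x.size)
b_lr, log_a_lr = np.polyfit(np.log(x), np.log(y_mult), 1)
a_lr = np.exp(log_a_lr)

print(a_nlr, b_nlr, a_lr, b_lr)
```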
A log-normal distribution model for the molecular weight of aquatic fulvic acids
Cabaniss, S.E.; Zhou, Q.; Maurice, P.A.; Chin, Y.-P.; Aiken, G.R.
2000-01-01
The molecular weight of humic substances influences their proton and metal binding, organic pollutant partitioning, adsorption onto minerals and activated carbon, and behavior during water treatment. We propose a log-normal model for the molecular weight distribution in aquatic fulvic acids to provide a conceptual framework for studying these size effects. The normal curve mean and standard deviation are readily calculated from measured Mn and Mw, and vary from 2.7 to 3 for the means and from 0.28 to 0.37 for the standard deviations for typical aquatic fulvic acids. The model is consistent with several types of molecular weight data, including the shapes of high-pressure size-exclusion chromatography (HP-SEC) peaks. Applications of the model to electrostatic interactions, pollutant solubilization, and adsorption are explored in illustrative calculations.
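Under the standard log-normal moment relations (Mn = exp(μ + σ²/2) and Mw = exp(μ + 3σ²/2) on the natural-log scale), the base-10 mean and standard deviation quoted above follow directly from measured Mn and Mw; the sketch below uses hypothetical fulvic-acid values and lands inside the reported ranges.

```python
import numpy as np

def lognormal_params_from_mn_mw(mn, mw):
    """Base-10 log-normal mean and standard deviation from number- and
    weight-average molecular weights, using Mn = exp(mu + s**2/2) and
    Mw = exp(mu + 3*s**2/2) on the natural-log scale."""
    s2 = np.log(mw / mn)                  # natural-log variance
    mu = np.log(mn) - 0.5 * s2            # natural-log mean
    return mu / np.log(10), np.sqrt(s2) / np.log(10)

# A typical fulvic acid: Mn ~ 700 with Mw/Mn ~ 1.6 gives values inside the
# reported ranges (means 2.7-3, standard deviations 0.28-0.37).
print(lognormal_params_from_mn_mw(700.0, 1120.0))
```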
Schwantes-An, Tae-Hwi; Sung, Heejong; Sabourin, Jeremy A; Justice, Cristina M; Sorant, Alexa J M; Wilson, Alexander F
2016-01-01
In this study, the effects of (a) the minor allele frequency of the single nucleotide variant (SNV), (b) the degree of departure from normality of the trait, and (c) the position of the SNVs on type I error rates were investigated in the Genetic Analysis Workshop (GAW) 19 whole exome sequence data. To test the distribution of the type I error rate, 5 simulated traits were considered: standard normal and gamma distributed traits; 2 transformed versions of the gamma trait (log10 and rank-based inverse normal transformations); and trait Q1 provided by GAW 19. Each trait was tested with 313,340 SNVs. Tests of association were performed with simple linear regression and average type I error rates were determined for minor allele frequency classes. Rare SNVs (minor allele frequency < 0.05) showed inflated type I error rates for non-normally distributed traits that increased as the minor allele frequency decreased. The inflation of average type I error rates increased as the significance threshold decreased. Normally distributed traits did not show inflated type I error rates with respect to the minor allele frequency for rare SNVs. There was no consistent effect of transformation on the uniformity of the distribution of the location of SNVs with a type I error.
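The two trait transformations compared above are easy to make concrete (a sketch; the Blom offset of 3/8 is one common choice for the rank-based inverse normal transformation):

```python
import numpy as np
from scipy import stats

def rank_inverse_normal(x, offset=3.0 / 8.0):
    """Rank-based inverse normal transformation (Blom offset assumed):
    map ranks to normal quantiles so the trait is ~N(0, 1)."""
    ranks = stats.rankdata(x)
    return stats.norm.ppf((ranks - offset) / (len(x) + 1 - 2 * offset))

rng = np.random.default_rng(3)
gamma_trait = rng.gamma(shape=1.0, scale=2.0, size=1000)   # skewed trait
log_trait = np.log10(gamma_trait)                          # log10 transform
int_trait = rank_inverse_normal(gamma_trait)               # ~normal by construction
print(stats.skew(gamma_trait), stats.skew(log_trait), stats.skew(int_trait))
```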
ERIC Educational Resources Information Center
Xu, Xueli; Jia, Yue
2011-01-01
Estimation of item response model parameters and ability distribution parameters has been, and will remain, an important topic in the educational testing field. Much research has been dedicated to addressing this task. Some studies have focused on item parameter estimation when the latent ability was assumed to follow a normal distribution,…
ERIC Educational Resources Information Center
Goldhaber, Dan; Startz, Richard
2016-01-01
It is common to assume that worker productivity is normally distributed, but this assumption is rarely if ever tested. We estimate the distribution of worker productivity where individual productivity is measured with error, using the productivity of elementary school teachers as an example. Proposals to improve teacher productivity often focus on…
Role of Demographic Dynamics and Conflict in the Population-Area Relationship for Human Languages
Manrubia, Susanna C.; Axelsen, Jacob B.; Zanette, Damián H.
2012-01-01
Many patterns displayed by the distribution of human linguistic groups are similar to the ecological organization described for biological species. It remains a challenge to identify simple and meaningful processes that describe these patterns. The population size distribution of human linguistic groups, for example, is well fitted by a log-normal distribution that may arise from stochastic demographic processes. As we show in this contribution, the distribution of the area of the home ranges of those groups also agrees with a log-normal function. Further, size and area are significantly correlated: the number of speakers and the area spanned by linguistic groups follow an allometric relation (area scaling as a power of population size), with an exponent varying across different world regions. The empirical evidence presented leads to the hypothesis that the distributions of population size and area, and their mutual dependence, rely on demographic dynamics and on the result of conflicts over territory due to group growth. To substantiate this point, we introduce a two-variable stochastic multiplicative model whose analytical solution recovers the empirical observations. Applied to different world regions, the model reveals that the retreat in home range is sublinear with respect to the decrease in population size, and that the population-area exponent grows with the typical strength of conflicts. While the shape of the population size and area distributions, and their allometric relation, seem unavoidable outcomes of demography and inter-group contact, the precise value of the exponent could give insight into the cultural organization of those human groups in the last thousand years. PMID:22815726
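A generic illustration of the mechanism invoked here (not the authors' two-variable model): purely multiplicative stochastic growth makes log-size a sum of independent increments, so the size distribution drifts toward log-normal.

```python
import numpy as np

# Each group's size is multiplied every generation by a random growth
# factor; by the central limit theorem on log-size, the resulting size
# distribution is approximately log-normal.
rng = np.random.default_rng(4)
n_groups, n_steps = 10_000, 200
log_size = np.full(n_groups, np.log(1000.0))
for _ in range(n_steps):
    log_size += rng.normal(loc=0.0, scale=0.05, size=n_groups)

# log-size is normal, so group sizes exp(log_size) are log-normal.
print(np.mean(log_size), np.std(log_size))
```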
Fatigue Shifts and Scatters Heart Rate Variability in Elite Endurance Athletes
Schmitt, Laurent; Regnard, Jacques; Desmarets, Maxime; Mauny, Fréderic; Mourot, Laurent; Fouillot, Jean-Pierre; Coulmy, Nicolas; Millet, Grégoire
2013-01-01
Purpose This longitudinal study aimed at comparing heart rate variability (HRV) in elite athletes identified either in 'fatigue' or in 'no-fatigue' state in 'real life' conditions. Methods 57 elite Nordic skiers were surveyed over 4 years. R-R intervals were recorded supine (SU) and standing (ST). A fatigue state was identified using a validated questionnaire. A multilevel linear regression model was used to analyze relationships between heart rate (HR) and HRV descriptors [total spectral power (TP), power in low (LF) and high frequency (HF) ranges expressed in ms2 and normalized units (nu)] and the status without and with fatigue. The variables not distributed normally were transformed by taking their common logarithm (log10). Results 172 trials were identified as in a 'fatigue' state and 891 as in a 'no-fatigue' state. All supine HR and HRV parameters (Beta±SE) were significantly different (P<0.0001) between 'fatigue' and 'no-fatigue': HRSU (+6.27±0.61 bpm), logTPSU (−0.36±0.04), logLFSU (−0.27±0.04), logHFSU (−0.46±0.05), logLF/HFSU (+0.19±0.03), HFSU(nu) (−9.55±1.33). Differences were also significant (P<0.0001) in standing: HRST (+8.83±0.89), logTPST (−0.28±0.03), logLFST (−0.29±0.03), logHFST (−0.32±0.04). Also, intra-individual variance of HRV parameters was larger (P<0.05) in the 'fatigue' state (logTPSU: 0.26 vs. 0.07, logLFSU: 0.28 vs. 0.11, logHFSU: 0.32 vs. 0.08, logTPST: 0.13 vs. 0.07, logLFST: 0.16 vs. 0.07, logHFST: 0.25 vs. 0.14). Conclusion HRV was significantly lower in 'fatigue' vs. 'no-fatigue', but accompanied by larger intra-individual variance of HRV parameters in 'fatigue'. The broader intra-individual variance of HRV parameters might encompass different changes from the no-fatigue state, possibly reflecting different fatigue-induced alterations of the HRV pattern. PMID:23951198
Christensen, Gary E; Song, Joo Hyun; Lu, Wei; El Naqa, Issam; Low, Daniel A
2007-06-01
Breathing motion is one of the major limiting factors for reducing dose and irradiation of normal tissue for conventional conformal radiotherapy. This paper describes a relationship between tracking lung motion using spirometry data and image registration of consecutive CT image volumes collected from a multislice CT scanner over multiple breathing periods. Temporal CT sequences from 5 individuals were analyzed in this study. The couch was moved from 11 to 14 different positions to image the entire lung. At each couch position, 15 image volumes were collected over approximately 3 breathing periods. It is assumed that the expansion and contraction of lung tissue can be modeled as an elastic material. Furthermore, it is assumed that the deformation of the lung is small over one-fifth of a breathing period and therefore the motion of the lung can be adequately modeled using a small deformation linear elastic model. The small deformation inverse consistent linear elastic image registration algorithm is therefore well suited for this problem and was used to register consecutive image scans. The pointwise expansion and compression of lung tissue was measured by computing the Jacobian of the transformations used to register the images. The logarithm of the Jacobian was computed so that expansion and compression of the lung were scaled equally. The log-Jacobian was computed at each voxel in the volume to produce a map of the local expansion and compression of the lung during the breathing period. These log-Jacobian images demonstrate that the lung does not expand uniformly during the breathing period, but rather expands and contracts locally at different rates during inhalation and exhalation. The log-Jacobian numbers were averaged over a cross section of the lung to produce an estimate of the average expansion or compression from one time point to the next and compared to the air flow rate measured by spirometry. In four out of five individuals, the average log-Jacobian value and the air flow rate correlated well (R2 = 0.858 on average for the entire lung). The correlation for the fifth individual was not as good (R2 = 0.377 on average for the entire lung) and can be explained by the small variation in tidal volume for this individual. The correlation of the average log-Jacobian value and the air flow rate for images near the diaphragm correlated well in all five individuals (R2 = 0.943 on average). These preliminary results indicate a strong correlation between the expansion/compression of the lung measured by image registration and the air flow rate measured by spirometry. Predicting the location, motion, and compression/expansion of the tumor and normal tissue using image registration and spirometry could have many important benefits for radiotherapy treatment. These benefits include reducing radiation dose to normal tissue, maximizing dose to the tumor, improving patient care, reducing treatment cost, and increasing patient throughput.
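The log-Jacobian map described above is straightforward to compute from a registration's displacement field; the sketch below (synthetic uniform expansion, not CT data) shows the construction and a sanity check.

```python
import numpy as np

def log_jacobian(u, spacing=1.0):
    """Voxelwise log of the Jacobian determinant of the deformation
    x -> x + u(x): positive values mark local expansion, negative values
    local compression. u has shape (3, nx, ny, nz), component i along axis i."""
    n = u.shape[0]
    J = np.zeros(u.shape[1:] + (n, n))
    for i in range(n):
        grads = np.gradient(u[i], spacing)     # du_i along each array axis
        for j in range(n):
            J[..., i, j] = grads[j]
        J[..., i, i] += 1.0                    # identity part of x + u
    return np.log(np.linalg.det(J))

# Synthetic uniform 1% expansion: log-Jacobian should be 3*log(1.01) everywhere.
coords = np.indices((16, 16, 16), dtype=float)
lj = log_jacobian(0.01 * coords)
print(lj.mean(), 3 * np.log(1.01))
```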
NASA Astrophysics Data System (ADS)
Lonsdale, Carol J.; Lacy, M.; Kimball, A. E.; Blain, A.; Whittle, M.; Wilkes, B.; Stern, D.; Condon, J.; Kim, M.; Assef, R. J.; Tsai, C.-W.; Efstathiou, A.; Jones, S.; Eisenhardt, P.; Bridge, C.; Wu, J.; Lonsdale, Colin J.; Jones, K.; Jarrett, T.; Smith, R.
2015-11-01
We present Atacama Large Millimeter/submillimeter Array (ALMA) 870 μm (345 GHz) data for 49 high-redshift (0.47 < z < 2.85), luminous (11.7 < log(L_bol/L⊙) < 14.2) radio-powerful active galactic nuclei (AGNs), obtained to constrain cool dust emission from starbursts concurrent with highly obscured radiative-mode black hole (BH) accretion in massive galaxies that possess a small radio jet. The sample was selected from the Wide-field Infrared Survey Explorer with extremely steep (red) mid-infrared colors and with compact radio emission from NVSS/FIRST. Twenty-six sources are detected at 870 μm, and we find that the sample has large mid- to far-infrared luminosity ratios, consistent with a dominant and highly obscured quasar. The rest-frame 3 GHz radio powers are 24.7 < log(P_3.0 GHz/W Hz⁻¹) < 27.3, and all sources are radio-intermediate or radio-loud. BH mass estimates are 7.7 < log(M_BH/M⊙) < 10.2. The rest-frame 1-5 μm spectral energy distributions are very similar to the "Hot DOGs" (hot dust-obscured galaxies), and steeper (redder) than almost any other known extragalactic sources. ISM masses estimated for the ALMA-detected sources are 9.9 < log(M_ISM/M⊙) < 11.75, assuming a dust temperature of 30 K. The cool dust emission is consistent with star formation rates reaching several thousand M⊙ yr⁻¹, depending on the assumed dust temperature, but we cannot rule out the alternative that the AGN powers all the emission in some cases. Our best constrained source has radiative transfer solutions with approximately equal contributions from an obscured AGN and a young (10-15 Myr) compact starburst.
Specializing network analysis to detect anomalous insider actions
Chen, You; Nyemba, Steve; Zhang, Wen; Malin, Bradley
2012-01-01
Collaborative information systems (CIS) enable users to coordinate efficiently over shared tasks in complex distributed environments. For flexibility, they provide users with broad access privileges, which, as a side-effect, leave such systems vulnerable to various attacks. Some of the more damaging malicious activities stem from internal misuse, where users are authorized to access system resources. A promising class of insider threat detection models for CIS focuses on mining access patterns from audit logs; however, current models are limited in that they assume organizations have significant resources to generate labeled cases for training classifiers or assume the user has committed a large number of actions that deviate from "normal" behavior. In lieu of the previous assumptions, we introduce an approach that detects when specific actions of an insider deviate from expectation in the context of collaborative behavior. Specifically, in this paper, we introduce a specialized network anomaly detection model, or SNAD, to detect such events. This approach assesses the extent to which a user influences the similarity of the group of users that access a particular record in the CIS. From a theoretical perspective, we show that the proposed model is appropriate for detecting insider actions in dynamic collaborative systems. From an empirical perspective, we perform an extensive evaluation of SNAD with the access logs of two distinct environments: the patient record access logs of a large electronic health record system (6,015 users, 130,457 patients and 1,327,500 accesses) and the editing logs of Wikipedia (2,394,385 revisors, 55,200 articles and 6,482,780 revisions). We compare our model with several competing methods and demonstrate SNAD is significantly more effective: on average it achieves 20–30% greater area under an ROC curve. PMID:23399988
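One plausible reading of the similarity assessment described above, sketched with cosine similarity over toy access vectors (an illustration of the idea, not the authors' exact SNAD formulation):

```python
import numpy as np

def snad_style_score(access_vectors, target):
    """How much does removing one user raise the mean pairwise cosine
    similarity of the group of users who accessed the same record?
    Higher score = the target user looks more anomalous in this group."""
    def mean_pairwise_cosine(vectors):
        v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
        sims = v @ v.T
        iu = np.triu_indices(len(v), k=1)
        return sims[iu].mean()

    others = np.delete(access_vectors, target, axis=0)
    return mean_pairwise_cosine(others) - mean_pairwise_cosine(access_vectors)

# Three similar users plus one outlier (rows = users, columns = resources).
group = np.array([[5., 1., 0.], [4., 2., 0.], [5., 0., 1.], [0., 0., 9.]])
print([round(snad_style_score(group, i), 3) for i in range(4)])
```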
A new stochastic algorithm for inversion of dust aerosol size distribution
NASA Astrophysics Data System (ADS)
Wang, Li; Li, Feng; Yang, Ma-ying
2015-08-01
Dust aerosol size distribution is an important source of information about atmospheric aerosols, and it can be determined from multiwavelength extinction measurements. This paper describes a stochastic inverse technique based on the artificial bee colony (ABC) algorithm to invert the dust aerosol size distribution by the light extinction method. The direct problems for the size distributions of water drops and dust particles, which are the main elements of atmospheric aerosols, are solved by the Mie theory and the Lambert-Beer law in the multispectral region. Then, the parameters of three widely used functions, i.e., the log-normal distribution (L-N), the Junge distribution (J-J), and the normal distribution (N-N), which can provide the most useful representations of aerosol size distributions, are inverted by the ABC algorithm in the dependent model. Numerical results show that the ABC algorithm can be successfully applied to recover the aerosol size distribution with high feasibility and reliability even in the presence of random noise.
NASA Astrophysics Data System (ADS)
Marrufo-Hernández, Norma Alejandra; Hernández-Guerrero, Maribel; Nápoles-Duarte, José Manuel; Palomares-Báez, Juan Pedro; Chávez-Rojo, Marco Antonio
2018-03-01
We present a computational model that describes the diffusion of a hard-sphere colloidal fluid through a membrane. The membrane matrix is modeled as a series of flat parallel planes with circular pores of different sizes and random spatial distribution. This model was employed to determine how the size distribution of the colloidal filtrate depends on the size distributions of both the particles in the feed and the pores of the membrane, as well as to describe the filtration kinetics. A Brownian dynamics simulation study considering normal distributions was developed in order to determine empirical correlations between the parameters that characterize these distributions. The model can also be extended to other distributions, such as the log-normal. This study could, therefore, facilitate the selection of membranes for industrial or scientific filtration processes once the size distribution of the feed is known and the expected characteristics of the filtrate have been defined.
Code of Federal Regulations, 2010 CFR
2010-01-01
... load factor must be assumed to act normal to the longitudinal axis of the rotorcraft, and to be equal... from the design minimum weight to the design maximum weight; and (2) With any practical distribution of...
Mathematical Model of Naive T Cell Division and Survival IL-7 Thresholds.
Reynolds, Joseph; Coles, Mark; Lythe, Grant; Molina-París, Carmen
2013-01-01
We develop a mathematical model of the peripheral naive T cell population to study the change in human naive T cell numbers from birth to adulthood, incorporating thymic output and the availability of interleukin-7 (IL-7). The model is formulated as three ordinary differential equations: two describe T cell numbers, in a resting state and progressing through the cell cycle. The third is introduced to describe changes in IL-7 availability. Thymic output is a decreasing function of time, representative of the thymic atrophy observed in aging humans. Each T cell is assumed to possess two interleukin-7 receptor (IL-7R) signaling thresholds: a survival threshold and a second, higher, proliferation threshold. If the IL-7R signaling strength is below its survival threshold, a cell may undergo apoptosis. When the signaling strength is above the survival threshold, but below the proliferation threshold, the cell survives but does not divide. Signaling strength above the proliferation threshold enables entry into cell cycle. Assuming that individual cell thresholds are log-normally distributed, we derive population-average rates for apoptosis and entry into cell cycle. We have analyzed the adiabatic change in homeostasis as thymic output decreases. With a parameter set representative of a healthy individual, the model predicts a unique equilibrium number of T cells. In a parameter range representative of persistent viral or bacterial infection, where naive T cell cycle progression is impaired, a decrease in thymic output may result in the collapse of the naive T cell repertoire.
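The threshold-to-rate step described above has a compact form: with log-normal thresholds, the fraction of cells whose threshold lies below a given signal strength is just a log-normal CDF. A sketch with assumed parameter values:

```python
import numpy as np
from scipy import stats

# IL-7 signal strengths (arbitrary units) at which to evaluate the population.
s = np.linspace(0.1, 10.0, 5)

# Log-normal threshold distributions; the proliferation threshold sits higher
# than the survival threshold, as in the model (scales assumed for illustration).
surv = stats.lognorm(s=0.5, scale=1.0)     # survival-threshold distribution
prol = stats.lognorm(s=0.5, scale=3.0)     # proliferation-threshold distribution

frac_surviving = surv.cdf(s)               # threshold below s: cell survives
frac_dividing = prol.cdf(s)                # threshold below s: cell can divide
frac_resting = frac_surviving - frac_dividing
print(frac_surviving, frac_dividing, frac_resting)
```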
Computer routines for probability distributions, random numbers, and related functions
Kirby, W.
1983-01-01
Use of previously coded and tested subroutines simplifies and speeds up program development and testing. This report presents routines that can be used to calculate various probability distributions and other functions of importance in statistical hydrology. The routines are designed as general-purpose Fortran subroutines and functions to be called from user-written main programs. The probability distributions provided include the beta, chi-square, gamma, Gaussian (normal), Pearson Type III (tables and approximation), and Weibull. Also provided are the distributions of the Grubbs-Beck outlier test, Kolmogorov's and Smirnov's D, Student's t, noncentral t (approximate), and Snedecor F. Other mathematical functions include the Bessel function I0, gamma and log-gamma functions, error functions, and exponential integral. Auxiliary services include sorting and printer-plotting. Random number generators for uniform and normal numbers are provided and may be used with some of the above routines to generate numbers from other distributions. (USGS)
Computer routines for probability distributions, random numbers, and related functions
Kirby, W.H.
1980-01-01
Use of previously coded and tested subroutines simplifies and speeds up program development and testing. This report presents routines that can be used to calculate various probability distributions and other functions of importance in statistical hydrology. The routines are designed as general-purpose Fortran subroutines and functions to be called from user-written main programs. The probability distributions provided include the beta, chi-square, gamma, Gaussian (normal), Pearson Type III (tables and approximation), and Weibull. Also provided are the distributions of the Grubbs-Beck outlier test, Kolmogorov's and Smirnov's D, Student's t, noncentral t (approximate), and Snedecor F tests. Other mathematical functions include the Bessel function I0, gamma and log-gamma functions, error functions and exponential integral. Auxiliary services include sorting and printer plotting. Random number generators for uniform and normal numbers are provided and may be used with some of the above routines to generate numbers from other distributions. (USGS)
A comparative study of different PGA attenuation and error models: Case of 1999 Chi-Chi earthquake
NASA Astrophysics Data System (ADS)
Mebarki, Ahmed
2009-03-01
In order to evaluate the horizontal peak ground acceleration (HPGA) during earthquakes, the author studies the respective efficiency of two existing attenuation models [Mébarki, A., 2003a. Risques sismiques: aléas, vulnérabilité et aide à la décision par cartes SIG. Proceedings of International Conference on "Risks, Vulnerability and Reliability in Construction. Towards a reduction of disasters". ISBN: 9961-891-01-5, pp. 82-97, Algiers, October 11-12; Mébarki, A., 2003b. Proposal of a parametric attenuation model and comparison with some worldwide earthquakes. VII Congreso Venezolano de Sismologia y Ingenieria Sísmica, Barquisimeto, Venezuela, November 12-13 (CD-ROM); Mébarki, A., 2004. Modèle d'atténuation sismique: prédiction probabiliste des pics d'accélération. RFGC — Revue Française de Génie Civil, Hermès Ed., 8 (9-10), 1071-1086]. A comparative study of their performances is done in the case of the 1999 Chi-Chi earthquake (Taiwan). The reported PGA (peak ground acceleration) values correspond to hypocentral distances ranging from 15 up to 180 km, with observed acceleration peaks ranging from 0.04 g up to 1.16 g. The author considers two kinds of probabilistic distributions for the error model in order to describe the uncertainty and the variability that affect the values of the PGA: a Gamma distribution and a Log-normal distribution. The adopted error models assume that the variability of the PGA is such that its coefficient of variation is equal to 55% [Mébarki, 2003a, 2003b, 2004]. The obtained results show that both attenuation models provide adequate values of the PGA when a Gamma distribution is adopted for the error model. The attenuation models' efficiency decreases slightly when a Log-normal distribution is adopted for the error model.
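The two error models can be parameterized directly from the assumed 55% coefficient of variation (a sketch using standard moment matching; the 0.3 g mean prediction is hypothetical):

```python
import numpy as np
from scipy import stats

def error_models(mean_pga, cv=0.55):
    """Gamma and log-normal error models with the same mean and coefficient
    of variation, via standard moment matching: gamma shape k = 1/CV**2;
    log-normal sigma**2 = ln(1 + CV**2)."""
    k = 1.0 / cv**2
    gamma = stats.gamma(a=k, scale=mean_pga / k)
    sigma2 = np.log1p(cv**2)
    lognorm = stats.lognorm(s=np.sqrt(sigma2),
                            scale=mean_pga * np.exp(-sigma2 / 2))
    return gamma, lognorm

# 84th-percentile PGA around a 0.3 g mean prediction under each model.
g_model, ln_model = error_models(0.3)
print(g_model.ppf(0.84), ln_model.ppf(0.84))
```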
NASA Astrophysics Data System (ADS)
Soriano-Hernández, P.; del Castillo-Mussot, M.; Campirán-Chávez, I.; Montemayor-Aldrete, J. A.
2017-04-01
Forbes Magazine published its list of the leading or strongest publicly traded two thousand companies in the world (G-2000), based on four independent metrics: sales or revenues, profits, assets and market value. Each of these wealth metrics yields particular information on the corporate size or wealth of each firm. The G-2000 cumulative probability wealth distribution per employee (per capita) for all four metrics exhibits a two-class structure: quasi-exponential in the lower part and a Pareto power law in the higher part. These two-class per capita distributions are qualitatively similar to income and wealth distributions in many countries of the world, but the fraction of firms per employee within the high-class Pareto zone is about 49% for sales per employee, and 33% after averaging over the four metrics, whereas in countries the fraction of rich agents in the Pareto zone is less than 10%. The quasi-exponential zone can be fitted by Gamma or log-normal distributions. On the other hand, Forbes classifies the G-2000 firms into 82 different industries or economic activities. Within each industry, the wealth distribution per employee also follows a two-class structure, but when the aggregate wealth of firms in each industry for the four metrics is divided by the total number of employees in that industry, the 82 points of the aggregate wealth distribution by industry per employee can be well fitted by quasi-exponential curves for the four metrics.
Power laws in citation distributions: evidence from Scopus.
Brzezinski, Michal
Modeling distributions of citations to scientific papers is crucial for understanding how science develops. However, there is a considerable empirical controversy on which statistical model fits the citation distributions best. This paper is concerned with rigorous empirical detection of power-law behaviour in the distribution of citations received by the most highly cited scientific papers. We have used a large, novel data set on citations to scientific papers published between 1998 and 2002 drawn from Scopus. The power-law model is compared with a number of alternative models using a likelihood ratio test. We have found that the power-law hypothesis is rejected for around half of the Scopus fields of science. For these fields of science, the Yule, power-law with exponential cut-off and log-normal distributions seem to fit the data better than the pure power-law model. On the other hand, when the power-law hypothesis is not rejected, it is usually empirically indistinguishable from most of the alternative models. The pure power-law model seems to be the best model only for the most highly cited papers in "Physics and Astronomy". Overall, our results seem to support theories implying that the most highly cited scientific papers follow the Yule, power-law with exponential cut-off or log-normal distribution. Our findings suggest also that power laws in citation distributions, when present, account only for a very small fraction of the published papers (less than 1 % for most of science fields) and that the power-law scaling parameter (exponent) is substantially higher (from around 3.2 to around 4.7) than found in the older literature.
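The likelihood-ratio comparison used in this line of work is implemented in the `powerlaw` Python package (Alstott et al.); below is a sketch on synthetic counts standing in for the Scopus data.

```python
import numpy as np
import powerlaw

# Synthetic citation counts (log-normal here, so the comparison should
# favour the log-normal alternative).
rng = np.random.default_rng(5)
citations = np.rint(rng.lognormal(2.0, 1.2, 50_000)).astype(int) + 1

fit = powerlaw.Fit(citations, discrete=True)
# R > 0 favours the power law, R < 0 the log-normal; p is the significance.
R, p = fit.distribution_compare('power_law', 'lognormal')
print(fit.xmin, fit.power_law.alpha, R, p)
```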
Grain coarsening in two-dimensional phase-field models with an orientation field
NASA Astrophysics Data System (ADS)
Korbuly, Bálint; Pusztai, Tamás; Henry, Hervé; Plapp, Mathis; Apel, Markus; Gránásy, László
2017-05-01
In the literature, contradictory results have been published regarding the form of the limiting (long-time) grain size distribution (LGSD) that characterizes the late stage grain coarsening in two-dimensional and quasi-two-dimensional polycrystalline systems. While experiments and the phase-field crystal (PFC) model (a simple dynamical density functional theory) indicate a log-normal distribution, other works including theoretical studies based on conventional phase-field simulations that rely on coarse grained fields, like the multi-phase-field (MPF) and orientation field (OF) models, yield significantly different distributions. In a recent work, we have shown that the coarse grained phase-field models (whether MPF or OF) yield very similar limiting size distributions that seem to differ from the theoretical predictions. Herein, we revisit this problem, and demonstrate in the case of OF models [R. Kobayashi, J. A. Warren, and W. C. Carter, Physica D 140, 141 (2000), 10.1016/S0167-2789(00)00023-3; H. Henry, J. Mellenthin, and M. Plapp, Phys. Rev. B 86, 054117 (2012), 10.1103/PhysRevB.86.054117] that an insufficient resolution of the small angle grain boundaries leads to a log-normal distribution close to those seen in the experiments and the molecular scale PFC simulations. Our paper indicates, furthermore, that the LGSD is critically sensitive to the details of the evaluation process, and raises the possibility that the differences among the LGSD results from different sources may originate from differences in the detection of small angle grain boundaries.
A Maximum Likelihood Ensemble Data Assimilation Method Tailored to the Inner Radiation Belt
NASA Astrophysics Data System (ADS)
Guild, T. B.; O'Brien, T. P., III; Mazur, J. E.
2014-12-01
The Earth's radiation belts are composed of energetic protons and electrons whose fluxes span many orders of magnitude, whose distributions are log-normal, and where data-model differences can be large and also log-normal. This physical system thus challenges standard data assimilation methods relying on underlying assumptions of Gaussian distributions of measurements and data-model differences, where innovations to the model are small. We have therefore developed a data assimilation method tailored to these properties of the inner radiation belt, analogous to the ensemble Kalman filter but for the unique cases of non-Gaussian model and measurement errors, and non-linear model and measurement distributions. We apply this method to the inner radiation belt proton populations, using the SIZM inner belt model [Selesnick et al., 2007] and SAMPEX/PET and HEO proton observations to select the most likely ensemble members contributing to the state of the inner belt. We will describe the algorithm, the method of generating ensemble members, our choice of minimizing the difference between instrument counts not phase space densities, and demonstrate the method with our reanalysis of the inner radiation belt throughout solar cycle 23. We will report on progress to continue our assimilation into solar cycle 24 using the Van Allen Probes/RPS observations.
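A toy version of the departure from Gaussian assimilation described here (the σ of the log residuals and all counts are invented): ensemble members are weighted by a normal likelihood on log counts rather than on raw counts.

```python
import numpy as np

def loglikelihood_weights(model_counts, observed_counts, sigma_log=0.5):
    """Weight ensemble members by the normal likelihood of the *log* count
    differences, matching log-normal data-model errors, instead of the
    Gaussian-in-linear-space weighting of a standard ensemble Kalman filter."""
    resid = np.log(observed_counts) - np.log(model_counts)  # per-member residuals
    loglik = -0.5 * np.sum((resid / sigma_log) ** 2, axis=-1)
    w = np.exp(loglik - loglik.max())
    return w / w.sum()

obs = np.array([120.0, 80.0, 200.0])                 # instrument counts
ensemble = np.array([[100.0, 90.0, 150.0],
                     [400.0, 20.0, 900.0],
                     [130.0, 70.0, 210.0]])
print(loglikelihood_weights(ensemble, obs))          # third member dominates
```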
Abuasbi, Falastine; Lahham, Adnan; Abdel-Raziq, Issam Rashid
2018-05-01
In this study, levels of extremely low-frequency electric and magnetic fields originating from overhead power lines were investigated in the outdoor environment in Ramallah city, Palestine. Spot measurements were applied to record field intensities over a 6-min period. The Spectrum Analyzer NF-5035 was used to perform measurements at 1 m above ground level and directly underneath 40 randomly selected power lines distributed fairly within the city. Levels of electric fields varied depending on the line's category (power line, transformer or distributor): a minimum mean electric field of 3.9 V/m was found under a distributor line, and a maximum of 769.4 V/m under a high-voltage power line (66 kV). The electric fields followed a log-normal distribution with a geometric mean of 35.9 V/m and a geometric standard deviation of 2.8. Magnetic fields measured at power lines, in contrast, were not log-normally distributed; the minimum and maximum mean magnetic fields under power lines were 0.89 and 3.5 μT, respectively. Overall, none of the measured fields exceeded the ICNIRP's guidelines recommended for general public exposure to extremely low-frequency fields.
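The geometric summary statistics quoted above come straight from the log-transformed data (illustrative values, not the study's measurements):

```python
import numpy as np

# Geometric mean and geometric standard deviation, the natural summary
# statistics for log-normally distributed exposure data (V/m, illustrative).
fields = np.array([12.0, 35.0, 9.5, 80.0, 150.0, 28.0, 60.0, 18.0])
log_f = np.log(fields)
geo_mean = np.exp(log_f.mean())          # central tendency on the log scale
geo_sd = np.exp(log_f.std(ddof=1))       # dimensionless multiplicative spread
print(geo_mean, geo_sd)
```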
Design of a sampling plan to detect ochratoxin A in green coffee.
Vargas, E A; Whitaker, T B; Dos Santos, E A; Slate, A B; Lima, F B; Franca, R C A
2006-01-01
The establishment of maximum limits for ochratoxin A (OTA) in coffee by importing countries requires that coffee-producing countries develop scientifically based sampling plans to assess OTA content in lots of green coffee before the coffee enters the market, thus reducing consumer exposure to OTA, minimizing the number of lots rejected, and reducing financial loss for producing countries. A study was carried out to design an official sampling plan to determine OTA in green coffee produced in Brazil. Twenty-five lots of green coffee (type 7 - approximately 160 defects) were sampled according to an experimental protocol where 16 test samples were taken from each lot (a total of 16 kg), resulting in a total of 800 OTA analyses. The total, sampling, sample preparation, and analytical variances were 10.75 (CV = 65.6%), 7.80 (CV = 55.8%), 2.84 (CV = 33.7%), and 0.11 (CV = 6.6%), respectively, assuming a regulatory limit of 5 microg kg(-1) OTA and using a 1 kg sample, Romer RAS mill, 25 g sub-samples, and high performance liquid chromatography. The observed OTA distribution among the 16 OTA sample results was compared to several theoretical distributions. The two-parameter log-normal distribution was selected to model OTA test results for green coffee, as it gave the best fit across all 25 lot distributions. Specific computer software was developed using the variance and distribution information to predict the probability of accepting or rejecting coffee lots at specific OTA concentrations. The acceptance probability was used to compute an operating characteristic (OC) curve specific to a sampling plan design. The OC curve was used to predict the rejection of good lots (sellers' or exporters' risk) and the acceptance of bad lots (buyers' or importers' risk).
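A sketch of the OC-curve calculation (the log-normal test-result model and the 65.6% total CV come from the abstract; the decision rule and concentrations below are assumed for illustration):

```python
import numpy as np
from scipy import stats

def acceptance_probability(lot_conc, limit=5.0, cv_total=0.656):
    """A single 1 kg test result is modeled as two-parameter log-normal with
    mean equal to the true lot concentration and total CV ~65.6% (from the
    reported variance budget); the lot is accepted if the result <= limit."""
    sigma2 = np.log1p(cv_total**2)
    median = lot_conc * np.exp(-sigma2 / 2)
    return stats.lognorm(s=np.sqrt(sigma2), scale=median).cdf(limit)

# Operating characteristic (OC) curve: P(accept) vs. true lot concentration.
for c in (1.0, 3.0, 5.0, 8.0, 12.0):   # microg/kg
    print(c, round(acceptance_probability(c), 3))
```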
Evaluation of Low-Gravity Smoke Particulate for Spacecraft Fire Detection
NASA Technical Reports Server (NTRS)
Urban, David; Ruff, Gary A.; Mulholland, George; Meyer, Marit; Yuan, Zeng-guang; Cleary, Thomas; Yang, Jiann; Greenberg, Paul; Bryg, Victoria
2013-01-01
Tests were conducted on the International Space Station to evaluate the smoke particulate size from materials and conditions that are typical of those expected in spacecraft fires. Five different materials representative of those found in spacecraft (Teflon, Kapton, cotton, silicone rubber and Pyrell) were heated to temperatures below the ignition point, with conditions controlled to provide repeatable sample surface temperatures and air flow. The air flow past the sample during the heating period ranged from quiescent to 8 cm/s. The effective transport time to the measurement instruments was varied from 11 to 800 seconds to simulate different smoke transport conditions in spacecraft. The resultant aerosol was evaluated by three instruments which measured different moments of the particle size distribution. These moment diagnostics were used to determine the particle number concentration (zeroth moment), the diameter concentration (first moment), and the mass concentration (third moment). These statistics were combined to determine the diameter of average mass and the count mean diameter, and by assuming a log-normal distribution, the geometric mean diameter and the geometric standard deviation were also calculated. Smoke particle samples were collected on TEM grids using a thermal precipitator for post-flight analysis. The TEM grids were analyzed to determine the particle morphology and shape parameters. The different materials produced particles with significantly different morphologies. Overall, the majority of the average smoke particle sizes were found to be in the 200 to 400 nanometer range, with the quiescent cases and the cases with increased transport time typically producing substantially larger particles. The results varied between materials, but the smoke particles produced in low gravity were typically twice the size of particles produced in normal gravity. These results can be used to establish design requirements for future spacecraft smoke detectors.
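The moment-combination step described above follows the Hatch-Choate relations for a log-normal distribution; the sketch below generates consistent synthetic moments and recovers the geometric mean diameter and GSD (values assumed, not flight data).

```python
import numpy as np

def lognormal_from_moments(m0, m1, m3):
    """Recover geometric mean diameter and geometric standard deviation from
    the three measured moments (Hatch-Choate relations): the count mean
    diameter M1/M0 and the diameter of average mass (M3/M0)**(1/3) differ
    by a factor exp(ln(GSD)**2) for a log-normal distribution."""
    cmd = m1 / m0                      # count mean diameter
    davm = (m3 / m0) ** (1.0 / 3.0)    # diameter of average mass
    ln2_gsd = np.log(davm / cmd)       # ln(GSD)**2
    gmd = cmd * np.exp(-0.5 * ln2_gsd) # geometric mean (median) diameter
    return gmd, np.exp(np.sqrt(ln2_gsd))

# Moments of a log-normal aerosol with GMD = 300 nm, GSD = 1.8 (synthetic).
gmd, gsd = 300e-9, 1.8
ln2 = np.log(gsd) ** 2
m0 = 1e12                                   # number concentration, m-3
m1 = m0 * gmd * np.exp(0.5 * ln2)           # first (diameter) moment
m3 = m0 * gmd**3 * np.exp(4.5 * ln2)        # third (mass-related) moment
print(lognormal_from_moments(m0, m1, m3))   # recovers (3e-07, 1.8)
```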
Formation and evolution of magnetised filaments in wind-swept turbulent clumps
NASA Astrophysics Data System (ADS)
Banda-Barragan, Wladimir Eduardo; Federrath, Christoph; Crocker, Roland M.; Bicknell, Geoffrey Vincent; Parkin, Elliot Ross
2015-08-01
Using high-resolution three-dimensional simulations, we examine the formation and evolution of filamentary structures arising from magnetohydrodynamic interactions between supersonic winds and turbulent clumps in the interstellar medium. Previous numerical studies assumed homogeneous density profiles, null velocity fields, and uniformly distributed magnetic fields as the initial conditions for interstellar clumps. Here, we have, for the first time, incorporated fractal clumps with log-normal density distributions, random velocity fields and turbulent magnetic fields (superimposed on top of a uniform background field). Disruptive processes, instigated by dynamical instabilities and akin to those observed in simulations with uniform media, lead to stripping of clump material and the subsequent formation of filamentary tails. The evolution of filaments in uniform and turbulent models is, however, radically different, as evidenced by comparisons of global quantities in both scenarios. We show, for example, that turbulent clumps produce tails with higher velocity dispersions, increased gas mixing, greater kinetic energy, and lower plasma beta than their uniform counterparts. We attribute the observed differences to: (1) the turbulence-driven enhanced growth of dynamical instabilities (e.g., Kelvin-Helmholtz and Rayleigh-Taylor instabilities) at fluid interfaces, and (2) the localised amplification of magnetic fields caused by the stretching of field lines trapped in the numerous surface deformations of fractal clumps. We briefly discuss the implications of this work for the physics of the optical filaments observed in the starburst galaxy M82.
Predicting durations of online collective actions based on Peaks' heights
NASA Astrophysics Data System (ADS)
Lu, Peng; Nie, Shizhao; Wang, Zheng; Jing, Ziwei; Yang, Jianwu; Qi, Zhongxiang; Pujia, Wangmo
2018-02-01
Capturing the whole process of collective actions, the peak model contains four stages: Prepare, Outbreak, Peak, and Vanish. Building on the peak model, this paper investigates one of its key quantities: the ratio between peak height and span (duration). Although durations and peak heights are highly diversified, the ratio between them appears quite stable. If the ratio's regularity is discovered, we can predict how long a collective action lasts and when it ends based on the peak's height. In this work, we combined mathematical simulations and empirical big data from 148 cases to explore the regularity of the ratio's distribution. Simulation results indicate that the ratio's distribution is regular, but not a normal distribution. The empirical data were collected from 148 online collective actions, with the whole participation process recorded for each. The outcomes from the empirical big data indicate that the ratio seems closer to being log-normally distributed. This holds true both for all cases together and for subgroups of the 148 online collective actions. The Q-Q plot is applied to check the normal distribution of the ratio's logarithm, and the ratio's logarithm does follow the normal distribution.
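The Q-Q check used here takes one line with scipy (synthetic ratios standing in for the 148 cases):

```python
import numpy as np
from scipy import stats

# If log(ratio) is normal, its Q-Q plot against normal quantiles is a
# straight line; probplot returns the fit statistics directly.
rng = np.random.default_rng(6)
ratio = rng.lognormal(mean=-1.0, sigma=0.6, size=148)

(osm, osr), (slope, intercept, r) = stats.probplot(np.log(ratio), dist="norm")
print(r**2)   # close to 1 supports normality of the logarithm
```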
NASA Technical Reports Server (NTRS)
Nagpal, Vinod K.
1988-01-01
The effects of actual variations, also called uncertainties, in geometry and material properties on the structural response of a space shuttle main engine turbopump blade are evaluated. A normal distribution was assumed to represent the uncertainties statistically. Uncertainties were assumed to be totally random, partially correlated, and fully correlated. The magnitudes of these uncertainties were represented in terms of mean and variance. Blade responses, recorded in terms of displacements, natural frequencies, and maximum stress, were evaluated and plotted in the form of probabilistic distributions under combined uncertainties. These distributions provide an estimate of the range of magnitudes of the response and the probability of occurrence of a given response. Most importantly, these distributions provide the information needed to estimate quantitatively the risk in a structural design.
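A minimal Monte Carlo sketch of this kind of propagation (the response function, means, and covariance below are invented stand-ins, not the blade model):

```python
import numpy as np

# Propagate normally distributed, partially correlated input uncertainties
# through a response function via Monte Carlo sampling.
rng = np.random.default_rng(7)
mean = np.array([1.0, 1.0])            # normalized geometry, material property
cov = np.array([[0.02**2, 0.5 * 0.02 * 0.03],
                [0.5 * 0.02 * 0.03, 0.03**2]])   # partial correlation
samples = rng.multivariate_normal(mean, cov, size=100_000)

response = samples[:, 0] ** 2 / samples[:, 1]    # hypothetical stress response
print(response.mean(), response.std(), np.percentile(response, 99.9))
```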
Green Lumber Grade Yields for Subfactory Class Hardwood Logs
Leland F. Hanks
1973-01-01
Data on lumber grade yields for subfactory class logs are presented for ten species of hardwoods. Logs of this type are expected to assume greater importance in the market. The yields, when coupled with lumber prices, will be useful to sawmill operators for developing log prices in terms of standard factory lumber.
Sá, Rui Carlos; Henderson, A Cortney; Simonson, Tatum; Arai, Tatsuya J; Wagner, Harrieth; Theilmann, Rebecca J; Wagner, Peter D; Prisk, G Kim; Hopkins, Susan R
2017-07-01
We have developed a novel functional proton magnetic resonance imaging (MRI) technique to measure regional ventilation-perfusion (V̇A/Q̇) ratio in the lung. We conducted a comparison study of this technique in healthy subjects (n = 7, age = 42 ± 16 yr, forced expiratory volume in 1 s = 94% predicted) by comparing data measured using MRI to that obtained from the multiple inert gas elimination technique (MIGET). Regional ventilation measured in a sagittal lung slice using specific ventilation imaging was combined with proton density measured using a fast gradient-echo sequence to calculate regional alveolar ventilation, registered with perfusion images acquired using arterial spin labeling, and divided on a voxel-by-voxel basis to obtain regional V̇A/Q̇ ratio. LogSDV̇ and LogSDQ̇, measures of heterogeneity derived from the standard deviation (log scale) of the ventilation and perfusion vs. V̇A/Q̇ ratio histograms, respectively, were calculated. On a separate day, subjects underwent study with MIGET, and LogSDV̇ and LogSDQ̇ were calculated from MIGET data using the 50-compartment model. MIGET LogSDV̇ and LogSDQ̇ were normal in all subjects. LogSDQ̇ was highly correlated between MRI and MIGET (R = 0.89, P = 0.007); the intercept was not significantly different from zero (-0.062, P = 0.65) and the slope did not significantly differ from identity (1.29, P = 0.34). MIGET and MRI measures of LogSDV̇ were well correlated (R = 0.83, P = 0.02); the intercept differed from zero (0.20, P = 0.04) and the slope deviated from the line of identity (0.52, P = 0.01). We conclude that in normal subjects, there is reasonable agreement between MIGET measures of heterogeneity and those from proton MRI measured in a single slice of lung. NEW & NOTEWORTHY We report a comparison of a new proton MRI technique to measure regional V̇A/Q̇ ratio against the multiple inert gas elimination technique (MIGET). The study reports good relationships between measures of heterogeneity derived from MIGET and those derived from MRI. Although currently limited to a single-slice acquisition, these data suggest that single sagittal slice measures of V̇A/Q̇ ratio provide an adequate means to assess heterogeneity in the normal lung.
NASA Astrophysics Data System (ADS)
Skaugen, Thomas; Weltzien, Ingunn H.
2016-09-01
Snow is an important and complicated element in hydrological modelling. The traditional catchment hydrological model, with its many free calibration parameters (also in snow sub-models), is not a well-suited tool for predicting conditions for which it has not been calibrated. Such conditions include prediction in ungauged basins and assessing hydrological effects of climate change. In this study, a new model for the spatial distribution of snow water equivalent (SWE), parameterized solely from observed spatial variability of precipitation, is compared with the current snow distribution model used in the operational flood forecasting models in Norway. The former model uses a dynamic gamma distribution and is called Snow Distribution_Gamma (SD_G), whereas the latter has a fixed, calibrated coefficient of variation, which parameterizes a log-normal model for snow distribution, and is called Snow Distribution_Log-Normal (SD_LN). The two models are implemented in the parameter-parsimonious rainfall-runoff model Distance Distribution Dynamics (DDD), and their capability for predicting runoff, SWE and snow-covered area (SCA) is tested and compared for 71 Norwegian catchments. The calibration period is 1985-2000 and the validation period is 2000-2014. Results show that SD_G better simulates SCA when compared with MODIS satellite-derived snow cover. In addition, SWE is simulated more realistically in that seasonal snow is melted out, preventing the build-up of "snow towers" and the spurious positive trends in SWE that are typical of SD_LN. The precision of runoff simulations using SD_G is slightly inferior, with a reduction in the Nash-Sutcliffe and Kling-Gupta efficiency criteria of 0.01, but it is shown that the high precision in runoff prediction using SD_LN is accompanied by erroneous simulations of SWE.
Spectral Density of Laser Beam Scintillation in Wind Turbulence. Part 1; Theory
NASA Technical Reports Server (NTRS)
Balakrishnan, A. V.
1997-01-01
The temporal spectral density of the log-amplitude scintillation of a laser beam wave due to a spatially dependent vector-valued crosswind (deterministic as well as random) is evaluated. The path weighting functions for normalized spectral moments are derived, and offer a potential new technique for estimating the wind velocity profile. The Tatarskii-Klyatskin stochastic propagation equation for the Markov turbulence model is used with the solution approximated by the Rytov method. The Taylor 'frozen-in' hypothesis is assumed for the dependence of the refractive index on the wind velocity, and the Kolmogorov spectral density is used for the refractive index field.
A Bayesian approach to meta-analysis of plant pathology studies.
Mila, A L; Ngugi, H K
2011-01-01
Bayesian statistical methods are used for meta-analysis in many disciplines, including medicine, molecular biology, and engineering, but have not yet been applied for quantitative synthesis of plant pathology studies. In this paper, we illustrate the key concepts of Bayesian statistics and outline the differences between Bayesian and classical (frequentist) methods in the way parameters describing population attributes are considered. We then describe a Bayesian approach to meta-analysis and present a plant pathological example based on studies evaluating the efficacy of plant protection products that induce systemic acquired resistance for the management of fire blight of apple. In a simple random-effects model assuming a normal distribution of effect sizes and no prior information (i.e., a noninformative prior), the results of the Bayesian meta-analysis are similar to those obtained with classical methods. Implementing the same model with a Student's t distribution and a noninformative prior for the effect sizes, instead of a normal distribution, yields similar results for all but acibenzolar-S-methyl (Actigard), which was evaluated in only seven studies in this example. Whereas both the classical (P = 0.28) and the Bayesian analysis with a noninformative prior (95% credibility interval [CRI] for the log response ratio: -0.63 to 0.08) indicate a nonsignificant effect for Actigard, specifying a t distribution resulted in a significant, albeit variable, effect for this product (CRI: -0.73 to -0.10). These results confirm the sensitivity of the analytical outcome (i.e., the posterior distribution) to the choice of prior in Bayesian meta-analyses involving a limited number of studies. We review some pertinent literature on more advanced topics, including modeling of among-study heterogeneity, publication bias, analyses involving a limited number of studies, and methods for dealing with missing data, and show how these issues can be approached in a Bayesian framework. Bayesian meta-analysis can readily include information not easily incorporated in classical methods and allows for a full evaluation of competing models. Given the power and flexibility of Bayesian methods, we expect them to become widely adopted for meta-analysis of plant pathology studies.
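To make the random-effects setup above concrete, the following sketch samples the posterior of a normal random-effects meta-analysis with a random-walk Metropolis algorithm; the effect sizes, within-study variances, step size, and flat priors are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical log response ratios and within-study variances for 7 studies
y = np.array([-0.40, -0.15, -0.55, 0.05, -0.30, -0.20, -0.45])
v = np.array([0.04, 0.06, 0.05, 0.08, 0.03, 0.05, 0.07])

def log_post(mu, tau):
    """Marginal random-effects log-posterior; flat priors on mu and tau >= 0."""
    if tau < 0:
        return -np.inf
    s2 = v + tau**2                       # total variance per study
    return -0.5 * np.sum(np.log(s2) + (y - mu) ** 2 / s2)

theta = np.array([0.0, 0.1])              # starting values for (mu, tau)
lp = log_post(*theta)
draws = []
for _ in range(50_000):                   # random-walk Metropolis
    prop = theta + rng.normal(0.0, 0.05, size=2)
    lp_prop = log_post(*prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    draws.append(theta[0])

mu_draws = np.array(draws[10_000:])       # discard burn-in
lo, hi = np.percentile(mu_draws, [2.5, 97.5])
print(f"mean effect {mu_draws.mean():.3f}, 95% CRI ({lo:.2f}, {hi:.2f})")
```

Swapping the normal likelihood for a Student's t with few degrees of freedom is a small change to log_post, which is essentially the sensitivity comparison the abstract describes.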
Pan-European comparison of candidate distributions for climatological drought indices, SPI and SPEI
NASA Astrophysics Data System (ADS)
Stagge, James; Tallaksen, Lena; Gudmundsson, Lukas; Van Loon, Anne; Stahl, Kerstin
2013-04-01
Drought indices are vital to objectively quantify and compare drought severity, duration, and extent across regions with varied climatic and hydrologic regimes. The Standardized Precipitation Index (SPI), a well-reviewed meteorological drought index recommended by the WMO, and its more recent water balance variant, the Standardized Precipitation-Evapotranspiration Index (SPEI), both rely on selection of univariate probability distributions to normalize the index, allowing for comparisons across climates. The SPI, considered a universal meteorological drought index, measures anomalies in precipitation, whereas the SPEI measures anomalies in climatic water balance (precipitation minus potential evapotranspiration), a more comprehensive measure of water availability that incorporates temperature. Many reviewers recommend use of the gamma (Pearson Type III) distribution for SPI normalization, while developers of the SPEI recommend use of the three-parameter log-logistic distribution, based on point observation validation. Before the SPEI can be implemented at the pan-European scale, it is necessary to further validate the index using a range of candidate distributions to determine sensitivity to distribution selection, identify recommended distributions, and highlight those instances where a given distribution may not be valid. This study rigorously compares a suite of candidate probability distributions using WATCH Forcing Data, a global, historical (1958-2001) climate dataset based on ERA40 reanalysis with 0.5 x 0.5 degree resolution and bias-correction based on CRU-TS2.1 observations. Using maximum likelihood estimation, alternative candidate distributions are fit for the SPI and SPEI across the range of European climate zones. When evaluated at this scale, the gamma distribution for the SPI results in negatively skewed values, exaggerating the index severity of extreme dry conditions, while decreasing the index severity of extreme high precipitation. This bias is particularly notable for shorter aggregation periods (1-6 months) during the summer months in southern Europe (below 45° latitude), and can partially be attributed to distribution fitting difficulties in semi-arid regions where monthly precipitation totals cluster near zero. By contrast, the SPEI has potential for avoiding this fitting difficulty because it is not bounded by zero. However, the recommended log-logistic distribution produces index values with less variation than the standard normal distribution. Among the alternative candidate distributions, the best fit distribution and the distribution parameters vary in space and time, suggesting regional commonalities within hydroclimatic regimes, as discussed further in the presentation.
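As a concrete illustration of the normalization step described above, the sketch below computes SPI values by fitting a gamma distribution to precipitation aggregates and mapping the fitted cumulative probabilities onto standard-normal quantiles. It omits the zero-precipitation adjustment used in full SPI implementations, and all data values are invented.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical 3-month precipitation aggregates for one calendar month (mm)
precip = rng.gamma(shape=2.0, scale=40.0, size=44)  # e.g., years 1958-2001

# Fit a gamma distribution (location fixed at zero), then transform the
# cumulative probabilities to standard-normal quantiles -> SPI values
a, loc, scale = stats.gamma.fit(precip, floc=0)
spi = stats.norm.ppf(stats.gamma.cdf(precip, a, loc=loc, scale=scale))
print(spi.round(2))

# The same recipe with stats.fisk (the log-logistic distribution), applied
# to the climatic water balance instead of precipitation, would give the SPEI.
```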
Lee, M.W.; Collett, T.S.
2012-01-01
High-quality logging-while-drilling (LWD) downhole logs were acquired in seven wells drilled during the Gulf of Mexico Gas Hydrate Joint Industry Project Leg II in the spring of 2009. Well logs obtained in one of the wells, the Green Canyon Block 955 H well (GC955-H), indicate that a 27.4-m thick zone at the depth of 428 m below sea floor (mbsf; 1404 feet below sea floor (fbsf)) contains gas hydrate within sand, with average gas hydrate saturations estimated at 60% from the compressional-wave (P-wave) velocity and 65% (locally more than 80%) from resistivity logs if the gas hydrate is assumed to be uniformly distributed in this mostly sand-rich section. Similar analysis, however, of log data from a shallow clay-rich interval between 183 and 366 mbsf (600 and 1200 fbsf) yielded average gas hydrate saturations of about 20% from the resistivity log (locally 50-60%) and negligible amounts of gas hydrate from the P-wave velocity logs. Differences in saturations estimated between resistivity and P-wave velocities within the upper clay-rich interval are caused by the nature of the gas hydrate occurrences. In the case of the shallow clay-rich interval, gas hydrate fills vertical (or high-angle) fractures rather than filling pore space in sands. In this study, isotropic and anisotropic resistivity and velocity models are used to analyze the occurrence of gas hydrate within both the clay-rich and sand-dominated gas-hydrate-bearing reservoirs in the GC955-H well.
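The abstract does not spell out its resistivity-to-saturation model, but a common starting point for such estimates is Archie's relation; the sketch below is a minimal illustration under that assumption, with invented parameter values rather than the GC955-H calibration.

```python
import numpy as np

def hydrate_saturation_archie(Rt, Rw, phi, a=1.0, m=2.0, n=2.0):
    """Gas-hydrate saturation from a resistivity log via Archie's relation.

    Water saturation Sw = ((a * Rw) / (phi**m * Rt))**(1/n), and the
    hydrate fills the remaining pore space: Sh = 1 - Sw.
    Rt: formation resistivity (ohm-m), Rw: pore-water resistivity (ohm-m),
    phi: porosity (fraction); a, m, n are empirical Archie constants.
    """
    Sw = ((a * Rw) / (phi ** m * Rt)) ** (1.0 / n)
    return 1.0 - np.clip(Sw, 0.0, 1.0)

# Illustrative values only (not taken from the GC955-H logs):
print(hydrate_saturation_archie(Rt=30.0, Rw=0.35, phi=0.38))  # ~0.72
```

Note that Archie's relation assumes pore-filling hydrate in an isotropic medium; for the fracture-filling occurrences described above, the anisotropic models the authors mention are required instead.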
Traas, T P; Luttik, R; Jongbloed, R H
1996-08-01
In previous studies, the risk of toxicant accumulation in food chains was used to calculate quality criteria for surface water and soil. A simple algorithm was used to calculate maximum permissible concentrations [MPC = no-observed-effect concentration/bioconcentration factor (NOEC/BCF)]. These studies were limited to simple food chains. This study presents a method to calculate MPCs for more complex food webs of predators, expanding the previous method. First, toxicity data (NOECs) for several compounds were corrected for differences between laboratory animals and animals in the wild. Second, for each compound, these NOECs were assumed to be a sample from a log-logistic distribution of mammalian and avian NOECs. Third, bioaccumulation factors (BAFs) for major food items of predators were collected and were assumed to derive from different log-logistic distributions of BAFs. Fourth, MPCs for each compound were calculated using Monte Carlo sampling from the NOEC and BAF distributions. An uncertainty analysis for cadmium was performed to identify the most uncertain parameters of the model. Model analysis indicated that most of the prediction uncertainty of the model can be ascribed to uncertainty in species sensitivity as expressed by NOECs. A very small proportion of model uncertainty is contributed by BAFs from food webs. Correction factors for the conversion of NOECs from laboratory conditions to the field have some influence on the final value of the MPC5, but the total prediction uncertainty of the MPC is quite large. It is concluded that the uncertainty in species sensitivity is quite large; to avoid unethical toxicity testing with mammalian or avian predators, the use of this uncertainty in the proposed method for calculating MPC distributions cannot be avoided. The fifth percentile of the MPC distribution (MPC5) is suggested as a safe value for top predators.
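A minimal sketch of the Monte Carlo step, using scipy's fisk (log-logistic) distribution with invented shape and scale values in place of the fitted NOEC and BAF distributions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
N = 100_000

# Hypothetical log-logistic distributions for field-corrected NOECs
# (mg per kg food) and BAFs (L/kg); the c/scale values are illustrative
noec = stats.fisk.rvs(c=2.0, scale=5.0, size=N, random_state=rng)
baf = stats.fisk.rvs(c=3.0, scale=200.0, size=N, random_state=rng)

mpc = noec / baf                 # MPC = NOEC / BAF, per the simple algorithm
mpc5 = np.percentile(mpc, 5)     # fifth percentile as the suggested safe value
print(f"median MPC: {np.median(mpc):.2e}, MPC5: {mpc5:.2e}")
```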
Proton Straggling in Thick Silicon Detectors
NASA Technical Reports Server (NTRS)
Selesnick, R. S.; Baker, D. N.; Kanekal, S. G.
2017-01-01
Straggling functions for protons in thick silicon radiation detectors are computed by Monte Carlo simulation. Mean energy loss is constrained by the silicon stopping power, providing higher straggling at low energy and probabilities for stopping within the detector volume. By matching the first four moments of simulated energy-loss distributions, straggling functions are approximated by a log-normal distribution that is accurate for Vavilov κ ≥ 0.3. They are verified by comparison to experimental proton data from a charged particle telescope.
On the Use of the Log-Normal Particle Size Distribution to Characterize Global Rain
NASA Technical Reports Server (NTRS)
Meneghini, Robert; Rincon, Rafael; Liao, Liang
2003-01-01
Although most parameterizations of the drop size distributions (DSD) use the gamma function, there are several advantages to the log-normal form, particularly if we want to characterize the large scale space-time variability of the DSD and rain rate. The advantages of the distribution are twofold: the logarithm of any moment can be expressed as a linear combination of the individual parameters of the distribution; the parameters of the distribution are approximately normally distributed. Since all radar and rainfall-related parameters can be written approximately as a moment of the DSD, the first property allows us to express the logarithm of any radar/rainfall variable as a linear combination of the individual DSD parameters. Another consequence is that any power law relationship between rain rate, reflectivity factor, specific attenuation or water content can be expressed in terms of the covariance matrix of the DSD parameters. The joint-normal property of the DSD parameters has applications to the description of the space-time variation of rainfall in the sense that any radar-rainfall quantity can be specified by the covariance matrix associated with the DSD parameters at two arbitrary space-time points. As such, the parameterization provides a means by which we can use the spaceborne radar-derived DSD parameters to specify in part the covariance matrices globally. However, since satellite observations have coarse temporal sampling, the specification of the temporal covariance must be derived from ancillary measurements and models. Work is presently underway to determine whether the use of instantaneous rain rate data from the TRMM Precipitation Radar can provide good estimates of the spatial correlation in rain rate from data collected in 5° x 5° x 1 month space-time boxes. To characterize the temporal characteristics of the DSD parameters, disdrometer data are being used from the Wallops Flight Facility site where as many as 4 disdrometers have been used to acquire data over a 2 km path. These data should help quantify the temporal form of the covariance matrix at this site.
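The linearity property invoked above follows directly from the moments of a log-normal DSD; in a standard parameterization (the symbols below are assumed, not necessarily the authors' notation):

```latex
N(D) = \frac{N_T}{\sqrt{2\pi}\,\sigma D}
       \exp\!\left(-\frac{(\ln D - \mu)^2}{2\sigma^2}\right),
\qquad
M_n \equiv \int_0^\infty D^n N(D)\,\mathrm{d}D
    = N_T \exp\!\left(n\mu + \tfrac{1}{2}n^2\sigma^2\right),
```

so that ln M_n = ln N_T + nμ + n²σ²/2 is linear in the parameters (ln N_T, μ, σ²), and the logarithm of any moment-based radar or rainfall variable inherits this linearity.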
Integrating models that depend on variable data
NASA Astrophysics Data System (ADS)
Banks, A. T.; Hill, M. C.
2016-12-01
Models of human-Earth systems are often developed with the goal of predicting the behavior of one or more dependent variables from multiple independent variables, processes, and parameters. Often dependent variable values range over many orders of magnitude, which complicates evaluation of the fit of the dependent variable values to observations. Many metrics and optimization methods have been proposed to address dependent variable variability, with little consensus being achieved. In this work, we evaluate two such methods: log transformation (based on the dependent variable being log-normally distributed with a constant variance) and error-based weighting (based on a multi-normal distribution with variances that tend to increase as the dependent variable value increases). Error-based weighting has the advantage of encouraging model users to carefully consider data errors, such as measurement and epistemic errors, while log-transformations can be a black box for typical users. Placing the log-transformation into the statistical perspective of error-based weighting has not formerly been considered, to the best of our knowledge. To make the evaluation as clear and reproducible as possible, we use multiple linear regression (MLR). Simulations are conducted with MATLAB. The example represents stream transport of nitrogen with up to eight independent variables. The single dependent variable in our example has values that range over 4 orders of magnitude. Results are applicable to any problem for which individual or multiple data types produce a large range of dependent variable values. For this problem, the log transformation produced good model fit, while some formulations of error-based weighting worked poorly. Results support previous suggestions that error-based weighting derived from a constant coefficient of variation overemphasizes low values and degrades model fit to high values. Applying larger weights to the high values is inconsistent with the log-transformation. Greater consistency is obtained by imposing smaller (by up to a factor of 1/35) weights on the smaller dependent-variable values. From an error-based perspective, the small weights are consistent with large standard deviations. This work considers the consequences of these two common ways of addressing variable data.
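A toy contrast of the two formulations discussed above, on synthetic data whose dependent variable spans several orders of magnitude; the model, error level, and weighting scheme are illustrative assumptions (written in Python rather than MATLAB for brevity):

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(3)

# Synthetic log-linear relationship with multiplicative log-normal error,
# giving a response that ranges over roughly 4 orders of magnitude
x = rng.uniform(0.0, 4.0, size=200)
b0_true, b1_true = -1.0, 2.0
y = np.exp(b0_true + b1_true * x) * rng.lognormal(0.0, 0.4, size=x.size)

# Option 1: ordinary least squares on the log-transformed dependent variable
b1_log, b0_log = np.polyfit(x, np.log(y), 1)

# Option 2: error-based weighting on the original scale with a constant
# coefficient of variation, i.e. sigma_i proportional to y_i
f = lambda x, b0, b1: np.exp(b0 + b1 * x)
(b0_w, b1_w), _ = curve_fit(f, x, y, p0=(0.0, 1.0), sigma=y)

print(f"log-transform fit: b0={b0_log:.2f}, b1={b1_log:.2f}")
print(f"constant-CV WLS:   b0={b0_w:.2f}, b1={b1_w:.2f}")
```

Both routes target the same coefficients here, which makes the weighting choices directly comparable; changing the sigma argument changes how strongly the small dependent-variable values pull on the fit.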
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws
Xiao, X.; White, E.P.; Hooten, M.B.; Durham, S.L.
2011-01-01
Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain. © 2011 by the Ecological Society of America.
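A minimal sketch of the two fitting routes compared above, on synthetic data with multiplicative log-normal error (the scenario in which the simulations favor LR); all parameter values are invented:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(7)
x = np.linspace(1, 100, 200)
a_true, b_true = 2.0, 0.75

# Multiplicative, heteroscedastic log-normal error around y = a * x**b
y = a_true * x**b_true * rng.lognormal(0.0, 0.25, size=x.size)

# Method 1: linear regression (LR) on log-transformed data
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
print("LR estimate:  a=%.3f b=%.3f" % (np.exp(intercept), slope))

# Method 2: nonlinear regression (NLR) on the original scale
(a_nlr, b_nlr), _ = curve_fit(lambda x, a, b: a * x**b, x, y, p0=(1.0, 1.0))
print("NLR estimate: a=%.3f b=%.3f" % (a_nlr, b_nlr))
```

Rerunning with additive normal noise instead of the lognormal factor reverses which method recovers (a, b) more reliably, which is the paper's central point.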
NASA Astrophysics Data System (ADS)
Márquez, I.; Lima Neto, G. B.; Capelato, H.; Durret, F.; Lanzoni, B.; Gerbal, D.
2001-12-01
In the present paper, we show that elliptical galaxies (Es) obey a scaling relation between potential energy and mass. Since they are relaxed systems in a post-violent-relaxation stage, they are quasi-equilibrium gravitational systems and therefore also have a quasi-constant specific entropy. Assuming that light traces mass, these two laws imply that in the space defined by the three Sérsic law parameters (intensity Sigma0, scale a and shape nu), elliptical galaxies are distributed on two intersecting 2-manifolds: the Entropic Surface and the Energy-Mass Surface. Using a sample of 132 galaxies belonging to three nearby clusters, we have verified that ellipticals indeed follow these laws. This also implies that they are distributed along the intersection line (the Energy-Entropy line), and thus constitute a one-parameter family. These two physical laws (separately or combined) allow us to find the theoretical origin of several observed photometric relations, such as the correlation between absolute magnitude and effective surface brightness, and the fact that ellipticals are located on a surface in the [log Reff, -2.5 log Sigma0, log nu] space. The fact that elliptical galaxies are a one-parameter family has important implications for cosmology and for galaxy formation and evolution models. Moreover, the Energy-Entropy line could be used as a distance indicator.
Gilliom, Robert J.; Helsel, Dennis R.
1986-01-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.
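A compact sketch of the log-probability regression idea on synthetic censored data; the plotting-position formula and all data values are illustrative assumptions rather than the paper's exact recipe:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

# Synthetic lognormal concentrations censored at a fixed detection limit
conc = rng.lognormal(mean=0.0, sigma=1.0, size=50)
dl = 0.5
uncensored = np.sort(conc[conc >= dl])
n, n_cens = conc.size, int(np.sum(conc < dl))

# z scores (normal quantiles) of plotting positions for the full sample
pp = (np.arange(1, n + 1) - 0.375) / (n + 0.25)   # Blom plotting positions
z = stats.norm.ppf(pp)

# Regress logs of the uncensored observations on the z scores of the
# upper ranks ...
slope, intercept = np.polyfit(z[n_cens:], np.log(uncensored), 1)
# ... and fill in the censored tail from the fitted lognormal line
imputed = np.exp(intercept + slope * z[:n_cens])

filled = np.concatenate([imputed, uncensored])
print(f"estimated mean {filled.mean():.3f}, std {filled.std(ddof=1):.3f}")
```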
NASA Astrophysics Data System (ADS)
Zhang, H.; Harter, T.; Sivakumar, B.
2005-12-01
Facies-based geostatistical models have become important tools for the stochastic analysis of flow and transport processes in heterogeneous aquifers. However, little is known about the dependency of these processes on the parameters of facies-based geostatistical models. This study examines nonpoint source solute transport normal to the major bedding plane in the presence of interconnected high-conductivity (coarse-textured) facies in the aquifer medium, and the dependence of the transport behavior upon the parameters of the constitutive facies model. A facies-based Markov chain geostatistical model is used to quantify the spatial variability of the aquifer system hydrostratigraphy. It is integrated with a groundwater flow model and a random walk particle transport model to estimate the solute travel time probability distribution functions (pdfs) for solute flux from the water table to the bottom boundary (production horizon) of the aquifer. The cases examined include two-, three-, and four-facies models with horizontal to vertical facies mean length anisotropy ratios, ek, from 25:1 to 300:1, and with a wide range of facies volume proportions (e.g., from 5% to 95% coarse-textured facies). Predictions of travel time pdfs are found to be significantly affected by the number of hydrostratigraphic facies identified in the aquifer, the proportions of coarse-textured sediments, the mean length of the facies (particularly the ratio of length to thickness of coarse materials), and - to a lesser degree - the juxtapositional preference among the hydrostratigraphic facies. In transport normal to the sedimentary bedding plane, travel time pdfs are not log-normally distributed as is often assumed. Also, macrodispersive behavior (variance of the travel time pdf) was found not to be a unique function of the conductivity variance. The skewness of the travel time pdf varied from negatively skewed to strongly positively skewed within the parameter range examined. We also show that the Markov chain approach may give significantly different travel time pdfs when compared to the more commonly used Gaussian random field approach, even though the first- and second-order moments in the geostatistical distribution of the lnK field are identical. The choice of the appropriate geostatistical model is therefore critical in the assessment of nonpoint source transport.
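A minimal sketch of the facies-generation idea at the core of this approach: a one-dimensional Markov chain whose transition matrix encodes facies proportions, mean lengths, and juxtapositional preference (the matrix values below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical vertical transition-probability matrix for three facies:
# 0 = coarse sand, 1 = fine sand, 2 = clay. Diagonal entries control the
# mean facies thickness; off-diagonals the juxtapositional preference.
P = np.array([[0.90, 0.06, 0.04],
              [0.05, 0.90, 0.05],
              [0.03, 0.07, 0.90]])

def simulate_facies(P, n_cells, start=0):
    """Generate a facies column by sampling the Markov chain cell by cell."""
    seq = np.empty(n_cells, dtype=int)
    seq[0] = start
    for i in range(1, n_cells):
        seq[i] = rng.choice(3, p=P[seq[i - 1]])
    return seq

column = simulate_facies(P, 500)
print("simulated volume proportions:", np.bincount(column, minlength=3) / 500)
```

In the study itself such realizations condition a flow and particle-tracking model; the sketch only shows how the transition matrix maps onto the facies statistics varied in the parameter study.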
NASA Technical Reports Server (NTRS)
Kuntz, K. D.; White, Nicolas E. (Technical Monitor)
2001-01-01
In order to isolate the diffuse extragalactic component of the soft X-ray background, we have used a combination of ROSAT All-Sky Survey and IRAS 100 micron data to separate the soft X-ray background into five components. We find a Local Hot Bubble similar to that described by Snowden et al. (1998). We make a first calculation of the contribution by unresolved Galactic stars to the diffuse background. We constrain the normalization of the Extragalactic Power Law (the contribution of unresolved extragalactic point sources such as AGN, QSOs, and normal galaxies) to 9.5 +/- 0.9 keV/(sq cm s sr keV), assuming a power-law index of 1.46. We show that the remaining emission, which is some combination of Galactic halo emission and the putative diffuse extragalactic emission, must be composed of at least two components, which we have characterized by thermal spectra. The softer component has log T ~ 6.08 and a patchy distribution; thus it is most probably part of the Galactic halo. The harder component has log T ~ 6.46 and is nearly isotropic; some portion may be due to the Galactic halo and some portion may be due to the diffuse extragalactic emission. The maximum upper limit to the strength of the emission by the diffuse extragalactic component is the total of the hard component, approx. 7.4 +/- 1.0 keV/(sq cm s sr keV) in the 3/4 keV band. We have made the first direct measure of the fluctuations due to the diffuse extragalactic emission in the 3/4 keV band. Physical arguments suggest that small angular scale (approx. 10') fluctuations in the Local Hot Bubble or the Galactic halo will have very short dissipation times (about 10^5 years). Therefore, the fluctuation spectrum of the soft X-ray background should measure the distribution of the diffuse extragalactic emission. Using mosaics of deep, overlapping PSPC pointings, we find an autocorrelation function value of approx. 0.0025 for 10' < theta < 20', and a value consistent with zero on larger scales. Measurement of the fluctuations with a delta I/I method produces consistent results.
A frequency quantum interpretation of the surface renewal model of mass transfer
Mondal, Chanchal
2017-01-01
The surface of a turbulent liquid is visualized as consisting of a large number of chaotic eddies or liquid elements. Assuming that surface elements of a particular age have renewal frequencies that are integral multiples of a fundamental frequency quantum, and further assuming that the renewal frequency distribution is of the Boltzmann type, performing a population balance for these elements leads to the Danckwerts surface age distribution. The basic quantum is what has been traditionally called the rate of surface renewal. The Higbie surface age distribution follows if the renewal frequency distribution of such elements is assumed to be continuous. Four age distributions, which reflect different start-up conditions of the absorption process, are then used to analyse transient physical gas absorption into a large volume of liquid, assuming negligible gas-side mass-transfer resistance. The first two are different versions of the Danckwerts model, the third one is based on the uniform and Higbie distributions, while the fourth one is a mixed distribution. For the four cases, theoretical expressions are derived for the rates of gas absorption and dissolved-gas transfer to the bulk liquid. Under transient conditions, these two rates are not equal and have an inverse relationship. However, with the progress of absorption towards steady state, they approach one another. Assuming steady-state conditions, the conventional one-parameter Danckwerts age distribution is generalized to a two-parameter age distribution. Like the two-parameter logarithmic normal distribution, this distribution can also capture the bell-shaped nature of the distribution of the ages of surface elements observed experimentally in air–sea gas and heat exchange. Estimates of the liquid-side mass-transfer coefficient made using these two distributions for the absorption of hydrogen and oxygen in water are very close to one another and are comparable to experimental values reported in the literature. PMID:28791137
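For reference, the Danckwerts age distribution recovered above, and the steady-state mass-transfer coefficient that follows from averaging the penetration-theory flux over it, can be written as (standard notation, assumed here rather than quoted from the paper):

```latex
\phi(t) = s\, e^{-st}, \qquad t \ge 0,
\qquad\Longrightarrow\qquad
k_L = \int_0^\infty \sqrt{\frac{D}{\pi t}}\; s\, e^{-st}\,\mathrm{d}t
    = \sqrt{D\,s},
```

where s is the surface-renewal rate (the fundamental "frequency quantum" of the abstract) and D the diffusivity of the dissolved gas in the liquid.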
Internal log scanning: Research to reality
Daniel L. Schmoldt
2000-01-01
Improved log breakdown into lumber has been an active research topic since the 1960s. Demonstrated economic gains have driven the search for a cost-effective method to scan logs internally, from which it is assumed one can choose a better breakdown strategy. X-ray computed tomography (CT) has been widely accepted as the most promising internal imaging technique....
Bladder cancer mapping in Libya based on standardized morbidity ratio and log-normal model
NASA Astrophysics Data System (ADS)
Alhdiri, Maryam Ahmed; Samat, Nor Azah; Mohamed, Zulkifley
2017-05-01
Disease mapping comprises a set of statistical techniques that produce maps of rates based on estimated mortality, morbidity, and prevalence. A traditional approach to measuring the relative risk of a disease is the Standardized Morbidity Ratio (SMR), the ratio of observed to expected counts in an area, which has the greatest uncertainty if the disease is rare or if the geographical area is small. Therefore, Bayesian models or statistical smoothing based on the log-normal model are introduced, which might solve the SMR problem. This study estimates the relative risk for bladder cancer incidence in Libya from 2006 to 2007 based on the SMR and the log-normal model, which were fitted to data using WinBUGS software. This study starts with a brief review of these models, starting with the SMR method and followed by the log-normal model, which is then applied to bladder cancer incidence in Libya. All results are compared using maps and tables. The study concludes that the log-normal model gives better relative risk estimates compared to the classical method. The log-normal model can overcome the SMR problem when there is no observed bladder cancer in an area.
NASA Astrophysics Data System (ADS)
Annila, Arto
2016-02-01
The principle of increasing entropy is derived from the statistical physics of open systems, assuming that quanta of action, as indivisible basic building blocks, embody everything. According to this tenet, all systems evolve from one state to another either by acquiring quanta from their surroundings or by discarding quanta to the surroundings in order to attain energetic balance in least time. These natural processes result in ubiquitous scale-free patterns: skewed distributions that accumulate in a sigmoid manner and hence span log-log scales mostly as straight lines. Moreover, the equation for least-time motions reveals that evolution is by nature a non-deterministic process. Although the insight into thermodynamics obtained from the notion of quanta in motion yields nothing new, it accentuates that contemporary comprehension is impaired when modeling evolution as a computable process by imposing conservation of energy and thereby ignoring that quanta of action are the carriers of energy from the system to its surroundings.
Reliable and More Powerful Methods for Power Analysis in Structural Equation Modeling
ERIC Educational Resources Information Center
Yuan, Ke-Hai; Zhang, Zhiyong; Zhao, Yanyun
2017-01-01
The normal-distribution-based likelihood ratio statistic T_ml = nF_ml is widely used for power analysis in structural equation modeling (SEM). In such an analysis, power and sample size are computed by assuming that T_ml follows a central chi-square distribution under H_0 and a noncentral chi-square…
Abuasbi, Falastine; Lahham, Adnan; Abdel-Raziq, Issam Rashid
2018-04-01
This study was focused on the measurement of residential exposure to power frequency (50-Hz) electric and magnetic fields in the city of Ramallah, Palestine. A group of 32 semi-randomly selected residences distributed across the city was investigated for field variations. Measurements were performed with the Spectrum Analyzer NF-5035 and were carried out at one meter above ground level in the residence's bedroom or living room under both zero- and normal-power conditions. Field variations were recorded over 6-min intervals and sometimes over a few hours. Electric fields under normal-power use were relatively low; ~59% of residences experienced mean electric fields <10 V/m. The highest mean electric field of 66.9 V/m was found at residence R27. However, electric field values were log-normally distributed with a geometric mean of 9.6 V/m and a geometric standard deviation of 3.5. Background electric fields, measured under zero-power use, were very low; ~80% of residences experienced background electric fields <1 V/m. Under normal-power use, the highest mean magnetic field (0.45 μT) was found at residence R26, where an indoor power substation exists. However, ~81% of residences experienced mean magnetic fields <0.1 μT. Magnetic fields measured inside the 32 residences also showed a log-normal distribution, with a geometric mean of 0.04 μT and a geometric standard deviation of 3.14. Under zero-power conditions, ~7% of residences experienced an average background magnetic field >0.1 μT. Fields from appliances showed a maximum mean electric field of 67.4 V/m from a hair dryer, and a maximum mean magnetic field of 13.7 μT from a microwave oven. However, no single result surpassed the ICNIRP limits for general public exposure to ELF fields; still, the interval 0.3-0.4 μT discussed for possible non-thermal health impacts of exposure to ELF magnetic fields was experienced in 13% of the residences.
NASA Astrophysics Data System (ADS)
Rock, N. M. S.
ROBUST calculates 53 statistics, plus significance levels for 6 hypothesis tests, on each of up to 52 variables. These together allow the following properties of the data distribution for each variable to be examined in detail: (1) Location. Three means (arithmetic, geometric, harmonic) are calculated, together with the midrange and 19 high-performance robust L-, M-, and W-estimates of location (combined, adaptive, trimmed estimates, etc.). (2) Scale. The standard deviation is calculated along with the H-spread/2 (≈ semi-interquartile range), the mean and median absolute deviations from both mean and median, and a biweight scale estimator. The 23 location and 6 scale estimators programmed cover all possible degrees of robustness. (3) Normality. Distributions are tested against the null hypothesis that they are normal, using the 3rd (√b1) and 4th (b2) moments, Geary's ratio (mean deviation/standard deviation), Filliben's probability plot correlation coefficient, and a more robust test based on the biweight scale estimator. These statistics collectively are sensitive to most usual departures from normality. (4) Presence of outliers. The maximum and minimum values are assessed individually or jointly using Grubbs' maximum Studentized residuals, Harvey's and Dixon's criteria, and the Studentized range. For a single input variable, outliers can be either winsorized or eliminated, and all estimates recalculated iteratively as desired. The following data transformations can also be applied: linear, log10, generalized Box-Cox power (including log, reciprocal, and square root), exponentiation, and standardization. For more than one variable, all results are tabulated in a single run of ROBUST. Further options are incorporated to assess ratios (of two variables) as well as discrete variables, and to handle missing data. Cumulative S-plots (for assessing normality graphically) can also be generated. The mutual consistency or inconsistency of all these measures helps to detect errors in data as well as to assess the data distributions themselves.
Miller, Robert; Plessow, Franziska
2013-06-01
Endocrine time series often lack normality and homoscedasticity most likely due to the non-linear dynamics of their natural determinants and the immanent characteristics of the biochemical analysis tools, respectively. As a consequence, data transformation (e.g., log-transformation) is frequently applied to enable general linear model-based analyses. However, to date, data transformation techniques substantially vary across studies and the question of which is the optimum power transformation remains to be addressed. The present report aims to provide a common solution for the analysis of endocrine time series by systematically comparing different power transformations with regard to their impact on data normality and homoscedasticity. For this, a variety of power transformations of the Box-Cox family were applied to salivary cortisol data of 309 healthy participants sampled in temporal proximity to a psychosocial stressor (the Trier Social Stress Test). Whereas our analyses show that un- as well as log-transformed data are inferior in terms of meeting normality and homoscedasticity, they also provide optimum transformations for both, cross-sectional cortisol samples reflecting the distributional concentration equilibrium and longitudinal cortisol time series comprising systematically altered hormone distributions that result from simultaneously elicited pulsatile change and continuous elimination processes. Considering these dynamics of endocrine oscillations, data transformation prior to testing GLMs seems mandatory to minimize biased results. Copyright © 2012 Elsevier Ltd. All rights reserved.
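A minimal sketch of the optimum-power search described above, using scipy's maximum-likelihood Box-Cox routine on invented log-normal "cortisol" values:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)

# Hypothetical skewed cortisol-like concentrations (nmol/L); illustrative only
cortisol = rng.lognormal(mean=2.0, sigma=0.6, size=309)

# scipy chooses the Box-Cox exponent lambda by maximum likelihood;
# lambda = 0 corresponds to the log transform, lambda = 1 to no transform
transformed, lam = stats.boxcox(cortisol)

print(f"optimal lambda: {lam:.3f}")
print("skewness before: %.2f, after: %.2f"
      % (stats.skew(cortisol), stats.skew(transformed)))
```

The paper's point is precisely that this data-driven lambda generally differs from both 0 (log) and 1 (untransformed), so fixing the transform a priori can leave residual non-normality and heteroscedasticity.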
Tomitaka, Shinichiro; Kawasaki, Yohei; Ide, Kazuki; Yamada, Hiroshi; Miyake, Hirotsugu; Furukawa, Toshiaki A
2016-01-01
In a previous study, we reported that the distribution of total depressive symptoms scores according to the Center for Epidemiologic Studies Depression Scale (CES-D) in a general population is stable throughout middle adulthood and follows an exponential pattern except for at the lowest end of the symptom score. Furthermore, the individual distributions of 16 negative symptom items of the CES-D exhibit a common mathematical pattern. To confirm the reproducibility of these findings, we investigated the distribution of total depressive symptoms scores and 16 negative symptom items in a sample of Japanese employees. We analyzed 7624 employees aged 20-59 years who had participated in the Northern Japan Occupational Health Promotion Centers Collaboration Study for Mental Health. Depressive symptoms were assessed using the CES-D. The CES-D contains 20 items, each of which is scored in four grades: "rarely," "some," "much," and "most of the time." The descriptive statistics and frequency curves of the distributions were then compared according to age group. The distribution of total depressive symptoms scores appeared to be stable from 30-59 years. The right tail of the distribution for ages 30-59 years exhibited a linear pattern with a log-normal scale. The distributions of the 16 individual negative symptom items of the CES-D exhibited a common mathematical pattern which displayed different distributions with a boundary at "some." The distributions of the 16 negative symptom items from "some" to "most" followed a linear pattern with a log-normal scale. The distributions of the total depressive symptoms scores and individual negative symptom items in a Japanese occupational setting show the same patterns as those observed in a general population. These results show that the specific mathematical patterns of the distributions of total depressive symptoms scores and individual negative symptom items can be reproduced in an occupational population.
Characterization of airborne particles in an open pit mining region.
Huertas, José I; Huertas, María E; Solís, Dora A
2012-04-15
We studied airborne particle samples collected from 15 stations in operation since 2007 in one of the world's largest opencast coal mining regions. Using gravimetric, scanning electron microscopy (SEM-EDS), and X-ray photoelectron spectroscopy (XPS) analyses, the samples were characterized in terms of concentration, morphology, particle size distribution (PSD), and elemental composition. All of the total suspended particulate (TSP) samples exhibited a log-normal PSD with a mean of d = 5.46 ± 0.32 μm and σ(ln d) = 0.61 ± 0.03. Similarly, all particles with an equivalent aerodynamic diameter less than 10 μm (PM(10)) exhibited a log-normal distribution with a mean of d = 3.6 ± 0.38 μm and σ(ln d) = 0.55 ± 0.03. XPS analysis indicated that the main elements present in the particles were carbon, oxygen, potassium, and silicon, with average mass concentrations of 41.5%, 34.7%, 11.6%, and 5.7%, respectively. In SEM micrographs the particles appeared smooth-surfaced and irregular in shape, and tended to agglomerate. The particles were typically clay minerals, including limestone, calcite, quartz, and potassium feldspar. Copyright © 2012 Elsevier B.V. All rights reserved.
Exponential blocking-temperature distribution in ferritin extracted from magnetization measurements
NASA Astrophysics Data System (ADS)
Lee, T. H.; Choi, K.-Y.; Kim, G.-H.; Suh, B. J.; Jang, Z. H.
2014-11-01
We developed a direct method to extract the zero-field zero-temperature anisotropy energy barrier distribution of magnetic particles in the form of a blocking-temperature distribution. The key idea is to modify measurement procedures slightly to make nonequilibrium magnetization calculations (including the time evolution of magnetization) easier. We applied this method to the biomagnetic molecule ferritin and successfully reproduced field-cool magnetization by using the extracted distribution. We find that the resulting distribution is more like an exponential type and that the distribution cannot be correlated simply to the widely known log-normal particle-size distribution. The method also allows us to determine the values of the zero-temperature coercivity and Bloch coefficient, which are in good agreement with those determined from other techniques.
A comparison of minimum distance and maximum likelihood techniques for proportion estimation
NASA Technical Reports Server (NTRS)
Woodward, W. A.; Schucany, W. R.; Lindsey, H.; Gray, H. L.
1982-01-01
The estimation of mixing proportions p_1, p_2, ..., p_m in the mixture density f(x) = Σ_{i=1}^{m} p_i f_i(x) is often encountered in agricultural remote sensing problems, in which case the p_i's usually represent crop proportions. In these remote sensing applications, component densities f_i(x) have typically been assumed to be normally distributed, and parameter estimation has been accomplished using maximum likelihood (ML) techniques. Minimum distance (MD) estimation is examined as an alternative to ML where, in this investigation, both procedures are based upon normal components. Results indicate that ML techniques are superior to MD when component distributions actually are normal, while MD estimation provides better estimates than ML under symmetric departures from normality. When component distributions are not symmetric, however, it is seen that neither of these normal-based techniques provides satisfactory results.
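For the ML side of this comparison, proportions in a mixture with known normal components can be estimated with a short EM iteration; the sketch below uses invented component parameters and proportions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Two known normal component densities (e.g., two crop classes) with
# unknown mixing proportions; all numbers here are illustrative
f1 = stats.norm(loc=0.0, scale=1.0)
f2 = stats.norm(loc=3.0, scale=1.5)
x = np.concatenate([f1.rvs(700, random_state=rng),
                    f2.rvs(300, random_state=rng)])

# ML estimation of the proportions by EM, with the components held fixed
p = np.array([0.5, 0.5])
for _ in range(200):
    dens = np.column_stack([f1.pdf(x), f2.pdf(x)]) * p    # E-step
    resp = dens / dens.sum(axis=1, keepdims=True)         # responsibilities
    p = resp.mean(axis=0)                                 # M-step
print("estimated proportions:", p.round(3))  # close to (0.7, 0.3)
```

An MD estimator would instead choose p to minimize a distance between the empirical distribution and the mixture CDF, which is what buys robustness when the components depart symmetrically from normality.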
Evaluation and validity of a LORETA normative EEG database.
Thatcher, R W; North, D; Biver, C
2005-04-01
To evaluate the reliability and validity of a Z-score normative EEG database for Low Resolution Electromagnetic Tomography (LORETA), EEG digital samples (2-second intervals sampled at 128 Hz, 1 to 2 minutes eyes closed) were acquired from 106 normal subjects, and the cross-spectrum was computed and multiplied by the Key Institute's LORETA 2,394 gray matter pixel T matrix. After a log10 transform or a Box-Cox transform, the mean and standard deviation of the *.lor files were computed for each of the 2,394 gray matter pixels, from 1 to 30 Hz, for each of the subjects. Tests of Gaussianity were computed in order to best approximate a normal distribution for each frequency and gray matter pixel. The relative sensitivity of a Z-score database was computed by measuring the approximation to a Gaussian distribution. The validity of the LORETA normative database was evaluated by the degree to which confirmed brain pathologies were localized using the LORETA normative database. Log10 and Box-Cox transforms approximated a Gaussian distribution with 95.64% to 99.75% accuracy. The percentage of normative Z-score values at 2 standard deviations ranged from 1.21% to 3.54%, and the percentage of Z-scores at 3 standard deviations ranged from 0% to 0.83%. Left temporal lobe epilepsy, right sensory motor hematoma and a right hemisphere stroke exhibited maximum Z-score deviations in the same locations as the pathologies. We conclude: (1) adequate approximation to a Gaussian distribution can be achieved using LORETA by using a log10 transform or a Box-Cox transform and parametric statistics, (2) a Z-score normative database is valid with adequate sensitivity when using LORETA, and (3) the Z-score LORETA normative database also consistently localized known pathologies to the expected Brodmann areas as a hypothesis test based on the surface EEG before computing LORETA.
Bellier, Edwige; Grøtan, Vidar; Engen, Steinar; Schartau, Ann Kristin; Diserud, Ola H; Finstad, Anders G
2012-10-01
Obtaining accurate estimates of diversity indices is difficult because the number of species encountered in a sample increases with sampling intensity. We introduce a novel method that requires the presence of species in a sample to be assessed, while counts of the number of individuals per species are required for only a small part of the sample. To account for species included as incidence data in the species abundance distribution, we modify the likelihood function of the classical Poisson log-normal distribution. Using simulated community assemblages, we contrast diversity estimates based on a community sample, a subsample randomly extracted from the community sample, and a mixture sample where incidence data are added to a subsample. We show that the mixture sampling approach provides more accurate estimates than the subsample, and at little extra cost. Diversity indices estimated from a freshwater zooplankton community sampled using the mixture approach show the same pattern of results as the simulation study. Our method efficiently increases the accuracy of diversity estimates and comprehension of the left tail of the species abundance distribution. We show how to choose the scale of sample size needed for a compromise between information gained, accuracy of the estimates and cost expended when assessing biological diversity. The sample size estimates are obtained from key community characteristics, such as the expected number of species in the community, the expected number of individuals in a sample and the evenness of the community.
Mataragas, M; Alessandria, V; Rantsiou, K; Cocolin, L
2015-08-01
In the present work, a demonstration is made of how the risk from the presence of Listeria monocytogenes in fermented sausages can be managed using the concept of the Food Safety Objective (FSO), aided by stochastic modeling (Bayesian analysis and Monte Carlo simulation) and meta-analysis. For this purpose, the ICMSF equation was used, which combines the initial level (H0) of the hazard with its subsequent reduction (ΣR) and/or increase (ΣI) along the production chain. Each element of the equation was described by a distribution to investigate the effect not only of the level of the hazard, but also of the accompanying variability. The distribution of each element was determined by Bayesian modeling (H0) and meta-analysis (ΣR and ΣI). The output was a normal distribution N(-5.36, 2.56) (log cfu/g), from which the percentage of non-conforming products, i.e. the fraction above the FSO of 2 log cfu/g, was estimated at 0.202%. Different control measures were examined, such as lowering the initial L. monocytogenes level and including an additional killing step along the process, resulting in a reduction of the non-conforming products from 0.195% to 0.003%, based on the mean and/or square-root change of the normal distribution, and 0.001%, respectively. Copyright © 2015 Elsevier Ltd. All rights reserved.
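Reading N(-5.36, 2.56) as mean and standard deviation in log cfu/g reproduces the quoted non-conforming fraction directly:

```python
from scipy import stats

# P(final concentration > FSO) for X ~ N(mean=-5.36, sd=2.56), FSO = 2 log cfu/g
p = stats.norm.sf(2.0, loc=-5.36, scale=2.56)
print(f"non-conforming fraction: {100 * p:.3f}%")  # ~0.202%
```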
Parameter estimation and forecasting for multiplicative log-normal cascades.
Leövey, Andrés E; Lux, Thomas
2012-04-01
We study the well-known multiplicative log-normal cascade process in which the multiplication of Gaussian and log-normally distributed random variables yields time series with intermittent bursts of activity. Due to the nonstationarity of this process and the combinatorial nature of such a formalism, its parameters have been estimated mostly by fitting the numerical approximation of the associated non-Gaussian probability density function to empirical data, cf. Castaing et al. [Physica D 46, 177 (1990)]. More recently, alternative estimators based upon various moments have been proposed by Beck [Physica D 193, 195 (2004)] and Kiyono et al. [Phys. Rev. E 76, 041113 (2007)]. In this paper, we pursue this moment-based approach further and develop a more rigorous generalized method of moments (GMM) estimation procedure to cope with the documented difficulties of previous methodologies. We show that even under uncertainty about the actual number of cascade steps, our methodology yields very reliable results for the estimated intermittency parameter. Employing the Levinson-Durbin algorithm for best linear forecasts, we also show that the estimated parameters can be used for forecasting the evolution of the turbulent flow. We compare forecasting results from the GMM and Kiyono et al.'s procedure via Monte Carlo simulations. We finally test the applicability of our approach by estimating the intermittency parameter and forecasting volatility for a sample of financial data from stock and foreign exchange markets.
On the probability distribution function of the mass surface density of molecular clouds. I
NASA Astrophysics Data System (ADS)
Fischera, Jörg
2014-05-01
The probability distribution function (PDF) of the mass surface density is an essential characteristic of the structure of molecular clouds or the interstellar medium in general. Observations of the PDF of molecular clouds indicate a composition of a broad distribution around the maximum and a decreasing tail at high mass surface densities. The first component is attributed to the random distribution of gas, which is modeled using a log-normal function, while the second component is attributed to condensed structures modeled using a simple power law. The aim of this paper is to provide an analytical model of the PDF of condensed structures which can be used by observers to extract information about the condensations. The condensed structures are considered to be either spheres or cylinders with a truncated radial density profile at cloud radius r_cl. The assumed profile is of the form ρ(r) = ρ_c/(1 + (r/r_0)²)^(n/2) for arbitrary power n, where ρ_c and r_0 are the central density and the inner radius, respectively. An implicit function is obtained which either truncates (sphere) or has a pole (cylinder) at maximal mass surface density. The PDF of spherical condensations and the asymptotic PDF of cylinders in the limit of infinite overdensity ρ_c/ρ(r_cl) flattens for steeper density profiles and has a power-law asymptote at low and high mass surface densities and a well-defined maximum. The power index γ of the asymptote Σ^(-γ) of the logarithmic PDF (Σ P(Σ)) in the limit of high mass surface densities is given by γ = (n + 1)/(n - 1) - 1 (spheres) or by γ = n/(n - 1) - 1 (cylinders in the limit of infinite overdensity). Appendices are available in electronic form at http://www.aanda.org
Shin, Jung-Hyun; Eom, Tae-Hoon; Kim, Young-Hoon; Chung, Seung-Yun; Lee, In-Goo; Kim, Jung-Min
2017-07-01
Valproate (VPA) is an antiepileptic drug (AED) used for initial monotherapy in treating childhood absence epilepsy (CAE). EEG might be an alternative approach to explore the effects of AEDs on the central nervous system. We performed a comparative analysis of background EEG activity during VPA treatment by using standardized, low-resolution, brain electromagnetic tomography (sLORETA) to explore the effect of VPA in patients with CAE. In 17 children with CAE, non-parametric statistical analyses using sLORETA were performed to compare the current density distribution of four frequency bands (delta, theta, alpha, and beta) between the untreated and treated condition. Maximum differences in current density were found in the left inferior frontal gyrus for the delta frequency band (log-F-ratio = -1.390, P > 0.05), the left medial frontal gyrus for the theta frequency band (log-F-ratio = -0.940, P > 0.05), the left inferior frontal gyrus for the alpha frequency band (log-F-ratio = -0.590, P > 0.05), and the left anterior cingulate for the beta frequency band (log-F-ratio = -1.318, P > 0.05). However, none of these differences were significant (threshold log-F-ratio = ±1.888, P < 0.01; threshold log-F-ratio = ±1.722, P < 0.05). Because EEG background is accepted as normal in CAE, VPA would not be expected to significantly change abnormal thalamocortical oscillations on a normal EEG background. Therefore, our results agree with currently accepted concepts but are not consistent with findings in some previous studies.
Bayesian methods for uncertainty factor application for derivation of reference values.
Simon, Ted W; Zhu, Yiliang; Dourson, Michael L; Beck, Nancy B
2016-10-01
In 2014, the National Research Council (NRC) published Review of EPA's Integrated Risk Information System (IRIS) Process, which considers methods EPA uses for developing toxicity criteria for non-carcinogens. These criteria are the Reference Dose (RfD) for oral exposure and the Reference Concentration (RfC) for inhalation exposure. The NRC Review suggested using Bayesian methods for application of uncertainty factors (UFs) to adjust the point of departure dose or concentration to a level considered to be without adverse effects for the human population. The NRC foresaw that Bayesian methods would be potentially useful for combining toxicity data from disparate sources: high-throughput assays, animal testing, and observational epidemiology. UFs represent five distinct areas for which both adjustment and consideration of uncertainty may be needed. The NRC suggested UFs could be represented as Bayesian prior distributions, illustrated the use of a log-normal distribution to represent the composite UF, and combined this distribution with a log-normal distribution representing uncertainty in the point of departure (POD) to reflect the overall uncertainty. Here, we explore these suggestions and present a refinement of the NRC methodology that considers each individual UF as a distribution. From an examination of 24 evaluations from EPA's IRIS program, when individual UFs were represented using this approach, the geometric mean fold change in the value of the RfD or RfC increased from 3 to over 30, depending on the number of individual UFs used and the sophistication of the assessment. We present example calculations and recommendations for implementing the refined NRC methodology. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
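A minimal sketch of the refined approach, assuming each UF and the POD are log-normal and combining them by Monte Carlo; every numeric value below is an illustrative assumption, not an IRIS-derived parameter:

```python
import numpy as np

rng = np.random.default_rng(14)
N = 200_000

# Each uncertainty factor as a log-normal distribution (geometric mean ~3,
# a common half-log default); the POD is also log-normal. Illustrative only.
pod = rng.lognormal(np.log(10.0), 0.3, N)          # POD in mg/kg-day
ufs = [rng.lognormal(np.log(3.0), 0.4, N) for _ in range(3)]

rfd_dist = pod / np.prod(ufs, axis=0)              # POD / composite UF
# A lower percentile of the resulting distribution can serve as the RfD
print(f"5th percentile RfD: {np.percentile(rfd_dist, 5):.4f} mg/kg-day")
```

Because a product of log-normals is itself log-normal (the log-scale means and variances add), the composite UF in this construction stays analytically tractable even as individual UFs are added or removed.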
DOE Office of Scientific and Technical Information (OSTI.GOV)
Raabe, O.G.; Goldman, M.
Since data on the pulmonary toxicity of plutonium in people are not available, estimates must be based upon available experimental animal data. For this purpose, inhalation studies with beagle dogs exposed to aerosols of ²³⁸PuO₂ and ²³⁹PuO₂ were analyzed and a simple model has been proposed to describe apparent dose-response relationships. It was found that for each aerosol and radionuclide form, the cumulative absorbed lung dose that leads to death from lung damage up to 1000 days could be assumed to have a log-normal distribution of values that was independent of time to death. The data were satisfactorily fit to a model in which the time of death postexposure is given by t = K/D̄, with t the time to death, K the cumulative dose to lung tissue (the killing dose), and D̄ the average dose rate to lung tissue from time of exposure to death. The ratios of median K values, normalized to the value for ⁹⁰Sr-Y FAP, indicate a relative biological effectiveness (RBE) of 14 for ²³⁹PuO₂ particles and 5 for ²³⁸PuO₂ particles. This demonstrates an effect of particle specific activity on relative biological effectiveness for early mortality, since an increase in specific activity of particles leads to a lower apparent RBE.
NASA Astrophysics Data System (ADS)
Aguirre, E. E.; Karchewski, B.
2017-12-01
DC resistivity surveying is a geophysical method that quantifies the electrical properties of the subsurface of the earth by applying a source current between two electrodes and measuring potential differences between electrodes at known distances from the source. Analytical solutions for a homogeneous half-space and simple subsurface models are well known, as the former is used to define the concept of apparent resistivity. However, in situ properties are heterogeneous, meaning that simple analytical models are only an approximation, and ignoring such heterogeneity can lead to misinterpretation of survey results, costing time and money. The present study examines the extent to which random variations in electrical properties (i.e., electrical conductivity) affect potential difference readings and therefore apparent resistivities, relative to an assumed homogeneous subsurface model. We simulate the DC resistivity survey using a Finite Difference (FD) approximation of an appropriate simplification of Maxwell's equations implemented in Matlab. Electrical resistivity values at each node in the simulation were defined as random variables with a given mean and variance, and were assumed to follow a log-normal distribution. The Monte Carlo analysis for a given variance of electrical resistivity was performed until the mean and variance in potential difference measured at the surface converged. Finally, we used the simulation results to examine the relationship between variance in resistivity and variation in surface potential difference (or apparent resistivity) relative to a homogeneous half-space model. For relatively low values of standard deviation in the material properties (<10% of mean), we observed a linear correlation between variance of resistivity and variance in apparent resistivity.
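Although the paper's solver is a Matlab FD code, the Monte Carlo skeleton is easy to sketch. In the Python sketch below, a toy forward model (a harmonic average of node resistivities) stands in for the FD Poisson solve, and a helper converts a target arithmetic mean and variance of resistivity into log-normal parameters; the grid size, mean, and spread are assumed values.

```python
import numpy as np

def lognormal_params(mean, var):
    """Convert a target arithmetic mean/variance of resistivity into
    the (mu, sigma) parameters of the underlying normal distribution."""
    sigma2 = np.log(1.0 + var / mean**2)
    return np.log(mean) - 0.5 * sigma2, np.sqrt(sigma2)

def toy_forward_model(rho):
    # Stand-in for the finite-difference solve in the paper: treat the
    # surface response as a harmonic average of node resistivities.
    return len(rho) / np.sum(1.0 / rho)

rng = np.random.default_rng(42)
mean_rho, std_rho = 100.0, 10.0           # ohm-m, std = 10% of mean
mu, sigma = lognormal_params(mean_rho, std_rho**2)

samples = []
for _ in range(5000):                     # in practice: loop until converged
    rho_nodes = rng.lognormal(mu, sigma, size=50 * 50)
    samples.append(toy_forward_model(rho_nodes))

print("mean apparent resistivity:", np.mean(samples))
print("std  apparent resistivity:", np.std(samples))
```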
NASA Astrophysics Data System (ADS)
El-Khadragy, A. A.; Shazly, T. F.; AlAlfy, I. M.; Ramadan, M.; El-Sawy, M. Z.
2018-06-01
An exploration method has been developed that uses surface and aerial gamma-ray spectral measurements for petroleum prospecting in stratigraphic and structural traps. The Gulf of Suez is an important region for studying hydrocarbon potentiality in Egypt. The thorium normalization technique was applied to the sandstone reservoirs in the region to delineate zones of hydrocarbon potential using the three spectrometric radioactive gamma-ray logs (eU, eTh, and K% logs). This method was applied to the recorded gamma-ray spectrometric logs for the Rudeis and Kareem Formations in the Ras Ghara oil field, Gulf of Suez, Egypt. The conventional well logs (gamma-ray, resistivity, neutron, density, and sonic logs) were analyzed to determine the net pay zones in the study area. The agreement ratios between the thorium normalization technique and the results of the well log analyses are high, so the thorium normalization technique can be used as a guide to hydrocarbon accumulation in the studied reservoir rocks.
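For readers unfamiliar with the technique, the sketch below shows one common formulation of thorium normalization: the uranium expected from the regional eU/eTh ratio is subtracted from the measured eU, and positive residuals flag uranium enrichment that may accompany hydrocarbons. This is an assumed, generic formulation with made-up readings; the paper may use a variant.

```python
import numpy as np

def thorium_normalized_uranium(eU, eTh):
    """Uranium anomaly after thorium normalization: subtract from the
    measured eU the uranium expected from the regional eU/eTh ratio,
    so positive residuals indicate excess (possibly hydrocarbon-related)
    uranium. One common formulation, assumed here for illustration."""
    eU = np.asarray(eU, dtype=float)
    eTh = np.asarray(eTh, dtype=float)
    ratio = eU.mean() / eTh.mean()        # regional background ratio
    return eU - ratio * eTh

# Hypothetical spectrometric log readings (ppm) down a well.
eU = [2.1, 2.4, 3.8, 2.0, 4.2]
eTh = [8.0, 9.1, 8.5, 7.9, 8.8]
print(thorium_normalized_uranium(eU, eTh))
```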
Characterizing the topology of probabilistic biological networks.
Todor, Andrei; Dobra, Alin; Kahveci, Tamer
2013-01-01
Biological interactions are often uncertain events that may or may not take place with some probability. This uncertainty leads to a massive number of alternative interaction topologies for each such network. The existing studies analyze the degree distribution of biological networks by assuming that all the given interactions take place under all circumstances. This strong and often incorrect assumption can lead to misleading results. In this paper, we address this problem and develop a sound mathematical basis to characterize networks in the presence of uncertain interactions. Using our mathematical representation, we develop a method that can accurately describe the degree distribution of such networks. We also take one more step and extend our method to accurately compute the joint-degree distributions of node pairs connected by edges. The number of possible network topologies grows exponentially with the number of uncertain interactions. However, the mathematical model we develop allows us to compute these degree distributions in polynomial time in the number of interactions. Our method works quickly even for entire protein-protein interaction (PPI) networks. It also helps us find an adequate mathematical model using MLE. We perform a comparative study of node-degree and joint-degree distributions in two types of biological networks: the classical deterministic networks and the more flexible probabilistic networks. Our results confirm that power-law and log-normal models best describe degree distributions for both probabilistic and deterministic networks. Moreover, the inverse correlation of degrees of neighboring nodes shows that, in probabilistic networks, nodes with a large number of interactions prefer to interact with those with a small number of interactions more frequently than expected. We also show that probabilistic networks are more robust for node-degree distribution computation than the deterministic ones. All the data sets used, the software implemented and the alignments found in this paper are available at http://bioinformatics.cise.ufl.edu/projects/probNet/.
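The polynomial-time claim is easiest to see for a single node: under independent, uncertain edges its degree follows a Poisson-binomial law, which a dynamic program evaluates exactly without enumerating the exponentially many topologies. The sketch below is not the authors' code, just a minimal illustration of that idea.

```python
import numpy as np

def degree_distribution(edge_probs):
    """Exact degree distribution of one node whose incident edges exist
    independently with the given probabilities (a Poisson-binomial law),
    computed by dynamic programming in O(n^2) time rather than by
    enumerating the 2^n alternative topologies."""
    pmf = np.array([1.0])                  # P(degree = 0) before any edge
    for p in edge_probs:
        new = np.zeros(len(pmf) + 1)
        new[:-1] += pmf * (1.0 - p)        # edge absent: degree unchanged
        new[1:] += pmf * p                 # edge present: degree + 1
        pmf = new
    return pmf

# A node with four uncertain interactions:
print(degree_distribution([0.9, 0.5, 0.5, 0.1]))  # P(deg = 0..4), sums to 1
```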
Šmarda, Petr; Bureš, Petr; Horová, Lucie
2007-01-01
Background and Aims The spatial and statistical distribution of genome sizes and the adaptivity of genome size to some types of habitat, vegetation or microclimatic conditions were investigated in a tetraploid population of Festuca pallens. The population was previously documented to vary highly in genome size and is taken as a model for the study of the initial stages of genome size differentiation. Methods Using DAPI flow cytometry, samples were measured repeatedly with diploid Festuca pallens as the internal standard. Altogether 172 plants from 57 plots (2·25 m2), distributed in contrasting habitats over the whole locality in South Moravia, Czech Republic, were sampled. The differences in DNA content were confirmed by the double peaks of simultaneously measured samples. Key Results At maximum, a 1·115-fold difference in genome size was observed. The statistical distribution of genome sizes was found to be continuous and best fits the extreme-value (Gumbel) distribution with rare occurrences of extremely large genomes (positively skewed), similar to the log-normal distribution reported for the Angiosperms as a whole. Even plants from the same plot frequently varied considerably in genome size, and the spatial distribution of genome sizes was generally random and not autocorrelated (P > 0·05). The observed spatial pattern and the overall lack of correlations of genome size with recognized vegetation types or microclimatic conditions indicate the absence of ecological adaptivity of genome size in the studied population. Conclusions These experimental data on intraspecific genome size variability in Festuca pallens argue for the absence of natural selection and the selective non-significance of genome size in the initial stages of genome size differentiation, and corroborate the current hypothetical model of genome size evolution in Angiosperms (Bennetzen et al., 2005, Annals of Botany 95: 127–132). PMID:17565968
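Choosing between a Gumbel and a log-normal description of such data is a routine model-selection exercise. The sketch below fits both (plus a normal baseline) with scipy and compares AICs; the synthetic, Gumbel-like "genome sizes" merely stand in for the 172 measurements.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Stand-in for the 172 genome-size measurements (arbitrary units).
genome_sizes = rng.gumbel(loc=10.0, scale=0.3, size=172)

candidates = {
    "gumbel":     stats.gumbel_r,
    "log-normal": stats.lognorm,
    "normal":     stats.norm,
}
for name, dist in candidates.items():
    params = dist.fit(genome_sizes)                      # maximum likelihood
    loglik = np.sum(dist.logpdf(genome_sizes, *params))
    aic = 2 * len(params) - 2 * loglik                   # lower is better
    print(f"{name:10s} AIC = {aic:.1f}")
```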
Polynomial probability distribution estimation using the method of moments.
Munkhammar, Joakim; Mattsson, Lars; Rydén, Jesper
2017-01-01
We suggest a procedure for estimating Nth degree polynomial approximations to unknown (or known) probability density functions (PDFs) based on N statistical moments from each distribution. The procedure is based on the method of moments and is set up algorithmically to aid applicability and to ensure rigor in use. In order to show applicability, polynomial PDF approximations are obtained for the distribution families Normal, Log-Normal, Weibull as well as for a bimodal Weibull distribution and a data set of anonymized household electricity use. The results are compared with results for traditional PDF series expansion methods of Gram-Charlier type. It is concluded that this is a comparatively simple procedure that could be used when traditional distribution families are not applicable or when polynomial expansions of probability distributions might be considered useful approximations. In particular this approach is practical for calculating convolutions of distributions, since such operations become integrals of polynomial expressions. Finally, in order to show an advanced applicability of the method, it is shown to be useful for approximating solutions to the Smoluchowski equation.
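The core of the method reduces to a linear solve: on a bounded support, the moments of a polynomial density are closed-form in its coefficients. The sketch below is my own minimal version of that idea, not the authors' algorithm; it matches the first moments exactly and recovers the uniform density from its moments as a sanity check.

```python
import numpy as np

def polynomial_pdf(moments, a, b):
    """Fit a polynomial PDF on [a, b] whose moments match the given list
    (moments[k] = E[X^k], with moments[0] = 1 for normalization).
    Since the integral of x^k * x^j over [a, b] is closed-form, moment
    matching is a single linear solve for the coefficients."""
    n = len(moments)
    A = np.empty((n, n))
    for k in range(n):
        for j in range(n):
            A[k, j] = (b**(k + j + 1) - a**(k + j + 1)) / (k + j + 1)
    return np.linalg.solve(A, np.asarray(moments, dtype=float))

# Moments of the uniform distribution on [0, 1]: E[X^k] = 1/(k+1).
coeffs = polynomial_pdf([1, 1/2, 1/3], a=0.0, b=1.0)
print(coeffs)   # ~ [1, 0, 0]: the flat density is recovered
```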
Wavefront-Guided Scleral Lens Correction in Keratoconus
Marsack, Jason D.; Ravikumar, Ayeswarya; Nguyen, Chi; Ticak, Anita; Koenig, Darren E.; Elswick, James D.; Applegate, Raymond A.
2014-01-01
Purpose To examine the performance of state-of-the-art wavefront-guided scleral contact lenses (wfgSCLs) on a sample of keratoconic eyes, with emphasis on performance quantified with visual quality metrics; and to provide a detailed discussion of the process used to design, manufacture and evaluate wfgSCLs. Methods Fourteen eyes of 7 subjects with keratoconus were enrolled and a wfgSCL was designed for each eye. High-contrast visual acuity and visual quality metrics were used to assess the on-eye performance of the lenses. Results The wfgSCL provided statistically lower levels of both lower-order RMS (p < 0.001) and higher-order RMS (p < 0.02) than an intermediate spherical equivalent scleral contact lens. The wfgSCL provided lower levels of lower-order RMS than a normal group of well-corrected observers (p << 0.001). However, the wfgSCL does not provide less higher-order RMS than the normal group (p = 0.41). Of the 14 eyes studied, 10 successfully reached the exit criteria, achieving residual higher-order root mean square wavefront error (HORMS) less than or within 1 SD of the levels experienced by normal, age-matched subjects. In addition, measures of visual image quality (logVSX, logNS and logLIB) for the 10 eyes were well distributed within the range of values seen in normal eyes. However, visual performance as measured by high contrast acuity did not reach normal, age-matched levels, which is in agreement with prior results associated with the acute application of wavefront correction to KC eyes. Conclusions Wavefront-guided scleral contact lenses are capable of optically compensating for the deleterious effects of higher-order aberration concomitant with the disease, and can provide visual image quality equivalent to that seen in normal eyes. Longer duration studies are needed to assess whether the visual system of the highly aberrated eye wearing a wfgSCL is capable of producing visual performance levels typical of the normal population. PMID:24830371
Lo, Kenneth; Gottardo, Raphael
2012-01-01
Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.
NASA Astrophysics Data System (ADS)
Posacki, Silvia; Cappellari, Michele; Treu, Tommaso; Pellegrini, Silvia; Ciotti, Luca
2015-01-01
We present an investigation about the shape of the initial mass function (IMF) of early-type galaxies (ETGs), based on a joint lensing and dynamical analysis, and on stellar population synthesis models, for a sample of 55 lens ETGs identified by the Sloan Lens Advanced Camera for Surveys (SLACS). We construct axisymmetric dynamical models based on the Jeans equations which allow for orbital anisotropy and include a dark matter halo. The models reproduce in detail the observed Hubble Space Telescope photometry and are constrained by the total projected mass within the Einstein radius and the stellar velocity dispersion (σ) within the Sloan Digital Sky Survey fibres. Comparing the dynamically-derived stellar mass-to-light ratios (M*/L)dyn, obtained for an assumed halo slope ρh ∝ r⁻¹, to the stellar population ones (M*/L)Salp, derived from full-spectrum fitting and assuming a Salpeter IMF, we infer the mass normalization of the IMF. Our results confirm the previous analysis by the SLACS team that the mass normalization of the IMF of high-σ galaxies is consistent on average with a Salpeter slope. Our study allows for a fully consistent study of the trend between IMF and σ for both the SLACS and ATLAS3D samples, which explore quite different σ ranges. The two samples are highly complementary, the first being essentially σ selected, and the latter volume-limited and nearly mass selected. We find that the two samples merge smoothly into a single trend of the form log α = (0.38 ± 0.04) log(σe/200 km s⁻¹) − (0.06 ± 0.01), where α = (M*/L)dyn/(M*/L)Salp and σe is the luminosity-averaged σ within one effective radius Re. This is consistent with a systematic variation of the IMF normalization from Kroupa to Salpeter in the interval σe ≈ 90-270 km s⁻¹.
Are star formation rates of galaxies bimodal?
NASA Astrophysics Data System (ADS)
Feldmann, Robert
2017-09-01
Star formation rate (SFR) distributions of galaxies are often assumed to be bimodal with modes corresponding to star-forming and quiescent galaxies, respectively. Both classes of galaxies are typically studied separately, and SFR distributions of star-forming galaxies are commonly modelled as lognormals. Using both observational data and results from numerical simulations, I argue that this division into star-forming and quiescent galaxies is unnecessary from a theoretical point of view and that the SFR distributions of the whole population can be well fitted by zero-inflated negative binomial distributions. This family of distributions has three parameters that determine the average SFR of the galaxies in the sample, the scatter relative to the star-forming sequence and the fraction of galaxies with zero SFRs, respectively. The proposed distributions naturally account for (I) the discrete nature of star formation, (II) the presence of 'dead' galaxies with zero SFRs and (III) asymmetric scatter. Excluding 'dead' galaxies, the distribution of log SFR is unimodal with a peak at the star-forming sequence and an extended tail towards low SFRs. However, uncertainties and biases in the SFR measurements can create the appearance of a bimodal distribution.
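A zero-inflated negative binomial is straightforward to write down. The sketch below gives its pmf with scipy, using a mean/dispersion parameterization plus an extra zero-probability for "dead" galaxies; the parameter names and values are illustrative, not the paper's notation or fits.

```python
import numpy as np
from scipy import stats

def zinb_pmf(k, mean_sfr, alpha, pi_zero):
    """Zero-inflated negative binomial pmf: with probability pi_zero a
    galaxy is 'dead' (SFR count = 0); otherwise counts follow a negative
    binomial with the given mean and dispersion alpha."""
    k = np.asarray(k)
    r = 1.0 / alpha                        # NB 'number of successes'
    p = r / (r + mean_sfr)                 # NB success probability
    pmf = (1.0 - pi_zero) * stats.nbinom.pmf(k, r, p)
    return np.where(k == 0, pi_zero + pmf, pmf)

k = np.arange(6)
print(zinb_pmf(k, mean_sfr=2.0, alpha=0.5, pi_zero=0.2))
```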
Leão, William L; Abanto-Valle, Carlos A; Chen, Ming-Hui
2017-01-01
A stochastic volatility-in-mean model with correlated errors using the generalized hyperbolic skew Student-t (GHST) distribution provides a robust alternative to the parameter estimation for daily stock returns in the absence of normality. An efficient Markov chain Monte Carlo (MCMC) sampling algorithm is developed for parameter estimation. The deviance information, the Bayesian predictive information and the log-predictive score criterion are used to assess the fit of the proposed model. The proposed method is applied to an analysis of the daily stock return data from the Standard & Poor's 500 index (S&P 500). The empirical results reveal that the stochastic volatility-in-mean model with correlated errors and GH-ST distribution leads to a significant improvement in the goodness-of-fit for the S&P 500 index returns dataset over the usual normal model.
Devos, Stefanie; Cox, Bianca; van Lier, Tom; Nawrot, Tim S; Putman, Koen
2016-09-01
We used log-linear and log-log exposure-response (E-R) functions to model the association between PM2.5 exposure and non-elective hospitalizations for pneumonia, and estimated the attributable hospital costs by using the effect estimates obtained from both functions. We used hospital discharge data on 3519 non-elective pneumonia admissions from UZ Brussels between 2007 and 2012, and we combined a case-crossover design with distributed lag models. The annual averted pneumonia hospitalization costs for a reduction in PM2.5 exposure from the mean (21.4 μg/m³) to the WHO guideline for annual mean PM2.5 (10 μg/m³) were estimated and extrapolated for Belgium. Non-elective hospitalizations for pneumonia were significantly associated with PM2.5 exposure in both models. Using a log-linear E-R function, the estimated risk reduction for pneumonia hospitalization associated with a decrease in mean PM2.5 exposure to 10 μg/m³ was 4.9%. The corresponding estimate for the log-log model was 10.7%. These estimates translate to an annual pneumonia hospital cost saving in Belgium of €15.5 million and almost €34 million for the log-linear and log-log E-R function, respectively. Although further research is required to assess the shape of the association between PM2.5 exposure and pneumonia hospitalizations, we demonstrated that estimates for health effects and associated costs heavily depend on the assumed E-R function. These results are important for policy making, as supra-linear E-R associations imply that significant health benefits may still be obtained from additional pollution control measures in areas where PM levels have already been reduced.
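The dependence on the assumed E-R shape follows directly from the two functional forms. The sketch below contrasts them; the slope values are illustrative, chosen only so that the two forms roughly reproduce the 4.9% and 10.7% reductions quoted above, and are not the paper's fitted coefficients.

```python
import numpy as np

def risk_reduction_loglinear(beta, x_from, x_to):
    # log(rate) = a + beta * x  =>  relative risk = exp(beta * (x_to - x_from))
    return 1.0 - np.exp(beta * (x_to - x_from))

def risk_reduction_loglog(beta, x_from, x_to):
    # log(rate) = a + beta * log(x)  =>  relative risk = (x_to / x_from) ** beta
    return 1.0 - (x_to / x_from) ** beta

# Illustrative slopes (not the paper's estimates), Brussels-like scenario:
beta_lin, beta_log = 0.0044, 0.155
print(risk_reduction_loglinear(beta_lin, 21.4, 10.0))  # ~ 4.9 %
print(risk_reduction_loglog(beta_log, 21.4, 10.0))     # ~ 11 %
```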
An Empirical Study of Synchrophasor Communication Delay in a Utility TCP/IP Network
NASA Astrophysics Data System (ADS)
Zhu, Kun; Chenine, Moustafa; Nordström, Lars; Holmström, Sture; Ericsson, Göran
2013-07-01
Although there is a plethora of literature dealing with Phasor Measurement Unit (PMU) communication delay, there has not been any effort made to generalize empirical delay results by identifying the distribution with the best fit. The existing studies typically assume a distribution or simply build on analogies to communication network routing delay. Specifically, this study provides insight into the characterization of the communication delay of both unprocessed PMU data and synchrophasors sorted by a Phasor Data Concentrator (PDC). The results suggest that a bi-modal distribution containing two normal distributions offers the best fit of the delay of the unprocessed data, whereas the delay profile of the sorted synchrophasors resembles a normal distribution. Based on these results, the possibility of evaluating the reliability of a synchrophasor application with respect to a particular choice of PDC timeout is discussed.
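A two-component Gaussian mixture of the kind described can be fitted in a few lines. The sketch below uses scikit-learn on synthetic delays standing in for the unprocessed PMU data; the means, spreads, and weights are made up.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(7)
# Stand-in for unprocessed PMU delay measurements (ms): two overlapping
# normal modes, e.g. two network paths or queueing regimes.
delays = np.concatenate([rng.normal(18.0, 1.5, 4000),
                         rng.normal(26.0, 2.5, 1000)])

gmm = GaussianMixture(n_components=2, random_state=0)
gmm.fit(delays.reshape(-1, 1))             # sklearn expects a 2-D array
print("weights:", gmm.weights_)
print("means  :", gmm.means_.ravel())
print("stds   :", np.sqrt(gmm.covariances_).ravel())
```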
Evaluation of bacterial run and tumble motility parameters through trajectory analysis
NASA Astrophysics Data System (ADS)
Liang, Xiaomeng; Lu, Nanxi; Chang, Lin-Ching; Nguyen, Thanh H.; Massoudieh, Arash
2018-04-01
In this paper, a method for extracting the behavior parameters of bacterial migration based on the run-and-tumble conceptual model is described. The methodology is applied to microscopic images representing the motile movement of flagellated Azotobacter vinelandii. The bacterial cells are considered to change direction during both runs and tumbles, as is evident from the movement trajectories. An unsupervised cluster analysis was performed to fractionate each bacterial trajectory into run and tumble segments, and then the distributions of parameters for each mode were extracted by fitting the mathematical distributions best representing the data. A Gaussian copula was used to model the autocorrelation in swimming velocity. For both run and tumble modes, the Gamma distribution was found to fit the marginal velocity best, and the logistic distribution was found to represent the deviation angle better than the other distributions considered. For the transition rate distribution, the log-logistic distribution and the log-normal distribution, respectively, were found to fit better than the traditionally assumed exponential distribution. A model was then developed to mimic the motility behavior of bacteria in the presence of flow. The model was applied to evaluate its ability to describe observed patterns of bacterial deposition on surfaces in a micro-model experiment with an approach velocity of 200 μm/s. It was found that the model can qualitatively reproduce the attachment results of the micro-model setting.
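Distribution selection of this kind is typically done by fitting each candidate by maximum likelihood and ranking by an information criterion. The sketch below does this with scipy for transition-rate-like data; the synthetic log-normal draws are only a stand-in for the tracked-cell data, and scipy's fisk is its name for the log-logistic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Stand-in for transition-rate data (1/duration) from tracked cells.
rates = rng.lognormal(mean=0.0, sigma=0.6, size=500)

candidates = {
    "exponential":  stats.expon,      # the traditional assumption
    "log-normal":   stats.lognorm,
    "log-logistic": stats.fisk,       # scipy's log-logistic
    "gamma":        stats.gamma,
}
for name, dist in candidates.items():
    params = dist.fit(rates)
    aic = 2 * len(params) - 2 * np.sum(dist.logpdf(rates, *params))
    print(f"{name:12s} AIC = {aic:.1f}")
```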
NASA Astrophysics Data System (ADS)
Körding, E.; Colbert, E.; Falcke, H.
In recent years, ultra-luminous X-ray sources (ULXs) have received wide attention; however, their true nature is not yet understood. Many explanations have been suggested, including intermediate-mass black holes, super-Eddington accretion flows, anisotropic emission, and relativistic beaming of microquasars. We model the logN-logS distribution of ULXs assuming that each neutron star or black hole XRB can be described by an accretion disk plus jet model, where the jet is relativistically beamed. The distribution can be fit either by intermediate-mass black holes or by stellar mass black holes with mildly relativistic jets. Even though the jet is intrinsically weaker than the accretion disk, relativistic beaming can in the latter approach lead to the high fluxes observed. To further explore the possibility of microblazars contributing to the ULX phenomenon, we have embarked on a radio-monitoring study of ULXs in nearby galaxies with the VLA. However, up to now no radio flare has been detected. Using the radio/X-ray correlation, the upper limits on the radio flux can be converted into upper limits for the black hole masses of MBH ≲ 10³ M⊙.
Paillet, Frederick L.
1985-01-01
Acoustic-waveform and acoustic-televiewer logs were obtained for a 400-meter interval of deeply buried basalt flows in three boreholes, and over shorter intervals in two additional boreholes located on the U.S. Department of Energy's Hanford site in Benton County, Washington. Borehole-wall breakouts were observed in the unaltered interiors of a large part of individual basalt flows; however, several of the flows in one of the five boreholes had almost no breakouts. The distribution of breakouts observed on the televiewer logs correlated closely with the incidence of core disking in some intervals, but the correlation was not always perfect, perhaps because of differences in the specific fracture mechanisms involved. Borehole-wall breakouts were consistently located on the east and west sides of the boreholes. The orientation is consistent with previous estimates of the principal horizontal-stress field in south-central Washington, if breakouts are assumed to form along the azimuth of the least principal stress. The distribution of breakouts repeatedly indicated an interval of breakout-free rock at the top and bottom of flows. Because breakouts frequently terminate at major low-angle fractures, the data indicate that fracturing may have relieved some of the horizontal stresses near flow tops and bottoms. Unaltered and unfractured basalt appeared to have a uniform compressional velocity of 6.0 ± 0.1 km/sec and a uniform shear velocity of 3.35 ± 0.1 km/sec throughout flow interiors. Acoustic-waveform logs also indicated that borehole-wall breakouts did not affect acoustic propagation along the borehole, so fracturing associated with the formation of breakouts appeared to be confined to a thin annulus of stress concentration around the borehole. Televiewer logs obtained before and after hydraulic fracturing in these boreholes indicated the extent of induced fractures, and also indicated minor changes to pre-existing fractures that may have been inflated during fracture generation. (USGS)
On the issues of probability distribution of GPS carrier phase observations
NASA Astrophysics Data System (ADS)
Luo, X.; Mayer, M.; Heck, B.
2009-04-01
In common practice the observables related to the Global Positioning System (GPS) are assumed to follow a Gauss-Laplace normal distribution. Actually, full knowledge of the observables' distribution is not required for parameter estimation by means of the least-squares algorithm, which is based on the functional relation between observations and unknown parameters as well as the associated variance-covariance matrix. However, the probability distribution of GPS observations plays a key role in procedures for quality control (e.g. outlier and cycle-slip detection, ambiguity resolution) and in reliability-related assessments of the estimation results. Under non-ideal observation conditions with respect to the factors impacting GPS data quality, for example multipath effects and atmospheric delays, the validity of the normal distribution postulate of GPS observations is in doubt. This paper presents a detailed analysis of the distribution properties of GPS carrier phase observations using double difference residuals. For this purpose, 1-Hz observation data from the permanent SAPOS
Correlation between size distribution and luminescence properties of spool-shaped InAs quantum dots
NASA Astrophysics Data System (ADS)
Xie, H.; Prioli, R.; Torelly, G.; Liu, H.; Fischer, A. M.; Jakomin, R.; Mourão, R.; Kawabata, R.; Pires, M. P.; Souza, P. L.; Ponce, F. A.
2017-05-01
InAs QDs embedded in an AlGaAs matrix have been produced by MOVPE with a partial capping and annealing technique to achieve controllable QD energy levels that could be useful for solar cell applications. The resulting spool-shaped QDs are around 5 nm in height and have a log-normal diameter distribution, observed by TEM to range from 5 to 15 nm. Two photoluminescence peaks associated with QD emission are attributed to the ground-state and first-excited-state transitions. The luminescence peak width is correlated with the distribution of QD diameters through the diameter-dependent QD energy levels.
Hot gas in the cold dark matter scenario: X-ray clusters from a high-resolution numerical simulation
NASA Technical Reports Server (NTRS)
Kang, Hyesung; Cen, Renyue; Ostriker, Jeremiah P.; Ryu, Dongsu
1994-01-01
A new, three-dimensional, shock-capturing hydrodynamic code is utilized to determine the distribution of hot gas in a standard cold dark matter (CDM) model of the universe. Periodic boundary conditions are assumed: a box with size 85 h⁻¹ Mpc having cell size 0.31 h⁻¹ Mpc is followed in a simulation with 270³ = 10^7.3 cells. Adopting standard parameters determined from COBE and light-element nucleosynthesis, σ₈ = 1.05, Ω_b = 0.06, and assuming h = 0.5, we find the X-ray-emitting clusters and compute the luminosity function at several wavelengths, the temperature distribution, and estimated sizes, as well as the evolution of these quantities with redshift. We find that most of the total X-ray emissivity in our box originates in a relatively small number of identifiable clusters which occupy approximately 10⁻³ of the box volume. This standard CDM model, normalized to COBE, produces approximately 5 times too much emission from clusters having L_x > 10⁴³ ergs/s, a not-unexpected result. If all other parameters were unchanged, we would expect adequate agreement for σ₈ = 0.6. This provides a new and independent argument for lower small-scale power than standard CDM at the 8 h⁻¹ Mpc scale. The background radiation field at 1 keV due to clusters in this model is approximately one-third of the observed background, which, after correction for numerical effects, again indicates approximately 5 times too much emission and the appropriateness of σ₈ = 0.6. If we had used the observed ratio of gas to total mass in clusters, rather than basing the mean density on light-element nucleosynthesis, then the computed luminosity of each cluster would have increased still further, by a factor of approximately 10. The number density of clusters increases to z ≈ 1, but the luminosity per typical cluster decreases, with the result that evolution in the number density of bright clusters is moderate in this redshift range, showing a broad peak near z = 0.7, and then a rapid decline above redshift z = 3. Detailed computations of the luminosity functions in the range L_x = 10⁴⁰-10⁴⁴ ergs/s in various energy bands are presented for both cluster central regions and total luminosities to be used in comparison with ROSAT and other observational data sets. The quantitative results found disagree significantly with those found by other investigators using semianalytic techniques. We find little dependence of core radius on cluster luminosity and a dependence of temperature on luminosity given by log kT_x = A + B log L_x, which is slightly steeper (B = 0.38) than is indicated by observations. Computed temperatures are somewhat higher than observed, as expected, in that COBE-normalized CDM has too much power on the relevant scales. A modest average temperature gradient is found, with temperatures dropping to 90% of central values at 0.4 h⁻¹ Mpc and 70% of central values at 0.9 h⁻¹ Mpc. Examining the ratio of gas to total mass in the clusters normalized to Ω_B h² = 0.015, and comparing with observations, we conclude, in agreement with White (1991), that the cluster observations argue for an open universe.
APPROXIMATION AND ESTIMATION OF s-CONCAVE DENSITIES VIA RÉNYI DIVERGENCES
Han, Qiyang; Wellner, Jon A.
2017-01-01
In this paper, we study the approximation and estimation of s-concave densities via Rényi divergence. We first show that the approximation of a probability measure Q by an s-concave density exists and is unique via the procedure of minimizing a divergence functional proposed by [Ann. Statist. 38 (2010) 2998–3027] if and only if Q admits full-dimensional support and a first moment. We also show continuity of the divergence functional in Q: if Qn → Q in the Wasserstein metric, then the projected densities converge in weighted L1 metrics and uniformly on closed subsets of the continuity set of the limit. Moreover, directional derivatives of the projected densities also enjoy local uniform convergence. This contains both on-the-model and off-the-model situations, and entails strong consistency of the divergence estimator of an s-concave density under mild conditions. One interesting and important feature for the Rényi divergence estimator of an s-concave density is that the estimator is intrinsically related with the estimation of log-concave densities via maximum likelihood methods. In fact, we show that for d = 1 at least, the Rényi divergence estimators for s-concave densities converge to the maximum likelihood estimator of a log-concave density as s ↗ 0. The Rényi divergence estimator shares similar characterizations as the MLE for log-concave distributions, which allows us to develop pointwise asymptotic distribution theory assuming that the underlying density is s-concave. PMID:28966410
NASA Astrophysics Data System (ADS)
Bartiko, Daniel; Chaffe, Pedro; Bonumá, Nadia
2017-04-01
Floods may be strongly affected by climate, land-use, land-cover and water infrastructure changes. However, it is common to model this process as stationary. This approach has been questioned, especially when it involves estimating the frequency and magnitude of extreme events for designing and maintaining hydraulic structures, such as those responsible for flood control and dam safety. Brazil is the third largest producer of hydroelectricity in the world and many of the country's dams are located in the Southern Region. It therefore seems appropriate to investigate the presence of non-stationarity in the inflows to these plants. In our study, we used historical flood data from the Brazilian National Grid Operator (ONS) to explore trends in annual maxima in river flow of the 38 main rivers flowing to Southern Brazilian reservoirs (records range from 43 to 84 years). In the analysis, we assumed a two-parameter log-normal distribution, and a linear regression model was applied in order to allow the mean to vary with time. We computed recurrence reduction factors to characterize changes in the return period of a 100-year flood initially estimated by a stationary log-normal model. To evaluate whether or not a particular site exhibits a positive trend, we only considered data series with linear regression slope coefficients significant at p < 0.05, calculated using a one-sided Student's t-test. The trend model residuals were analyzed using the Anderson-Darling normality test, the Durbin-Watson test for independence and the Breusch-Pagan test for heteroscedasticity. Our results showed that 22 of the 38 data series analyzed have a significant positive trend. The trends were mainly in three large basins: Iguazu, Uruguay and Paranapanema, which have undergone land-use changes and flow regularization in recent years. For the series with a positive trend, the return period of a 100-year flood estimated by the stationary model dropped to between 50 and 77 years when considering a planning horizon of ten years. We conclude that attention should be given to future projects developed in this area, including the incorporation of non-stationarity analysis, the search for the causes of such changes and the incorporation of new data to increase the reliability of the estimates.
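The sketch below reproduces the core of this analysis on synthetic data: a two-parameter log-normal fitted with a mean that drifts linearly in time, then the effective return period of the stationary 100-year flood evaluated ten years past the record end. The trend size, scatter, and record length are made-up values, not the ONS series.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
# Hypothetical annual-maximum series: log-flows with a linear trend.
years = np.arange(60)
logq = 5.0 + 0.004 * years + rng.normal(0.0, 0.25, size=years.size)

# Stationary log-normal fit and its 100-year flood (99th percentile).
q100 = np.exp(stats.norm.ppf(0.99, loc=logq.mean(), scale=logq.std(ddof=1)))

# Non-stationary fit: mean varies linearly with time, sigma constant.
slope, intercept, *_ = stats.linregress(years, logq)
resid_sd = np.std(logq - (intercept + slope * years), ddof=2)

# Return period of the old "100-year" flood ten years after the record.
t_future = years[-1] + 10
p_exceed = stats.norm.sf(np.log(q100), loc=intercept + slope * t_future,
                         scale=resid_sd)
print("effective return period:", 1.0 / p_exceed, "years")
```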
Mesner, Larry D.; Valsakumar, Veena; Karnani, Neerja; Dutta, Anindya; Hamlin, Joyce L.; Bekiranov, Stefan
2011-01-01
We have used a novel bubble-trapping procedure to construct nearly pure and comprehensive human origin libraries from early S- and log-phase HeLa cells, and from log-phase GM06990, a karyotypically normal lymphoblastoid cell line. When hybridized to ENCODE tiling arrays, these libraries illuminated 15.3%, 16.4%, and 21.8% of the genome in the ENCODE regions, respectively. Approximately half of the origin fragments cluster into zones, and their signals are generally higher than those of isolated fragments. Interestingly, initiation events are distributed about equally between genic and intergenic template sequences. While only 13.2% and 14.0% of genes within the ENCODE regions are actually transcribed in HeLa and GM06990 cells, 54.5% and 25.6% of zonal origin fragments overlap transcribed genes, most with activating chromatin marks in their promoters. Our data suggest that cell synchronization activates a significant number of inchoate origins. In addition, HeLa and GM06990 cells activate remarkably different origin populations. Finally, there is only moderate concordance between the log-phase HeLa bubble map and published maps of small nascent strands for this cell line. PMID:21173031
Zhang, Peng; Luo, Dandan; Li, Pengfei; Sharpsten, Lucie; Medeiros, Felipe A.
2015-01-01
Glaucoma is a progressive disease due to damage in the optic nerve with associated functional losses. Although the relationship between structural and functional progression in glaucoma is well established, there is disagreement on how this association evolves over time. In addressing this issue, we propose a new class of non-Gaussian linear-mixed models to estimate the correlations among subject-specific effects in multivariate longitudinal studies with a skewed distribution of random effects, to be used in a study of glaucoma. This class provides an efficient estimation of subject-specific effects by modeling the skewed random effects through the log-gamma distribution. It also provides more reliable estimates of the correlations between the random effects. To validate the log-gamma assumption against the usual normality assumption of the random effects, we propose a lack-of-fit test using the profile likelihood function of the shape parameter. We apply this method to data from a prospective observation study, the Diagnostic Innovations in Glaucoma Study, to present a statistically significant association between structural and functional change rates that leads to a better understanding of the progression of glaucoma over time. PMID:26075565
Reconstruction of doses and deposition in the western trace from the Chernobyl accident.
Sikkeland, T; Skuterud, L; Goltsova, N I; Lindmo, T
1997-05-01
A model is presented for the explosive cloud of particulates that produced the western trace of high radioactive ground contamination in the Chernobyl accident on 26 April 1986. The model was developed to reproduce measured dose rates and nuclide contamination and to relate estimated doses to observed changes in: (1) infrared emission from the foliage and (2) morphological and histological structures of individual pines. Dominant factors involved in ground contamination were initial cloud shape, particle size distribution, and rate of particle fallout. At time of formation, the cloud was assumed to be parabolical and to contain a homogeneous distribution of spherically shaped fuel particulates having a log-normal size distribution. The particulates were dispersed by steady winds and diffusion that produced a straight-line deposition path. The analysis indicates that two clouds, denoted by Cloud I and Cloud II, were involved. Fallout from the former dominated the far-field region and fallout from the latter the region near the reactor. At formation they had a full width at half maximum of 1800 m and 500 m, respectively. For wind velocities of 5-10 m s⁻¹ the particulates' radial distribution at formation had a standard deviation and mode of 1.8 μm and 0.5 μm, respectively. This distribution corresponds to a release of 390 GJ in the runaway explosion. The clouds' height and mass are not uniquely determined but are coupled together. For an initial height of 3,600 m, Cloud I contained about 400 kg fuel. For Cloud II the values were, respectively, 1,500 m and 850 kg. Loss of activities from the clouds is found to be small. Values are obtained for the rate of radionuclide migration from the deposit. Various types of biological damage to pines, as reported in the literature, are shown to be mainly due to ionizing radiation from the deposit by Cloud II. A formula is presented for the particulate size distribution in the trace area.
The magnetic field and abundance distribution geometry of the peculiar A star 53 Camelopardalis
NASA Astrophysics Data System (ADS)
Landstreet, J. D.
1988-03-01
New spectra have been obtained of the magnetic Ap star 53 Cam, well spaced through its 8.03 day rotation period, covering the spectral regions λλ3900-3960 and 4250-4315. These data, and previously obtained Hβ Zeeman analyzer observations of the longitudinal field strength, have been used to derive models of the magnetic field geometry and the abundance distributions of Ca, Cr, Fe, Sr, and Ti. The models have been obtained by use of a new line synthesis program that incorporates the effects of an assumed magnetic field and abundance distribution into the calculation of line profiles. Calculated profiles are compared with observations. The model is used to derive a radius of R/R⊙ = 2.3 ± 0.4, a luminosity of log L/L⊙ = 1.4 ± 0.17, and a mass of M/M⊙ = 2.0 ± 0.3 for 53 Cam.
Superstatistical generalised Langevin equation: non-Gaussian viscoelastic anomalous diffusion
NASA Astrophysics Data System (ADS)
Ślęzak, Jakub; Metzler, Ralf; Magdziarz, Marcin
2018-02-01
Recent advances in single particle tracking and supercomputing techniques demonstrate the emergence of normal or anomalous, viscoelastic diffusion in conjunction with non-Gaussian distributions in soft, biological, and active matter systems. We here formulate a stochastic model based on a generalised Langevin equation in which non-Gaussian shapes of the probability density function and normal or anomalous diffusion have a common origin, namely a random parametrisation of the stochastic force. We perform a detailed analysis demonstrating how various types of parameter distributions for the memory kernel result in exponential, power law, or power-log law tails of the memory functions. The studied system is also shown to exhibit a further unusual property: the velocity has a Gaussian one point probability density but non-Gaussian joint distributions. This behaviour is reflected in the relaxation from a Gaussian to a non-Gaussian distribution observed for the position variable. We show that our theoretical results are in excellent agreement with stochastic simulations.
The uncertainty of nitrous oxide emissions from grazed grasslands: A New Zealand case study
NASA Astrophysics Data System (ADS)
Kelliher, Francis M.; Henderson, Harold V.; Cox, Neil R.
2017-01-01
Agricultural soils emit nitrous oxide (N2O), a greenhouse gas and the primary source of nitrogen oxides which deplete stratospheric ozone. Agriculture has been estimated to be the largest anthropogenic N2O source. In New Zealand (NZ), pastoral agriculture uses half the land area. To estimate the annual N2O emissions from NZ's agricultural soils, the nitrogen (N) inputs have been determined and multiplied by an emission factor (EF), the mass fraction of N inputs emitted as N2O-N. To estimate the associated uncertainty, we developed an analytical method. For comparison, another estimate was determined by Monte Carlo numerical simulation. For both methods, expert judgement was used to estimate the N input uncertainty. The EF uncertainty was estimated by meta-analysis of the results from 185 NZ field trials. For the analytical method, assuming a normal distribution and independence of the terms used to calculate the emissions (correlation = 0), the estimated 95% confidence limit was ±57%. When there was a normal distribution and an estimated correlation of 0.4 between N input and EF, the latter inferred from experimental data involving six NZ soils, the analytical method estimated a 95% confidence limit of ±61%. The EF data from 185 NZ field trials had a logarithmic normal distribution. For the Monte Carlo method, assuming a logarithmic normal distribution for EF, a normal distribution for the other terms and independence of all terms, the estimated 95% confidence limits were -32% and +88% or ±60% on average. With the same distribution assumptions and a correlation of 0.4 between N input and EF, the Monte Carlo method's estimated 95% confidence limits were -34% and +94% or ±64% on average. For the analytical and Monte Carlo methods, EF uncertainty accounted for 95% and 83% of the emissions uncertainty when the correlation between N input and EF was 0 and 0.4, respectively. As the first uncertainty analysis of an agricultural soils N2O emissions inventory using "country-specific" field trials to estimate EF uncertainty, this can be a potentially informative case study for the international scientific community.
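The Monte Carlo variant is compact to sketch: N input as a normal variate, EF as a log-normal one, and quantiles of their product giving asymmetric confidence limits like the -32%/+88% reported. All numbers below are placeholders, not the NZ inventory values.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200_000

# Hypothetical inputs: N input ~ normal (expert judgement), EF ~ log-normal
# (as found for the NZ field trials); values illustrative only.
n_input = rng.normal(1.0e9, 0.1e9, n)        # kg N / yr
ef = rng.lognormal(np.log(0.01), 0.3, n)     # fraction emitted as N2O-N

emissions = n_input * ef * 44.0 / 28.0       # kg N2O / yr (N2O-N -> N2O mass)

low, mid, high = np.quantile(emissions, [0.025, 0.5, 0.975])
# Asymmetric limits, since the log-normal EF skews the product.
print(f"95% interval: {low/mid - 1:+.0%} to {high/mid - 1:+.0%} of median")
```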
Estimating the Classification Efficiency of a Test Battery.
ERIC Educational Resources Information Center
De Corte, Wilfried
2000-01-01
Shows how a theorem proven by H. Brogden (1951, 1959) can be used to estimate the allocation average (a predictor based classification of a test battery) assuming that the predictor intercorrelations and validities are known and that the predictor variables have a joint multivariate normal distribution. (SLD)
Power law versus exponential state transition dynamics: application to sleep-wake architecture.
Chu-Shore, Jesse; Westover, M Brandon; Bianchi, Matt T
2010-12-02
Despite the common experience that interrupted sleep has a negative impact on waking function, the features of human sleep-wake architecture that best distinguish sleep continuity versus fragmentation remain elusive. In this regard, there is growing interest in characterizing sleep architecture using models of the temporal dynamics of sleep-wake stage transitions. In humans and other mammals, the state transitions defining sleep and wake bout durations have been described with exponential and power law models, respectively. However, sleep-wake stage distributions are often complex, and distinguishing between exponential and power law processes is not always straightforward. Although mono-exponential distributions are distinct from power law distributions, multi-exponential distributions may in fact resemble power laws by appearing linear on a log-log plot. To characterize the parameters that may allow these distributions to mimic one another, we systematically fitted multi-exponential-generated distributions with a power law model, and power law-generated distributions with multi-exponential models. We used the Kolmogorov-Smirnov method to investigate goodness of fit for the "incorrect" model over a range of parameters. The "zone of mimicry" of parameters that increased the risk of mistakenly accepting power law fitting resembled empiric time constants obtained in human sleep and wake bout distributions. Recognizing this uncertainty in model distinction impacts interpretation of transition dynamics (self-organizing versus probabilistic), and the generation of predictive models for clinical classification of normal and pathological sleep architecture.
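The mimicry is easy to reproduce: draw bout durations from a two-exponential mixture, fit a power law, and test the fit. The sketch below does this with scipy, whose pareto plays the power law; the mixture parameters are arbitrary, not the empirical sleep-wake constants.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Wake-bout-like durations from a mixture of two exponentials.
bouts = np.concatenate([rng.exponential(0.5, 3000),
                        rng.exponential(8.0, 1000)])

# Fit a Pareto-type power law and ask whether a KS test rejects it.
b, loc, scale = stats.pareto.fit(bouts, floc=0.0)
ks_stat, p_value = stats.kstest(bouts, "pareto", args=(b, loc, scale))
print(f"power-law exponent ~ {b:.2f}, KS p = {p_value:.3f}")
# A high p here would mean the multi-exponential data 'mimics' a power law;
# parameters in the zone of mimicry make this mistaken acceptance likely.
```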
NASA Astrophysics Data System (ADS)
Jimenez-Pizarro, R.; Rojas, A. M.; Pulido-Guio, A. D.
2012-12-01
The development of environmentally, socially and financially suitable greenhouse gas (GHG) mitigation portfolios requires detailed disaggregation of emissions by activity sector, preferably at the regional level. Bottom-up (BU) emission inventories are intrinsically disaggregated, but although detailed, they are frequently incomplete. Missing and erroneous activity data are rather common in emission inventories of GHG, criteria and toxic pollutants, even in developed countries. The fraction of missing and erroneous data can be rather large in developing-country inventories. In addition, the cost and time for obtaining or correcting this information can be prohibitive or can delay the inventory development. This is particularly true for regional BU inventories in the developing world. Moreover, a rather common practice is to disregard missing data or to arbitrarily impute low default activity or emission values to them, which typically leads to significant underestimation of the total emissions. Our investigation focuses on GHG emissions from fossil fuel combustion in industry in the Bogota Region, composed of Bogota and its adjacent, semi-rural area of influence, the Province of Cundinamarca. We found that the BU inventories for this sub-category substantially underestimate emissions when compared to top-down (TD) estimations based on sub-sector-specific national fuel consumption data and regional energy intensities. Although both BU inventories have a substantial number of missing and evidently erroneous entries, i.e. information on fuel consumption per combustion unit per company, the validated energy use and emission data display clear and smooth frequency distributions, which can be adequately fitted to bimodal log-normal distributions. This is not unexpected, as industrial plant sizes are typically log-normally distributed. Moreover, our statistical tests suggest that industrial sub-sectors, as classified by the International Standard Industrial Classification (ISIC), are also well represented by log-normal distributions. Using the validated data, we tested several missing-data estimation procedures, including Monte Carlo sampling of the real and fitted distributions, and a per-ISIC estimation based on bootstrap-calculated mean values. These results will be presented and discussed in detail. Our results suggest that the accuracy of sub-sector BU emission inventories, particularly in developing regions, could be significantly improved if they are designed and carried out to be representative sub-samples (surveys) of the actual universe of emitters. A large fraction of the missing data could be subsequently estimated by robust statistical procedures provided that most of the emitters were accounted for by number and ISIC.
Potential source identification for aerosol concentrations over a site in Northwestern India
NASA Astrophysics Data System (ADS)
Payra, Swagata; Kumar, Pramod; Verma, Sunita; Prakash, Divya; Soni, Manish
2016-03-01
Collocated measurements of the aerosol size distribution (ASD) and aerosol optical thickness (AOT) over Jaipur, the capital of Rajasthan in India, are analyzed using a Grimm aerosol spectrometer and a MICROTOP II Sunphotometer. The contrasting characteristics during the winter and summer seasons of 2011 are investigated in the present study. The total aerosol number concentration (TANC, 0.3-20 μm) was observed to be higher in winter than in summer and was dominated by the fine aerosol number concentration (FANC < 2 μm). Particles smaller than 0.8 μm (at aerodynamic size) constitute ~ 99% of all particles in winter and ~ 90% in summer. However, particles larger than 2 μm contribute ~ 3% and ~ 0.2% in the summer and winter seasons, respectively. The AOT values are nearly similar during summer and winter, but the corresponding Angstrom Exponent (AE) values are lower in summer than in winter. In this work, Potential Source Contribution Function (PSCF) analysis is applied to identify the locations of sources that influenced aerosol concentrations over the study area in the two seasons. PSCF analysis shows that dust particles from the Thar Desert contribute significantly to the coarse aerosol number concentration (CANC). Higher PSCF values to the north of Jaipur point to the industrial areas of northern India as the likely sources of fine particles. The variation in the aerosol size distribution between the two seasons is clearly reflected in the log-normal size distribution curves, which reveal that particles smaller than 0.8 μm are the key contributors to the higher ANC in winter.
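For illustration, a log-normal number size distribution is summarized by its geometric mean diameter and geometric standard deviation; a minimal Python sketch with hypothetical spectrometer channel data (the diameters and concentrations below are placeholders, not the Jaipur measurements):

import numpy as np

# Hypothetical channel mid-diameters (um) and number concentrations (cm^-3).
d = np.array([0.3, 0.4, 0.5, 0.65, 0.8, 1.0, 1.6, 2.0, 3.0, 5.0, 10.0])
n = np.array([850, 610, 390, 210, 95, 40, 12, 5, 2, 0.6, 0.1])

# Number-weighted geometric mean diameter and geometric standard deviation,
# the two parameters of a log-normal size distribution.
log_d = np.log(d)
log_dg = np.average(log_d, weights=n)
log_sg = np.sqrt(np.average((log_d - log_dg) ** 2, weights=n))
dg, sg = np.exp(log_dg), np.exp(log_sg)

frac_sub08 = n[d < 0.8].sum() / n.sum()
print(f"Dg = {dg:.2f} um, sigma_g = {sg:.2f}")
print(f"fraction of particles below 0.8 um: {frac_sub08:.1%}")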
Parameter estimation and forecasting for multiplicative log-normal cascades
NASA Astrophysics Data System (ADS)
Leövey, Andrés E.; Lux, Thomas
2012-04-01
We study the well-known multiplicative log-normal cascade process in which the multiplication of Gaussian and log-normally distributed random variables yields time series with intermittent bursts of activity. Due to the nonstationarity of this process and the combinatorial nature of such a formalism, its parameters have been estimated mostly by fitting the numerical approximation of the associated non-Gaussian probability density function to empirical data, cf. Castaing [Physica D 46, 177 (1990)]. More recently, alternative estimators based upon various moments have been proposed by Beck [Physica D 193, 195 (2004)] and Kiyono [Phys. Rev. E 76, 041113 (2007)]. In this paper, we pursue this moment-based approach further and develop a more rigorous generalized method of moments (GMM) estimation procedure to cope with the documented difficulties of previous methodologies. We show that even under uncertainty about the actual number of cascade steps, our methodology yields very reliable results for the estimated intermittency parameter. Employing the Levinson-Durbin algorithm for best linear forecasts, we also show that the estimated parameters can be used for forecasting the evolution of the turbulent flow. We compare forecasting results from the GMM and Kiyono's procedure via Monte Carlo simulations. We finally test the applicability of our approach by estimating the intermittency parameter and forecasting volatility for a sample of financial data from stock and foreign exchange markets.
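A minimal Python sketch of a discrete multiplicative log-normal cascade of the kind studied above; the number of cascade steps and the intermittency parameter lam2 are illustrative choices, not estimates from the paper:

import numpy as np

rng = np.random.default_rng(1)

def lognormal_cascade(k, lam2):
    # Dyadic multiplicative cascade: each interval is split in two and each
    # half is multiplied by an independent mean-one log-normal weight;
    # lam2 is the intermittency parameter (variance of the log-weights).
    weights = np.ones(1)
    for _ in range(k):
        w = rng.lognormal(mean=-lam2 / 2.0, sigma=np.sqrt(lam2),
                          size=2 * weights.size)
        weights = np.repeat(weights, 2) * w
    return weights

# Cascade "volatility" modulating Gaussian noise gives intermittent bursts.
sigma2 = lognormal_cascade(k=12, lam2=0.2)            # 4096 points
x = np.sqrt(sigma2) * rng.standard_normal(sigma2.size)
kurt = ((x - x.mean()) ** 4).mean() / x.var() ** 2
print(f"kurtosis of x: {kurt:.1f} (Gaussian value: 3)")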
Smooth quantile normalization.
Hicks, Stephanie C; Okrah, Kwame; Paulson, Joseph N; Quackenbush, John; Irizarry, Rafael A; Bravo, Héctor Corrada
2018-04-01
Between-sample normalization is a critical step in genomic data analysis to remove systematic bias and unwanted technical variation in high-throughput data. Global normalization methods are based on the assumption that observed variability in global properties is due to technical reasons and is unrelated to the biology of interest. For example, some methods correct for differences in sequencing read counts by scaling features to have similar median values across samples, but these fail to reduce other forms of unwanted technical variation. Methods such as quantile normalization transform the statistical distributions across samples to be the same and assume global differences in the distribution are induced only by technical variation. However, it remains unclear how to proceed with normalization if these assumptions are violated, for example, if there are global differences in the statistical distributions between biological conditions or groups, and external information, such as negative or control features, is not available. Here, we introduce a generalization of quantile normalization, referred to as smooth quantile normalization (qsmooth), which is based on the assumption that the statistical distribution of each sample should be the same (or have the same distributional shape) within biological groups or conditions, but allowing that they may differ between groups. We illustrate the advantages of our method on several high-throughput datasets with global differences in distributions corresponding to different biological conditions. We also perform a Monte Carlo simulation study to illustrate the bias-variance tradeoff and root mean squared error of qsmooth compared to other global normalization methods. A software implementation is available from https://github.com/stephaniehicks/qsmooth.
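For reference, the standard quantile normalization that qsmooth generalizes can be written in a few lines; qsmooth differs by shrinking each group's empirical quantiles toward the overall reference rather than forcing a single common distribution. A minimal sketch (not the authors' implementation):

import numpy as np

def quantile_normalize(X):
    # Columns are samples, rows are features. Each sample's sorted values
    # are replaced by the across-sample mean of values at the same rank.
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)
    ref = np.sort(X, axis=0).mean(axis=1)   # reference distribution
    return ref[ranks]

rng = np.random.default_rng(3)
X = rng.lognormal(mean=0.0, sigma=1.0, size=(1000, 6))
Xn = quantile_normalize(X)
# After normalization every sample has exactly the same distribution.
print(np.allclose(np.sort(Xn[:, 0]), np.sort(Xn[:, 1])))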
Posterior propriety for hierarchical models with log-likelihoods that have norm bounds
Michalak, Sarah E.; Morris, Carl N.
2015-07-17
Statisticians often use improper priors to express ignorance or to provide good frequency properties, requiring that posterior propriety be verified. Our paper addresses generalized linear mixed models, GLMMs, when Level I parameters have Normal distributions, with many commonly-used hyperpriors. It provides easy-to-verify sufficient posterior propriety conditions based on dimensions, matrix ranks, and exponentiated norm bounds, ENBs, for the Level I likelihood. Since many familiar likelihoods have ENBs, which is often verifiable via log-concavity and MLE finiteness, our novel use of ENBs permits unification of posterior propriety results and posterior MGF/moment results for many useful Level I distributions, including those commonly used with multilevel generalized linear models, e.g., GLMMs and hierarchical generalized linear models, HGLMs. Furthermore, those who need to verify existence of posterior distributions or of posterior MGFs/moments for a multilevel generalized linear model given a proper or improper multivariate F prior as in Section 1 should find the required results in Sections 1 and 2 and Theorem 3 (GLMMs), Theorem 4 (HGLMs), or Theorem 5 (posterior MGFs/moments).
Measuring subhalo mass in redMaPPer clusters with CFHT Stripe 82 Survey
Li, Ran; Shan, Huanyuan; Kneib, Jean-Paul; ...
2016-03-07
Here, we use the shear catalogue from the CFHT Stripe-82 Survey to measure the subhalo masses of satellite galaxies in redMaPPer clusters. Assuming a Chabrier initial mass function and a truncated NFW model for the subhalo mass distribution, we find that the subhalo mass to galaxy stellar mass ratio increases as a function of projected halo-centric radius r_p, from M_sub/M_star = 4.43 (+6.63/−2.23) at r_p ∈ [0.1, 0.3] h^−1 Mpc to M_sub/M_star = 75.40 (+19.73/−19.09) at r_p ∈ [0.6, 0.9] h^−1 Mpc. We also investigate the dependence of subhalo masses on stellar mass by splitting satellite galaxies into two stellar mass bins: 10 < log(M_star/h^−1 M_⊙) < 10.5 and 11 < log(M_star/h^−1 M_⊙) < 12. The best-fitting subhalo mass of the more massive satellite galaxy bin is larger than that of the less massive satellites: log(M_sub/h^−1 M_⊙) = 11.14 (+0.66/−0.73) (M_sub/M_star = 19.5 (+19.8/−17.9)) versus log(M_sub/h^−1 M_⊙) = 12.38 (+0.16/−0.16) (M_sub/M_star = 21.1 (+7.4/−7.7)).
Weibull mixture regression for marginal inference in zero-heavy continuous outcomes.
Gebregziabher, Mulugeta; Voronca, Delia; Teklehaimanot, Abeba; Santa Ana, Elizabeth J
2017-06-01
Continuous outcomes with a preponderance of zero values are ubiquitous in data that arise from biomedical studies, for example studies of addictive disorders. This is known to lead to violation of standard assumptions in parametric inference and increases the risk of misleading conclusions unless managed properly. Two-part models are commonly used to deal with this problem. However, standard two-part models have limitations with respect to obtaining parameter estimates that have a marginal interpretation of covariate effects, which is important in many biomedical applications. Recently, marginalized two-part models have been proposed, but their development is limited to the log-normal and log-skew-normal distributions. Thus, in this paper, we propose a finite mixture approach, with Weibull mixture regression as a special case, to deal with the problem. We use an extensive simulation study to assess the performance of the proposed model in finite samples and to make comparisons with other families of models via statistical information and mean squared error criteria. We demonstrate its application on real data from a randomized controlled trial of addictive disorders. Our results show that a two-component Weibull mixture model is preferred for modeling zero-heavy continuous data when the non-zero part is simulated from a Weibull or similar distribution such as the Gamma or truncated Gaussian.
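A sketch of the positive-part likelihood for a zero-heavy two-component Weibull mixture, fitted by direct numerical maximum likelihood; the simulated data and starting values are hypothetical, and the authors' model additionally includes regression covariates:

import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

rng = np.random.default_rng(7)

# Hypothetical zero-heavy outcome: ~40% structural zeros, positives from
# a two-component Weibull mixture.
zero = rng.random(2000) < 0.4
comp = rng.random(2000) < 0.5
y1 = weibull_min.rvs(1.5, scale=2.0, size=2000, random_state=1)
y2 = weibull_min.rvs(3.0, scale=8.0, size=2000, random_state=2)
y = np.where(zero, 0.0, np.where(comp, y1, y2))
pos = y[y > 0]

def nll(theta):
    # Unconstrained parameterization: logit weight, log shapes/scales.
    w = 1.0 / (1.0 + np.exp(-theta[0]))
    k1, s1, k2, s2 = np.exp(theta[1:])
    dens = (w * weibull_min.pdf(pos, k1, scale=s1)
            + (1 - w) * weibull_min.pdf(pos, k2, scale=s2))
    return -np.sum(np.log(dens + 1e-300))

fit = minimize(nll, x0=[0.0, 0.0, 0.5, 1.0, 2.0], method="Nelder-Mead",
               options={"maxiter": 5000})
print("converged:", fit.success, " NLL:", round(fit.fun, 1))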
Rescaled earthquake recurrence time statistics: application to microrepeaters
NASA Astrophysics Data System (ADS)
Goltz, Christian; Turcotte, Donald L.; Abaimov, Sergey G.; Nadeau, Robert M.; Uchida, Naoki; Matsuzawa, Toru
2009-01-01
Slip on major faults primarily occurs during `characteristic' earthquakes. The recurrence statistics of characteristic earthquakes play an important role in seismic hazard assessment. A major problem in determining applicable statistics is the shortness of the sequences of characteristic earthquakes that are available worldwide. In this paper, we introduce a rescaling technique in which sequences can be superimposed to establish larger numbers of data points. We consider the Weibull and log-normal distributions; in both cases we rescale the data using means and standard deviations. We test our approach utilizing sequences of microrepeaters, micro-earthquakes that recur at the same location on a fault. It seems plausible to regard these earthquakes as a miniature version of the classic characteristic earthquakes. Microrepeaters are much more frequent than major earthquakes, leading to longer sequences for analysis. In this paper, we present results for the analysis of recurrence times for several microrepeater sequences from Parkfield, CA as well as NE Japan. We find that, once the respective sequence can be considered to be of sufficient stationarity, the statistics can be well fitted by either a Weibull or a log-normal distribution. We clearly demonstrate this fact by our technique of rescaled combination. We conclude that the recurrence statistics of the microrepeater sequences we consider are similar to the recurrence statistics of characteristic earthquakes on major faults.
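A minimal Python sketch of the rescaled-combination idea: short recurrence-time sequences are rescaled to unit mean, pooled, and the pooled sample is fitted by both candidate distributions. The sequences here are simulated placeholders, not the Parkfield or NE Japan catalogs:

import numpy as np
from scipy import stats

# Hypothetical recurrence-time sequences (years) from three microrepeaters.
sequences = [stats.weibull_min.rvs(2.2, scale=s, size=n, random_state=i)
             for i, (s, n) in enumerate([(1.1, 25), (2.4, 18), (0.7, 30)])]

# Rescale each short sequence to unit mean, then pool so that a single
# distribution can be fitted to many more points.
pooled = np.concatenate([seq / seq.mean() for seq in sequences])

for name, dist in [("Weibull", stats.weibull_min),
                   ("log-normal", stats.lognorm)]:
    params = dist.fit(pooled, floc=0.0)   # keep the location fixed at zero
    ll = dist.logpdf(pooled, *params).sum()
    print(f"{name}: log-likelihood = {ll:.1f}")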
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lagerloef, Jakob H.; Kindblom, Jon; Bernhardt, Peter
Purpose: Formation of new blood vessels (angiogenesis) in response to hypoxia is a fundamental event in the process of tumor growth and metastatic dissemination. However, abnormalities in tumor neovasculature often induce increased interstitial pressure (IP) and further reduce oxygenation (pO2) of tumor cells. In radiotherapy, well-oxygenated tumors favor treatment. Antiangiogenic drugs may lower IP in the tumor, improving perfusion, pO2 and drug uptake, by reducing the number of malfunctioning vessels in the tissue. This study aims to create a model for quantifying the effects of an altered pO2-distribution due to antiangiogenic treatment in combination with radionuclide therapy. Methods: Based on experimental data describing the effects of antiangiogenic agents on oxygenation of Glioblastoma Multiforme (GBM), a single cell based 3D model, including 10^10 tumor cells, was developed, showing how radionuclide therapy response improves as tumor oxygenation approaches normal tissue levels. The nuclides studied were 90Y, 131I, 177Lu, and 211At. The absorbed dose levels required for a tumor control probability (TCP) of 0.990 are compared for three different log-normal pO2-distributions: μ1 = 2.483, σ1 = 0.711; μ2 = 2.946, σ2 = 0.689; μ3 = 3.689, σ3 = 0.330. The normal tissue absorbed doses will, in turn, depend on this. These distributions were chosen to represent the expected oxygen levels in an untreated hypoxic tumor, a hypoxic tumor treated with an anti-VEGF agent, and in normal, fully-oxygenated tissue, respectively. The former two are fitted to experimental data. The geometric oxygen distributions are simulated using two different patterns: one Monte Carlo based and one radially increasing, while keeping the log-normal volumetric distributions intact. Oxygen and activity are distributed according to the same pattern. Results: As tumor pO2 approaches normal tissue levels, the therapeutic effect is improved so that the normal tissue absorbed doses can be decreased by more than 95%, while retaining TCP, in the most favourable scenario and by up to about 80% with oxygen levels previously achieved in vivo, when the least favourable oxygenation case is used as starting point. The major difference occurs in poorly oxygenated cells. This is also where the pO2-dependence of the oxygen enhancement ratio is maximal. Conclusions: Improved tumor oxygenation together with increased radionuclide uptake show great potential for optimising treatment strategies, leaving room for successive treatments, or lowering the absorbed dose to normal tissues, due to increased tumor response. Further studies of the concomitant use of antiangiogenic drugs and radionuclide therapy therefore appear merited.
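For illustration, the hypoxic fractions implied by the three log-normal pO2 distributions can be computed directly. The sketch below assumes pO2 = exp(N(μ, σ)) in mmHg and uses a 2.5 mmHg hypoxia cutoff; both the parameterization and the cutoff are assumptions here, not details stated in the abstract:

import numpy as np
from scipy.stats import lognorm

# The three log-normal pO2 distributions quoted above.
cases = {"untreated hypoxic": (2.483, 0.711),
         "anti-VEGF treated": (2.946, 0.689),
         "normal tissue":     (3.689, 0.330)}

threshold = 2.5  # mmHg; a commonly used radiobiological hypoxia cutoff

for label, (mu, sigma) in cases.items():
    frac = lognorm.cdf(threshold, s=sigma, scale=np.exp(mu))
    print(f"{label:18s}: median pO2 = {np.exp(mu):5.1f} mmHg, "
          f"fraction below {threshold} mmHg = {frac:.4f}")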
The soft X-ray diffuse background observed with the HEAO 1 low-energy detectors
NASA Technical Reports Server (NTRS)
Garmire, G. P.; Nousek, J. A.; Apparao, K. M. V.; Burrows, D. N.; Fink, R. L.; Kraft, R. P.
1992-01-01
Results of a study of the diffuse soft-X-ray background as observed by the low-energy detectors of the A-2 experiment aboard the HEAO 1 satellite are reported. The observed sky intensities are presented as maps of the diffuse X-ray background sky in several energy bands covering the energy range 0.15-2.8 keV. It is found that the soft X-ray diffuse background (SXDB) between 1.5 and 2.8 keV, assuming a power law form with photon number index 1.4, has a normalization constant of 10.5 +/- 1.0 photons/sq cm s sr keV. Below 1.5 keV the spectrum of the SXDB exceeds the extrapolation of this power law. The low-energy excess for the north ecliptic pole (NEP) can be fitted with emission from a two-temperature equilibrium plasma model with the temperatures given by log T1 = 6.16 and log T2 = 6.33. It is found that this model is able to account for the spectrum below 1 keV, but fails to yield the observed Galactic latitude variation.
Fitting and Testing Conditional Multinormal Partial Credit Models
ERIC Educational Resources Information Center
Hessen, David J.
2012-01-01
A multinormal partial credit model for factor analysis of polytomously scored items with ordered response categories is derived using an extension of the Dutch Identity (Holland in "Psychometrika" 55:5-18, 1990). In the model, latent variables are assumed to have a multivariate normal distribution conditional on unweighted sums of item…
On Correlations, Distances and Error Rates.
ERIC Educational Resources Information Center
Dorans, Neil J.
The nature of the criterion (dependent) variable may play a useful role in structuring a list of classification/prediction problems. Such criteria are continuous in nature, binary dichotomous, or multichotomous. In this paper, discussion is limited to the continuous normally distributed criterion scenarios. For both cases, it is assumed that the…
Marko, Nicholas F.; Weil, Robert J.
2012-01-01
Introduction Gene expression data is often assumed to be normally-distributed, but this assumption has not been tested rigorously. We investigate the distribution of expression data in human cancer genomes and study the implications of deviations from the normal distribution for translational molecular oncology research. Methods We conducted a central moments analysis of five cancer genomes and performed empiric distribution fitting to examine the true distribution of expression data both on the complete-experiment and on the individual-gene levels. We used a variety of parametric and nonparametric methods to test the effects of deviations from normality on gene calling, functional annotation, and prospective molecular classification using a sixth cancer genome. Results Central moments analyses reveal statistically-significant deviations from normality in all of the analyzed cancer genomes. We observe as much as 37% variability in gene calling, 39% variability in functional annotation, and 30% variability in prospective, molecular tumor subclassification associated with this effect. Conclusions Cancer gene expression profiles are not normally-distributed, either on the complete-experiment or on the individual-gene level. Instead, they exhibit complex, heavy-tailed distributions characterized by statistically-significant skewness and kurtosis. The non-Gaussian distribution of this data affects identification of differentially-expressed genes, functional annotation, and prospective molecular classification. These effects may be reduced in some circumstances, although not completely eliminated, by using nonparametric analytics. This analysis highlights two unreliable assumptions of translational cancer gene expression analysis: that “small” departures from normality in the expression data distributions are analytically-insignificant and that “robust” gene-calling algorithms can fully compensate for these effects. PMID:23118863
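A central-moments check of the kind described above is straightforward; a minimal Python sketch on simulated heavy-tailed expression values (the D'Agostino-Pearson omnibus test used here is one standard moment-based choice, not necessarily the authors'):

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Hypothetical log2 expression values for one gene across arrays: a
# heavy-tailed contaminated normal rather than a clean Gaussian.
x = np.concatenate([rng.normal(8.0, 0.5, 180), rng.normal(8.0, 2.5, 20)])

print(f"skewness = {stats.skew(x):.2f}, "
      f"excess kurtosis = {stats.kurtosis(x):.2f}")
stat, p = stats.normaltest(x)   # D'Agostino-Pearson test on skew/kurtosis
print(f"normality test: K2 = {stat:.1f}, p = {p:.2e}")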
Crowther, Michael J; Look, Maxime P; Riley, Richard D
2014-09-28
Multilevel mixed effects survival models are used in the analysis of clustered survival data, such as repeated events, multicenter clinical trials, and individual participant data (IPD) meta-analyses, to investigate heterogeneity in baseline risk and covariate effects. In this paper, we extend parametric frailty models including the exponential, Weibull and Gompertz proportional hazards (PH) models and the log-logistic, log-normal, and generalized gamma accelerated failure time models to allow any number of normally distributed random effects. Furthermore, we extend the flexible parametric survival model of Royston and Parmar, modeled on the log-cumulative hazard scale using restricted cubic splines, to include random effects while also allowing for non-PH (time-dependent effects). Maximum likelihood is used to estimate the models utilizing adaptive or nonadaptive Gauss-Hermite quadrature. The methods are evaluated through simulation studies representing clinically plausible scenarios of a multicenter trial and IPD meta-analysis, showing good performance of the estimation method. The flexible parametric mixed effects model is illustrated using a dataset of patients with kidney disease and repeated times to infection and an IPD meta-analysis of prognostic factor studies in patients with breast cancer. User-friendly Stata software is provided to implement the methods. Copyright © 2014 John Wiley & Sons, Ltd.
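The core numerical step, integrating a cluster's conditional likelihood over a normal random effect by Gauss-Hermite quadrature, can be sketched as follows; the conditional likelihood shown is a stand-in toy function, not the survival likelihood of the paper:

import numpy as np

def normal_expectation(g, sd, n_nodes=15):
    # E[g(b)] for b ~ N(0, sd^2) via Gauss-Hermite quadrature: substitute
    # b = sqrt(2)*sd*x in the physicists' Hermite rule.
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    return (w * g(np.sqrt(2.0) * sd * x)).sum() / np.sqrt(np.pi)

# Marginal likelihood contribution of one cluster in a toy frailty model:
# integrate a conditional likelihood over a normal random intercept.
cond_lik = lambda b: np.exp(-np.exp(0.5 + b))  # hypothetical L(data | b)
print(normal_expectation(cond_lik, sd=0.8))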
NASA Astrophysics Data System (ADS)
Liu, Yanxiao; Xiang, Yongyuan; Erdélyi, Robertus; Liu, Zhong; Li, Dong; Ning, Zongjun; Bi, Yi; Wu, Ning; Lin, Jun
2018-03-01
Properties of photospheric bright points (BPs) near an active region have been studied in TiO λ 7058 Å images observed by the New Vacuum Solar Telescope of the Yunnan Observatories. We developed a novel recognition method that was used to identify and track 2010 BPs. The observed evolving BPs are classified into isolated (individual) and non-isolated (where multiple BPs are observed to display splitting and merging behaviors) sets. About 35.1% of BPs are non-isolated. For both isolated and non-isolated BPs, the brightness varies from 0.8 to 1.3 times the average background intensity and follows a Gaussian distribution. The lifetimes of BPs follow a log-normal distribution, with characteristic lifetimes of (267 ± 140) s and (421 ± 255) s, respectively. Their sizes also follow a log-normal distribution, with an average size of about (2.15 ± 0.74) × 10^4 km^2 and (3.00 ± 1.31) × 10^4 km^2 for area, and (163 ± 27) km and (191 ± 40) km for diameter, respectively. Our results indicate that regions with strong background magnetic field have higher BP number density and higher BP area coverage than regions with weak background field. Apparently, the brightness/size of BPs does not depend on the background field. Lifetimes in regions with strong background magnetic field are shorter than those in regions with weak background field, on average.
NASA Astrophysics Data System (ADS)
Moschandreas, D. J.; Kim, Y.; Karuchit, S.; Ari, H.; Lebowitz, M. D.; O'Rourke, M. K.; Gordon, S.; Robertson, G.
One of the objectives of the National Human Exposure Assessment Survey (NHEXAS) is to estimate exposures to several pollutants in multiple media and determine their distributions for the population of Arizona. This paper presents modeling methods used to estimate exposure distributions of chlorpyrifos and diazinon in the residential microenvironment using the database generated in Arizona (NHEXAS-AZ). A four-stage probability sampling design was used for sample selection. Exposures to pesticides were estimated using the indirect method of exposure calculation by combining measured concentrations of the two pesticides in multiple media with questionnaire information such as time subjects spent indoors, dietary and non-dietary items they consumed, and areas they touched. Most distributions of in-residence exposure to chlorpyrifos and diazinon were log-normal or nearly log-normal. Exposures to chlorpyrifos and diazinon vary by pesticide and route as well as by various demographic characteristics of the subjects. Comparisons of exposure to pesticides were investigated among subgroups of demographic categories, including gender, age, minority status, education, family income, household dwelling type, year the dwelling was built, pesticide use, and carpeted areas within dwellings. Residents with large carpeted areas within their dwellings have higher exposures to both pesticides for all routes than those in less carpet-covered areas. Depending on the route, several other determinants of exposure to pesticides were identified, but a clear pattern could not be established regarding the exposure differences between several subpopulation groups.
Statistical Analysis of Hubble/WFC3 Transit Spectroscopy of Extrasolar Planets
NASA Astrophysics Data System (ADS)
Fu, Guangwei; Deming, Drake; Knutson, Heather; Madhusudhan, Nikku; Mandell, Avi; Fraine, Jonathan
2017-10-01
Transmission spectroscopy provides a window to study exoplanetary atmospheres, but that window is fogged by clouds and hazes. Clouds and haze introduce a degeneracy between the strength of gaseous absorption features and planetary physical parameters such as abundances. One way to break that degeneracy is via statistical studies. We collect all published HST/WFC3 transit spectra for 1.1-1.65 μm water vapor absorption and perform a statistical study on potential correlations between the water absorption feature and planetary parameters. We fit the observed spectra with a template calculated for each planet using the Exo-transmit code. We express the magnitude of the water absorption in scale heights, thereby removing the known dependence on temperature, surface gravity, and mean molecular weight. We find that the absorption in scale heights has a positive baseline correlation with planetary equilibrium temperature; our hypothesis is that decreasing cloud condensation with increasing temperature is responsible for this baseline slope. However, the observed sample is also intrinsically degenerate in the sense that equilibrium temperature correlates with planetary mass. We compile the distribution of absorption in scale heights, and we find that this distribution is closer to log-normal than Gaussian. However, we also find that the distribution of equilibrium temperatures for the observed planets is similarly log-normal. This indicates that the absorption values are affected by observational bias, whereby observers have not yet targeted a sufficient sample of the hottest planets.
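Expressing an absorption amplitude in scale heights follows from H = kT/(μ m_H g); a minimal Python sketch with hypothetical planet and star parameters (all numerical values below are placeholders, not results from the paper):

import numpy as np

k_B = 1.380649e-23   # J/K
m_H = 1.6735575e-27  # kg

def scale_height_km(T_eq, g_si, mu=2.3):
    # H = k*T / (mu * m_H * g); mu ~ 2.3 for a H2/He-dominated atmosphere.
    return k_B * T_eq / (mu * m_H * g_si) / 1e3

# Hypothetical hot Jupiter: T_eq = 1400 K, g = 10 m/s^2.
H = scale_height_km(1400.0, 10.0)
print(f"H = {H:.0f} km")

# A water-feature amplitude of, say, 2.5e-4 in transit depth can then be
# expressed in units of the depth change per scale height, 2*H*Rp/Rs^2.
Rs, Rp = 1.2 * 6.957e8, 1.3 * 7.1492e7   # assumed stellar/planet radii (m)
print(f"amplitude in scale heights: "
      f"{2.5e-4 / (2 * H * 1e3 * Rp / Rs**2):.1f}")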
On the intrinsic shape of the gamma-ray spectrum for Fermi blazars
NASA Astrophysics Data System (ADS)
Kang, Shi-Ju; Wu, Qingwen; Zheng, Yong-Gang; Yin, Yue; Song, Jia-Li; Zou, Hang; Feng, Jian-Chao; Dong, Ai-Jun; Wu, Zhong-Zu; Zhang, Zhi-Bin; Wu, Lin-Hui
2018-05-01
The curvature of the γ-ray spectrum in blazars may reflect the intrinsic distribution of emitting electrons, which will further give some information on the possible acceleration and cooling processes in the emitting region. The γ-ray spectra of Fermi blazars are normally fitted by either a single power-law (PL) or a log-normal (so-called logarithmic parabola, LP) form. The possible reason for this difference is not clear. We statistically explore this issue based on the different observational properties of 1419 Fermi blazars in the 3LAC Clean Sample. We find that the γ-ray flux (100 MeV–100 GeV) and variability index follow bimodal distributions for PL and LP blazars, where the γ-ray flux and variability index show a positive correlation. However, the distributions of γ-ray luminosity and redshift follow a unimodal distribution. Our results suggest that the bimodal distribution of γ-ray fluxes for LP and PL blazars may not be intrinsic: all blazars may have an intrinsically curved γ-ray spectrum, and the PL spectrum may simply be a fitting effect due to fewer photons.
Empirical behavior of a world stock index from intra-day to monthly time scales
NASA Astrophysics Data System (ADS)
Breymann, W.; Lüthi, D. R.; Platen, E.
2009-10-01
Most of the papers that study the distributional and fractal properties of financial instruments focus on stock prices or foreign exchange rates. This typically leads to mixed results concerning the distributions of log-returns and some multi-fractal properties of exchange rates, stock prices, and regional indices. This paper uses a well-diversified world stock index as the central object of analysis. Such an index approximates the growth optimal portfolio; as demonstrated under the benchmark approach, it is the ideal reference unit for studying basic securities. When denominating this world index in units of a given currency, one measures the movements of the currency against the entire market. This provides a least disturbed observation of the currency dynamics. In this manner, one can expect to disentangle, e.g., the superposition of the two currencies involved in an exchange rate. This benchmark approach to the empirical analysis of financial data allows us to establish remarkable stylized facts. Most important is the observation that the repeatedly documented multi-fractal appearance of financial time series is very weak and much less pronounced than the deviation of the mono-scaling properties from Brownian-motion type scaling. The generalized Hurst exponent H(2) assumes typical values between 0.55 and 0.6. Accordingly, autocorrelations of log-returns decay according to a power law, and the quadratic variation vanishes when going to vanishing observation time step size. Furthermore, one can identify the Student t distribution as the log-return distribution of a well-diversified world stock index for long time horizons when a long enough data series is used for estimation. The study of dependence properties, finally, reveals that jumps at the daily horizon originate primarily in the stock market, while at the 5 min horizon they originate in the foreign exchange market. The principal message of the empirical analysis is that there is evidence that a diffusion model without multi-scaling could model the dynamics of a broadly diversified world stock index reasonably well.
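The generalized Hurst exponent H(q) quoted above can be estimated from the scaling of absolute increments; a minimal Python sketch, verified here on a Brownian benchmark where H(2) = 0.5 (the simulated series is a placeholder, not the world index data):

import numpy as np

def generalized_hurst(x, q=2, taus=range(1, 20)):
    # H(q) from the scaling E|x(t+tau) - x(t)|^q ~ tau^(q*H(q)),
    # estimated by an OLS fit in log-log coordinates.
    taus = np.asarray(list(taus))
    m = np.array([np.mean(np.abs(x[tau:] - x[:-tau]) ** q) for tau in taus])
    slope = np.polyfit(np.log(taus), np.log(m), 1)[0]
    return slope / q

rng = np.random.default_rng(8)
log_index = np.cumsum(rng.standard_normal(20000))  # Brownian benchmark
print(f"H(2) = {generalized_hurst(log_index):.3f} (0.5 for Brownian motion)")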
On the constancy of the lunar cratering flux over the past 3.3 billion yr
NASA Technical Reports Server (NTRS)
Guinness, E. A.; Arvidson, R. E.
1977-01-01
Utilizing a method that minimizes random fluctuations in sampling crater populations, it can be shown that the ejecta deposit of Tycho, the floor of Copernicus, and the region surrounding the Apollo 12 landing site have incremental crater size-frequency distributions that can be expressed as log-log linear functions over the diameter range from 0.1 to 1 km. Slopes are indistinguishable for the three populations, probably indicating that the surfaces are dominated by primary craters. Treating the crater populations of Tycho, the floor of Copernicus, and Apollo 12 as primary crater populations contaminated, but not overwhelmed, with secondaries, allows an attempt at calibration of the post-heavy bombardment cratering flux. Using the age of Tycho as 109 m.y., Copernicus as 800 m.y., and Apollo 12 as 3.26 billion yr, there is no basis for assuming that the flux has changed over the past 3.3 billion yr. This result can be used for dating intermediate aged surfaces by crater density.
Economic weights of somatic cell score in dairy sheep.
Legarra, A; Ramón, M; Ugarte, E; Pérez-Guzmán, M D; Arranz, J
2007-03-01
The economic weights for somatic cell score (SCS) have been calculated using profit functions. Economic data were collected in the Latxa breed. Three aspects have been considered: bulk tank milk payment, veterinary treatments due to high SCS, and culling. All of them are non-linear profit functions. Milk payment is based on the sum of the log-normal distributions of somatic cell count, and veterinary treatments on the probability of subclinical mastitis, which is inferred when individual SCS surpass some threshold. Both functions lead to non-standard distributions. The derivatives of the profit function were computed numerically. Culling was computed by assuming that a conceptual trait culled by mastitis (CBM) is genetically correlated to SCS. The economic weight considers the increase in the breeding value of CBM correlated to an increase in the breeding value of SCS, assuming genetic correlations ranging from 0 to 0.9. The relevance of the economic weights for selection purposes was checked by the estimation of genetic gains for milk yield and SCS under several scenarios of genetic parameters and economic weights. The overall economic weights for SCS range from −2.6 to −9.5 € per point of SCS, with an average of −4 € per point of SCS, depending on the expected average SCS of the flock. The economic weight is higher around the thresholds for payment policies. Economic weights did not change greatly with other assumptions. The estimated genetic gains with economic weights of 0.83 € per l of milk yield and −4 € per point of SCS, assuming a genetic correlation of −0.30, were 3.85 l and −0.031 SCS per year, with an associated increase in profit of 3.32 €. This represents a very small increase in profit (about 1%) relative to selecting only for milk yield. Other situations (increased economic weights, different genetic correlations) produced similar genetic gains and changes in profit. A desired-gains index reduced the increase in profit by 3%, although it could be greater depending on the genetic parameters. It is concluded that the inclusion of SCS in dairy sheep breeding programs is of low economic relevance and recommended only if recording is inexpensive or for animal welfare concerns.
A random effects meta-analysis model with Box-Cox transformation.
Yamaguchi, Yusuke; Maruo, Kazushi; Partlett, Christopher; Riley, Richard D
2017-07-19
In a random effects meta-analysis model, true treatment effects for each study are routinely assumed to follow a normal distribution. However, normality is a restrictive assumption, and misspecification of the random effects distribution may result in a misleading estimate of the overall mean treatment effect, an inappropriate quantification of heterogeneity across studies and a wrongly symmetric prediction interval. We focus on problems caused by an inappropriate normality assumption on the random effects distribution, and propose a novel random effects meta-analysis model in which a Box-Cox transformation is applied to the observed treatment effect estimates. The proposed model aims to normalise the overall distribution of observed treatment effect estimates, which is the sum of the within-study sampling distributions and the random effects distribution. When sampling distributions are approximately normal, non-normality in the overall distribution will be mainly due to the random effects distribution, especially when the between-study variation is large relative to the within-study variation. The Box-Cox transformation addresses this flexibly according to the observed departure from normality. We use a Bayesian approach for estimating parameters in the proposed model, and suggest summarising the meta-analysis results by an overall median, an interquartile range and a prediction interval. The model can be applied to any kind of variable once the treatment effect estimate is defined from the variable. A simulation study suggested that when the overall distribution of treatment effect estimates is skewed, the overall mean and the conventional I^2 from the normal random effects model can be inappropriate summaries, and the proposed model helps reduce this issue. We illustrate the proposed model using two examples, which reveal some important differences in summary results, heterogeneity measures and prediction intervals from the normal random effects model. The random effects meta-analysis with the Box-Cox transformation may be an important tool for examining the robustness of traditional meta-analysis results against skewness in the observed treatment effect estimates. Further critical evaluation of the method is needed.
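A minimal sketch of the transformation step (maximum-likelihood Box-Cox on shifted estimates) and the suggested median/IQR summaries. The data and the shift constant are hypothetical, and the authors' actual model is Bayesian and also uses the within-study variances:

import numpy as np
from scipy import stats

rng = np.random.default_rng(9)

# Hypothetical skewed treatment-effect estimates from 20 studies.
theta_hat = rng.lognormal(mean=0.3, sigma=0.6, size=20) - 1.0

shifted = theta_hat - theta_hat.min() + 0.01   # Box-Cox needs positives
z, lam = stats.boxcox(shifted)                 # ML estimate of lambda
print(f"lambda = {lam:.2f}, skew before = {stats.skew(shifted):.2f}, "
      f"after = {stats.skew(z):.2f}")

# Summaries reported on the original scale, as the authors suggest.
q25, q50, q75 = np.percentile(theta_hat, [25, 50, 75])
print(f"median = {q50:.2f}, IQR = [{q25:.2f}, {q75:.2f}]")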
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dalcanton, Julianne J.; Fouesneau, Morgan; Weisz, Daniel R.
We map the distribution of dust in M31 at 25 pc resolution using stellar photometry from the Panchromatic Hubble Andromeda Treasury survey. The map is derived with a new technique that models the near-infrared color–magnitude diagram (CMD) of red giant branch (RGB) stars. The model CMDs combine an unreddened foreground of RGB stars with a reddened background population viewed through a log-normal column density distribution of dust. Fits to the model constrain the median extinction, the width of the extinction distribution, and the fraction of reddened stars in each 25 pc cell. The resulting extinction map has a factor of ≳4 times better resolution than maps of dust emission, while providing a more direct measurement of the dust column. There is superb morphological agreement between the new map and maps of the extinction inferred from dust emission by Draine et al. However, the widely used Draine and Li dust models overpredict the observed extinction by a factor of ∼2.5, suggesting that M31's true dust mass is lower and that dust grains are significantly more emissive than assumed in Draine et al. The observed factor of ∼2.5 discrepancy is consistent with similar findings in the Milky Way by the Planck Collaboration et al., but we find a more complex dependence on parameters from the Draine and Li dust models. We also show that the discrepancy with the Draine et al. map is lowest where the current interstellar radiation field has a harder spectrum than average. We discuss possible improvements to the CMD dust mapping technique, and explore further applications in both M31 and other galaxies.
Scaling laws and properties of compositional data
NASA Astrophysics Data System (ADS)
Buccianti, Antonella; Albanese, Stefano; Lima, AnnaMaria; Minolfi, Giulia; De Vivo, Benedetto
2016-04-01
Many random processes occur in geochemistry. Accurate predictions of the manner in which elements or chemical species interact with each other are needed to construct models able to treat the presence of random components. The geochemical variables actually observed are the consequence of several events, some of which may be poorly defined or imperfectly understood. Variables tend to change with time/space but, despite their complexity, may share specific common traits, and it is possible to model them stochastically. Description of the frequency distribution of geochemical abundances has been an important target of research, attracting attention for at least 100 years, starting with CLARKE (1889) and continuing with GOLDSCHMIDT (1933) and WEDEPOHL (1955). However, it was AHRENS (1954a,b) who focussed on the effect of skewed distributions, for example the log-normal distribution, regarded by him as a fundamental law of geochemistry. Although the modeling of frequency distributions with probabilistic models (for example Gaussian, log-normal, Pareto) has been well discussed in several fields of application, little attention has been devoted to the features of compositional data. When the compositional nature of the data is taken into account, the most typical distribution models for compositions are the Dirichlet and the additive logistic normal (or normal on the simplex) (AITCHISON et al. 2003; MATEU-FIGUERAS et al. 2005; MATEU-FIGUERAS and PAWLOWSKY-GLAHN 2008; MATEU-FIGUERAS et al. 2013). As an alternative, because compositional data have to be transformed from the simplex to real space, coordinates obtained by the ilr transformation or by application of the concept of balance can be analyzed by classical methods (EGOZCUE et al. 2003). In this contribution an approach coherent with the properties of compositional information is proposed and used to investigate the shape of the frequency distribution of compositional data. The purpose is to understand data-generation processes from the perspective of compositional theory. The approach is based on the use of the isometric log-ratio transformation, characterized by theoretical and practical advantages, but requiring a more complex geochemical interpretation compared with the investigation of single variables. The proposed methodology directs attention to modeling the frequency distributions of more complex indices, linking all the terms of the composition to better represent the dynamics of geochemical processes. An example of its application is presented and discussed by considering the topsoil geochemistry of the Campania Region (southern Italy). The investigated multi-element data archive contains, among others, Al, As, B, Ba, Ca, Co, Cr, Cu, Fe, K, La, Mg, Mn, Mo, Na, Ni, P, Pb, Sr, Th, Ti, V and Zn (mg/kg) contents determined in 3535 new topsoils, as well as information on coordinates, geology and land cover (BUCCIANTI et al., 2015). AHRENS, L., 1954a. Geochim. Cosm. Acta 6, 121-131. AHRENS, L., 1954b. Geochim. Cosm. Acta 5, 49-73. AITCHISON, J., et al., 2003. Math Geol 35(6), 667-680. BUCCIANTI, A., et al., 2015. Jour. Geoch. Explor. 159, 302-316. CLARKE, F., 1889. Phil. Society of Washington Bull. 11, 131-142. EGOZCUE, J.J., et al., 2003. Math Geol 35(3), 279-300. MATEU-FIGUERAS, G., et al., 2005. Stoch. Environ. Res. Risk Ass. 19(3), 205-214.
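A minimal Python sketch of the isometric log-ratio (ilr) transformation referred to above, using one standard balance basis; the three-part composition is a hypothetical example, not a Campania sample:

import numpy as np

def closure(x):
    # Rescale a composition to unit sum.
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def ilr(x):
    # Isometric log-ratio transform with a sequential-binary-partition
    # (Helmert-like) basis: D-part compositions map to D-1 coordinates.
    x = closure(x)
    D = x.shape[-1]
    z = []
    for i in range(1, D):
        gm = np.exp(np.mean(np.log(x[..., :i]), axis=-1))  # geometric mean
        z.append(np.sqrt(i / (i + 1.0)) * np.log(gm / x[..., i]))
    return np.stack(z, axis=-1)

# Hypothetical 3-part soil composition (mg/kg), e.g. (Al, Fe, Mn).
print(ilr([71000.0, 38000.0, 950.0]))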
Garcia, Tanya P; Ma, Yanyuan
2017-10-01
We develop consistent and efficient estimation of parameters in general regression models with mismeasured covariates. We assume the model error and covariate distributions are unspecified, and the measurement error distribution is a general parametric distribution with unknown variance-covariance. We construct root-n consistent, asymptotically normal and locally efficient estimators using the semiparametric efficient score. We do not estimate any unknown distribution or model error heteroskedasticity. Instead, we form the estimator under possibly incorrect working distribution models for the model error, error-prone covariate, or both. Empirical results demonstrate robustness to different incorrect working models in homoscedastic and heteroskedastic models with error-prone covariates.
Lee, Myung W.
1999-01-01
Methods of predicting acoustic logs from resistivity logs for hydrate-bearing sediments are presented. Modified time-average equations derived from the weighted equation provide a means of relating the velocity of the sediment to its resistivity. These methods can be used to transform resistivity logs into acoustic logs with or without using the gas hydrate concentration in the pore space. All the parameters necessary for predicting an acoustic log from a resistivity log, except the unconsolidation constants, can be estimated from a cross plot of resistivity versus porosity values. The unconsolidation constants in the equations may be assumed without introducing significant errors in the prediction. These methods were applied to the acoustic and resistivity logs acquired at the Mallik 2L-38 gas hydrate research well drilled in the Mackenzie Delta, northern Canada. The results indicate that the proposed method is simple and accurate.
Measuring firm size distribution with semi-nonparametric densities
NASA Astrophysics Data System (ADS)
Cortés, Lina M.; Mora-Valencia, Andrés; Perote, Javier
2017-11-01
In this article, we propose a new methodology based on a (log) semi-nonparametric (log-SNP) distribution that nests the lognormal and enables better fits in the upper tail of the distribution through the introduction of new parameters. We test the performance of the lognormal and log-SNP distributions capturing firm size, measured through a sample of US firms in 2004-2015. Taking different levels of aggregation by type of economic activity, our study shows that the log-SNP provides a better fit of the firm size distribution. We also formally introduce the multivariate log-SNP distribution, which encompasses the multivariate lognormal, to analyze the estimation of the joint distribution of the value of the firm's assets and sales. The results suggest that sales are a better firm size measure, as indicated by other studies in the literature.
Football goal distributions and extremal statistics
NASA Astrophysics Data System (ADS)
Greenhough, J.; Birch, P. C.; Chapman, S. C.; Rowlands, G.
2002-12-01
We analyse the distributions of the number of goals scored by home teams, away teams, and the total scored in the match, in domestic football games from 169 countries between 1999 and 2001. The probability density functions (PDFs) of goals scored are too heavy-tailed to be fitted over their entire ranges by Poisson or negative binomial distributions which would be expected for uncorrelated processes. Log-normal distributions cannot include zero scores and here we find that the PDFs are consistent with those arising from extremal statistics. In addition, we show that it is sufficient to model English top division and FA Cup matches in the seasons of 1970/71-2000/01 on Poisson or negative binomial distributions, as reported in analyses of earlier seasons, and that these are not consistent with extremal statistics.
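The Poisson-versus-negative-binomial comparison described above can be sketched quickly with moment-based fits; the simulated overdispersed counts below stand in for the actual goal data:

import numpy as np
from scipy import stats

rng = np.random.default_rng(10)

# Hypothetical goals-per-match counts with overdispersion.
goals = rng.negative_binomial(n=3, p=0.6, size=5000)

lam = goals.mean()
ll_pois = stats.poisson.logpmf(goals, lam).sum()

# Method-of-moments negative binomial: var = mu + mu^2 / n.
mu, var = goals.mean(), goals.var()
n_hat = mu**2 / (var - mu)
p_hat = n_hat / (n_hat + mu)
ll_nb = stats.nbinom.logpmf(goals, n_hat, p_hat).sum()

print(f"Poisson logL = {ll_pois:.0f}, NB logL = {ll_nb:.0f}")
print(f"P(0 goals): data {np.mean(goals == 0):.3f}, "
      f"Poisson {stats.poisson.pmf(0, lam):.3f}")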
Algae Tile Data: 2004-2007, BPA-51; Preliminary Report, October 28, 2008.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holderman, Charles
Multiple files containing 2004 through 2007 Tile Chlorophyll data for the Kootenai River sites designated as: KR1, KR2, KR3, KR4 (Downriver) and KR6, KR7, KR9, KR9.1, KR10, KR11, KR12, KR13, KR14 (Upriver) were received by SCS. For a complete description of the sites covered, please refer to http://ktoi.scsnetw.com. To maintain consistency with the previous SCS algae reports, all analyses were carried out separately for the Upriver and Downriver categories, as defined in the aforementioned paragraph. The Upriver designation, however, now includes three additional sites, KR11, KR12, and the nutrient addition site, KR9.1. Summary statistics and information on the four responses, chlorophyll a, chlorophyll a Accrual Rate, Total Chlorophyll, and Total Chlorophyll Accrual Rate, are presented in Print Out 2. Computations were carried out separately for each river position (Upriver and Downriver) and year. For example, the Downriver position in 2004 showed an average Chlorophyll a level of 25.5 mg with a standard deviation of 21.4 and minimum and maximum values of 3.1 and 196 mg, respectively. The Upriver data in 2004 showed a lower overall average chlorophyll a level at 2.23 mg with a lower standard deviation (3.6) and minimum and maximum values of 0.13 and 28.7, respectively. A more comprehensive summary of each variable and position is given in Print Out 3. This lists the information above as well as other summary information such as the variance, standard error, various percentiles and extreme values. Using the 2004 Downriver Chlorophyll a as an example again, the variance of this data was 459.3 and the standard error of the mean was 1.55. The median value or 50th percentile was 21.3, meaning 50% of the data fell above and below this value. It should be noted that this value is somewhat different from the mean of 25.5. This is an indication that the frequency distribution of the data is not symmetrical (skewed). The skewness statistic, listed as part of the first section of each analysis, quantifies this. In a symmetric distribution, such as a Normal distribution, the skewness value would be 0. The tile chlorophyll data, however, shows larger values. Chlorophyll a, in the 2004 Downriver example, has a skewness statistic of 3.54, which is quite high. In the last section of the summary analysis, the stem and leaf plot graphically demonstrates the asymmetry, showing most of the data centered around 25 with a large value at 196. The final plot is referred to as a normal probability plot and graphically compares the data to a theoretical normal distribution. For chlorophyll a, the data (asterisks) deviate substantially from the theoretical normal distribution (diagonal reference line of pluses), indicating that the data is non-normal. Other response variables in both the Downriver and Upriver categories also indicated skewed distributions. Because the sample size and mean comparison procedures below require symmetrical, normally distributed data, each response in the data set was logarithmically transformed. The logarithmic transformation, in this case, can help mitigate skewness problems. The summary statistics for the four transformed responses (log-ChlorA, log-TotChlor, and log-accrual) are given in Print Out 4. For the 2004 Downriver Chlorophyll a data, the logarithmic transformation reduced the skewness value to -0.36 and produced a more bell-shaped symmetric frequency distribution. Similar improvements are shown for the remaining variables and river categories.
Hence, all subsequent analyses given below are based on logarithmic transformations of the original responses.
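The effect of the logarithmic transformation on skewness is easy to reproduce; the simulated values below only mimic the right-skewed tile data, they are not the actual measurements:

import numpy as np
from scipy import stats

rng = np.random.default_rng(12)

# Hypothetical right-skewed chlorophyll-a values (mg), broadly resembling
# the 2004 Downriver summary above (mean ~25, long upper tail).
chl = rng.lognormal(mean=3.0, sigma=0.7, size=200)

print(f"raw: skew = {stats.skew(chl):5.2f}, mean = {chl.mean():.1f}, "
      f"median = {np.median(chl):.1f}")
log_chl = np.log(chl)
print(f"log: skew = {stats.skew(log_chl):5.2f} (near 0 after transform)")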
Normality of raw data in general linear models: The most widespread myth in statistics
Kery, Marc; Hatfield, Jeff S.
2003-01-01
In years of statistical consulting for ecologists and wildlife biologists, by far the most common misconception we have come across has been the one about normality in general linear models. These comprise a very large part of the statistical models used in ecology and include t tests, simple and multiple linear regression, polynomial regression, and analysis of variance (ANOVA) and covariance (ANCOVA). There is a widely held belief that the normality assumption pertains to the raw data rather than to the model residuals. We suspect that this error may also occur in countless published studies, whenever the normality assumption is tested prior to analysis. This may lead to the use of nonparametric alternatives (if there are any), when parametric tests would indeed be appropriate, or to use of transformations of raw data, which may introduce hidden assumptions such as multiplicative effects on the natural scale in the case of log-transformed data. Our aim here is to dispel this myth. We very briefly describe relevant theory for two cases of general linear models to show that the residuals need to be normally distributed if tests requiring normality are to be used, such as t and F tests. We then give two examples demonstrating that the distribution of the response variable may be nonnormal, and yet the residuals are well behaved. We do not go into the issue of how to test normality; instead we display the distributions of response variables and residuals graphically.
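The point is easy to demonstrate by simulation: a strongly non-normal response can nevertheless yield well-behaved residuals once group means are removed. A minimal Python sketch:

import numpy as np
from scipy import stats

rng = np.random.default_rng(13)

# Two-group ANOVA setting: the response is strongly bimodal (non-normal),
# yet the residuals -- response minus group mean -- are perfectly normal.
g1 = rng.normal(0.0, 1.0, 500)
g2 = rng.normal(8.0, 1.0, 500)
y = np.concatenate([g1, g2])
resid = np.concatenate([g1 - g1.mean(), g2 - g2.mean()])

print(f"raw response: normaltest p = {stats.normaltest(y).pvalue:.2e}")
print(f"residuals:    normaltest p = {stats.normaltest(resid).pvalue:.2f}")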
Critiquing 'pore connectivity' as basis for in situ flow in geothermal systems
NASA Astrophysics Data System (ADS)
Kenedi, C. L.; Leary, P.; Malin, P.
2013-12-01
Geothermal system in situ flow systematics derived from detailed examination of grain-scale structures, fabrics, mineral alteration, and pore connectivity may be extremely misleading if/when extrapolated to reservoir-scale flow structure. In oil/gas field clastic reservoir operations, it is standard to assume that small-scale studies of flow fabric - notably the Kozeny-Carman and Archie's Law treatments at the grain scale and well-log/well-bore sampling of formations/reservoirs at the cm-m scale - are adequate to define the reservoir-scale flow properties. In the case of clastic reservoirs, however, a wide range of reservoir-scale data wholly discredits this extrapolation: Well-log data show that grain-scale fracture density fluctuation power scales inversely with spatial frequency k, S(k) ~ 1/k^β, 1.0 < β < 1.2, 1 cycle/km < k < 1 cycle/cm; the scaling is a 'universal' feature of well-logs (neutron porosity, sonic velocity, chemical abundance, mass density, resistivity, in many forms of clastic rock and instances of shale bodies, for both horizontal and vertical wells). Grain-scale fracture density correlates with in situ porosity; spatial fluctuations of porosity φ in well-core correlate with spatial fluctuations in the logarithm of well-core permeability, δφ ~ δlog(κ), with typical correlation coefficient ~ 85%; a similar relation is observed in consolidating sediments/clays, indicating a generic coupling between fluid pressure and solid deformation at pore sites. In situ macroscopic flow systems are lognormally distributed according to κ ~ κ0 exp(α(φ-φ0)), where α ≫ 1 is an empirical parameter for the degree of in situ fracture connectivity; the lognormal distribution applies to well-productivities in US oil fields and NZ geothermal fields, 'frack productivity' in oil/gas shale body reservoirs, ore grade distributions, and trace element abundances. Although presently available evidence for these properties in geothermal reservoirs is limited, there are indications that geothermal system flow essentially obeys the same 'universal' in situ flow rules as does clastic rock: Well-log data from Los Azufres, MX, show power-law scaling S(k) ~ 1/k^β, 1.2 < β < 1.4, for the spatial frequency range 2 cycles/km to 0.5 cycles/m (higher β-values are likely due to the relatively fresh nature of geothermal systems); Well-core at Bulalo (PH) and Ohaaki (NZ) shows statistically significant spatial correlation, δφ ~ δlog(κ); Well productivity at Ohaaki/Ngawha (NZ) and in geothermal systems elsewhere is lognormally distributed; K/Th/U abundances are lognormally distributed in Los Azufres well-logs. We therefore caution that small-scale evidence for in situ flow fabric in geothermal systems that is interpreted in terms of 'pore connectivity' may in fact not reflect how small-scale chemical processes are integrated into a large-scale geothermal flow structure. Rather, such small-scale studies should (perhaps) be considered in terms of the above flow rules. These flow rules are easily incorporated into standard flow simulation codes, in particular the OPM (Open Porous Media) open-source industry-standard flow code. Geochemical transport data relevant to geothermal systems can thus be expected to be well modeled by OPM or equivalent (e.g., INL/LANL) codes.
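A toy Python sketch of the flow rule κ ~ κ0 exp(α(φ−φ0)) applied to synthetic porosity fluctuations with a ~1/k power-law spectrum; all numerical values (α, κ0, the 3% fluctuation level, β = 1.1) are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(14)

# Porosity fluctuations with S(k) ~ 1/k^1.1, generated by spectral
# synthesis, then mapped to permeability via kappa = kappa0*exp(alpha*dphi).
n = 4096
k = np.fft.rfftfreq(n)[1:]
amp = k ** (-1.1 / 2.0)                       # amplitude ~ sqrt(S(k))
phase = np.exp(2j * np.pi * rng.random(k.size))
dphi = np.fft.irfft(np.r_[0.0, amp * phase], n)
dphi *= 0.03 / dphi.std()                     # ~3% porosity fluctuations

alpha, kappa0 = 100.0, 1e-15                  # assumed values; alpha >> 1
kappa = kappa0 * np.exp(alpha * dphi)         # lognormal permeability

def skew(x):
    return ((x - x.mean()) ** 3).mean() / x.std() ** 3

print(f"skew of kappa:     {skew(kappa):.1f}  (strongly right-skewed)")
print(f"skew of log kappa: {skew(np.log10(kappa)):.2f} (near symmetric)")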
Heterogeneous mixture distributions for multi-source extreme rainfall
NASA Astrophysics Data System (ADS)
Ouarda, T.; Shin, J.; Lee, T. S.
2013-12-01
Mixture distributions have been used to model hydro-meteorological variables that show mixture distributional characteristics, e.g., bimodality. Homogeneous mixture (HOM) distributions (e.g., Normal-Normal and Gumbel-Gumbel) have traditionally been applied to hydro-meteorological variables. However, there is no reason to restrict the mixture to components of one identical type. It might be beneficial to characterize the statistical behavior of hydro-meteorological variables with heterogeneous mixture (HTM) distributions such as Normal-Gamma. In the present work, we assess the suitability of HTM distributions for the frequency analysis of hydro-meteorological variables. To estimate the parameters of HTM distributions, a meta-heuristic algorithm (genetic algorithm) is employed to maximize the likelihood function. A number of distributions are compared, including the Gamma-Extreme value type-one (EV1) HTM distribution, the EV1-EV1 HOM distribution, and the EV1 distribution. The proposed distribution models are applied to annual maximum precipitation data in South Korea. The Akaike Information Criterion (AIC), the root mean squared error (RMSE) and the log-likelihood are used as measures of goodness-of-fit of the tested distributions. Results indicate that the HTM distribution (Gamma-EV1) provides the best fit. The HTM distribution shows significant improvement in the estimation of quantiles corresponding to the 20-year return period. It is shown that extreme rainfall in the coastal region of South Korea presents strong heterogeneous mixture distributional characteristics. Results indicate that HTM distributions are a good alternative for the frequency analysis of hydro-meteorological variables when disparate statistical characteristics are present.
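As a rough illustration of the estimation step, the sketch below fits a Gamma-Gumbel (EV1) heterogeneous mixture by maximizing the likelihood with scipy's differential evolution, a stand-in for the genetic algorithm used in the study; the synthetic sample, parameter bounds and seeds are all assumptions:

```python
import numpy as np
from scipy import stats
from scipy.optimize import differential_evolution

rng = np.random.default_rng(1)
# Synthetic "annual maximum precipitation" sample: Gamma + Gumbel mixture
x = np.concatenate([stats.gamma.rvs(4, scale=20, size=120, random_state=rng),
                    stats.gumbel_r.rvs(loc=180, scale=25, size=80, random_state=rng)])

def neg_log_lik(p):
    w, a, scale_g, loc_e, scale_e = p
    pdf = (w * stats.gamma.pdf(x, a, scale=scale_g)
           + (1 - w) * stats.gumbel_r.pdf(x, loc=loc_e, scale=scale_e))
    return -np.sum(np.log(pdf + 1e-300))       # guard against log(0)

bounds = [(0.01, 0.99), (0.5, 20), (1, 100), (50, 400), (1, 100)]
res = differential_evolution(neg_log_lik, bounds, seed=2, tol=1e-8)
w, a, scale_g, loc_e, scale_e = res.x
aic = 2 * len(res.x) + 2 * res.fun             # AIC for model comparison
print(f"weight = {w:.2f}, AIC = {aic:.1f}")
```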
NASA Astrophysics Data System (ADS)
Lee, Wen-Chuan; Wu, Jong-Wuu; Tsou, Hsin-Hui; Lei, Chia-Ling
2012-10-01
This article considers that the number of defective units in an arrival order is a binomial random variable. We derive a modified mixture inventory model with backorders and lost sales, in which the order quantity and lead time are decision variables. We also assume that the backorder rate depends on the length of lead time through the amount of shortages, and we let the backorder rate be a control variable. In addition, we assume that the lead time demand follows a mixture of normal distributions; we then relax the assumption about the form of the mixture of distribution functions of the lead time demand and apply the minimax distribution-free procedure to solve the problem. Furthermore, we develop an algorithmic procedure to obtain the optimal ordering strategy for each case. Finally, three numerical examples are given to illustrate the results.
Bivariate sub-Gaussian model for stock index returns
NASA Astrophysics Data System (ADS)
Jabłońska-Sabuka, Matylda; Teuerle, Marek; Wyłomańska, Agnieszka
2017-11-01
Financial time series are commonly modeled with methods assuming data normality. However, the real distribution can be nontrivial, and may not even have an explicitly formulated probability density function. In this work we introduce novel parameter estimation and high-power distribution testing methods which do not rely on closed-form densities, but instead use characteristic functions for comparison. Applied to a pair of stock index returns, the approach demonstrates that such a bivariate vector can be a sample coming from a bivariate sub-Gaussian distribution. The methods presented here can be applied to other nontrivially distributed financial data as well.
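A minimal sketch of the characteristic-function idea, our own illustration rather than the authors' estimator or test statistic: compare an empirical characteristic function against the closed-form Gaussian one, exp(-t'Σt/2), on a grid of t points, so no density is ever required:

```python
import numpy as np

rng = np.random.default_rng(3)

def ecf(x, t):
    """Empirical characteristic function of sample x (n, d) at points t (m, d)."""
    return np.exp(1j * x @ t.T).mean(axis=0)

n = 5000
cov = np.array([[1.0, 0.6], [0.6, 1.0]])
gauss = rng.multivariate_normal([0, 0], cov, size=n)
heavy = rng.standard_t(df=3, size=(n, 2)) @ np.linalg.cholesky(cov).T

t = rng.normal(0, 0.5, size=(128, 2))           # evaluation points in t-space
gauss_cf = np.exp(-0.5 * np.einsum('ij,jk,ik->i', t, cov, t))
d_gauss = np.abs(ecf(gauss, t) - gauss_cf).max()
d_heavy = np.abs(ecf(heavy, t) - gauss_cf).max()
print(d_gauss, d_heavy)  # the heavy-tailed sample sits farther from the Gaussian CF
```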
Minimizing bias in biomass allometry: Model selection and log transformation of data
Joseph Mascaro; Flint Hughes; Amanda Uowolo; Stefan A. Schnitzer
2011-01-01
Nonlinear regression is increasingly used to develop allometric equations for forest biomass estimation (i.e., as opposed to the traditional approach of log-transformation followed by linear regression). Most statistical software packages, however, assume additive errors by default, violating a key assumption of allometric theory and possibly producing spurious models…
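The error-structure issue is simple to demonstrate on synthetic data (all parameters below are assumed): when data are generated with multiplicative log-normal error, as allometric theory posits, log-log linear regression recovers the exponent, while nonlinear least squares on the raw scale, which assumes additive error, is dominated by the largest individuals:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(4)
a_true, b_true = 0.1, 2.5
x = rng.uniform(5, 50, 200)                               # e.g. stem diameters
y = a_true * x ** b_true * rng.lognormal(0, 0.4, x.size)  # multiplicative error

# Traditional approach: log-log linear regression (multiplicative error model)
b_log, log_a = np.polyfit(np.log(x), np.log(y), 1)

# Nonlinear least squares on the raw scale (additive error model)
(a_nls, b_nls), _ = curve_fit(lambda x, a, b: a * x ** b, x, y, p0=(1, 2))

print(f"log-log fit:   b = {b_log:.2f}")
print(f"nonlinear fit: b = {b_nls:.2f}")   # fit dominated by the largest trees
```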
NASA Astrophysics Data System (ADS)
Berthet, Lionel; Marty, Renaud; Bourgin, François; Viatgé, Julie; Piotte, Olivier; Perrin, Charles
2017-04-01
An increasing number of operational flood forecasting centres assess the predictive uncertainty associated with their forecasts and communicate it to the end users. This information can match the end users' needs (i.e., prove useful for efficient crisis management) only if it is reliable: reliability is therefore a key quality for operational flood forecasts. In 2015, the French flood forecasting national and regional services (Vigicrues network; www.vigicrues.gouv.fr) implemented a framework to compute quantitative discharge and water level forecasts and to assess the predictive uncertainty. Among the possible technical options to achieve this goal, a statistical analysis of past forecasting errors of deterministic models was selected (QUOIQUE method, Bourgin, 2014). It is a data-based and non-parametric approach resting on as few assumptions as possible about the mathematical structure of the forecasting error. In particular, a very simple assumption is made regarding the predictive uncertainty distributions for large events outside the range of the calibration data: the multiplicative error distribution is assumed to be constant, whatever the magnitude of the flood. Indeed, the predictive distributions may not be reliable in extrapolation. However, estimating the predictive uncertainty for these rare events is crucial when major floods are of concern. In order to improve forecast reliability for major floods, we attempt to combine the operational strength of the empirical statistical analysis with simple error modelling. Since the heteroscedasticity of forecast errors can considerably weaken predictive reliability for large floods, this error modelling is based on the log-sinh transformation, which has been shown to significantly reduce the heteroscedasticity of the transformed error in a simulation context, even for flood peaks (Wang et al., 2012). Exploratory tests on operational forecasts issued during recent floods in France (major spring floods in June 2016 on the Loire river tributaries and flash floods in fall 2016) will be shown and discussed. References: Bourgin, F. (2014). How to assess the predictive uncertainty in hydrological modelling? An exploratory work on a large sample of watersheds, AgroParisTech. Wang, Q. J., Shrestha, D. L., Robertson, D. E. and Pokhrel, P. (2012). A log-sinh transformation for data normalization and variance stabilization. Water Resources Research, W05514, doi:10.1029/2011WR010973.
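For reference, the log-sinh transform of Wang et al. (2012) is z = log(sinh(a + b·y))/b. A minimal sketch applying it to synthetic heteroscedastic forecast errors; the parameter values and the error model are illustrative assumptions, not the operational settings:

```python
import numpy as np

def log_sinh(y, a, b):
    """Log-sinh transform of Wang et al. (2012): z = log(sinh(a + b*y)) / b."""
    return np.log(np.sinh(a + b * y)) / b

rng = np.random.default_rng(5)
q = rng.gamma(2.0, 50.0, 5000)              # synthetic "true" discharges [m3/s]
obs = q * (1 + rng.normal(0, 0.2, q.size))  # heteroscedastic observation error
a, b = 0.01, 0.005                          # illustrative parameters (assumed)

e_raw = obs - q
e_trn = log_sinh(obs, a, b) - log_sinh(q, a, b)
lo, hi = q < np.median(q), q >= np.median(q)
print(np.std(e_raw[lo]) / np.std(e_raw[hi]))  # raw errors strongly flow-dependent
print(np.std(e_trn[lo]) / np.std(e_trn[hi]))  # ratio much closer to 1 after transform
```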
The probability distribution model of air pollution index and its dominants in Kuala Lumpur
NASA Astrophysics Data System (ADS)
AL-Dhurafi, Nasr Ahmed; Razali, Ahmad Mahir; Masseran, Nurulkamal; Zamzuri, Zamira Hasanah
2016-11-01
This paper focuses on the statistical modeling of the distributions of the air pollution index (API) and its sub-index data observed at Kuala Lumpur in Malaysia. Five pollutants or sub-indexes are measured, including carbon monoxide (CO), sulphur dioxide (SO2), nitrogen dioxide (NO2), and particulate matter (PM10). Four probability distributions are considered, namely log-normal, exponential, Gamma and Weibull, in the search for the best-fitting distribution for the Malaysian air pollutant data. In order to determine the best distribution for describing the air pollutant data, five goodness-of-fit criteria are applied. This helps minimize the uncertainty in pollution resource estimates and improve the assessment phase of planning. The conflict in criterion results for selecting the best distribution was overcome by using the weight-of-ranks method. We found that the Gamma distribution is the best distribution for the majority of air pollutant data in Kuala Lumpur.
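A sketch of this style of model selection with scipy, on synthetic data and omitting the weight-of-ranks step: fit each candidate distribution by maximum likelihood, then compare AIC and the Kolmogorov-Smirnov statistic:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
api = rng.gamma(shape=3.0, scale=18.0, size=1000)   # synthetic API-like data

candidates = {
    "lognorm": stats.lognorm,
    "expon":   stats.expon,
    "gamma":   stats.gamma,
    "weibull": stats.weibull_min,
}

scores = {}
for name, dist in candidates.items():
    params = dist.fit(api)                          # maximum-likelihood fit
    ll = np.sum(dist.logpdf(api, *params))
    aic = 2 * len(params) - 2 * ll
    ks = stats.kstest(api, dist.cdf, args=params).statistic
    scores[name] = (aic, ks)

for name, (aic, ks) in sorted(scores.items(), key=lambda kv: kv[1][0]):
    print(f"{name:8s} AIC={aic:9.1f}  KS={ks:.3f}")
```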
NASA Astrophysics Data System (ADS)
Pedretti, Daniele; Masetti, Marco; Beretta, Giovanni Pietro
2017-10-01
The expected long-term efficiency of vertical cutoff walls coupled to pump-and-treat technologies to contain solute plumes in highly heterogeneous aquifers was analyzed. A well-characterized case study in Italy, with a hydrogeological database of 471 results from hydraulic tests performed on the aquifer and the surrounding 2-km-long cement-bentonite (CB) walls, was used to build a conceptual model and assess a representative remediation site adopting coupled technologies. In the studied area, the aquifer hydraulic conductivity Ka [m/d] is log-normally distributed with mean E(Ya) = 0.32 and variance σ²(Ya) = 6.36 (Ya = ln Ka), and its spatial correlation is well described by an exponential isotropic variogram with integral scale less than 1/12 of the domain size. The hardened CB wall's hydraulic conductivity, Kw [m/d], displayed strong scaling effects and a lognormal distribution with mean E(Yw) = -3.43 and σ²(Yw) = 0.53 (Yw = log10 Kw). No spatial correlation of Kw was detected. Using this information, conservative transport was simulated across a CB wall in spatially correlated 1-D random Ya fields within a numerical Monte Carlo framework. Multiple scenarios representing different Kw values were tested. A continuous solute source with known concentration and deterministic drain discharge rates were assumed. The efficiency of the confining system was measured by the probability of concentration exceeding a threshold (C*) at a control section 10 years after the initial solute release. It was found that the stronger the aquifer heterogeneity, the higher the expected efficiency of the confinement system and the lower the likelihood of aquifer pollution. This behavior can be explained because, for the analyzed aquifer conditions, a lower Ka generates a more pronounced drawdown of the water table in the proximity of the drain and consequently a higher advective flux towards the confined area, which counteracts diffusive fluxes across the walls. Thus, a higher σ²(Ya) results in a larger number of low Ka values in the proximity of the drain, and a higher probability of not exceeding C*.
Confirmatory Factor Analysis of Ordinal Variables with Misspecified Models
ERIC Educational Resources Information Center
Yang-Wallentin, Fan; Joreskog, Karl G.; Luo, Hao
2010-01-01
Ordinal variables are common in many empirical investigations in the social and behavioral sciences. Researchers often apply the maximum likelihood method to fit structural equation models to ordinal data. This assumes that the observed measures have normal distributions, which is not the case when the variables are ordinal. A better approach is…
Three New Methods for Analysis of Answer Changes
ERIC Educational Resources Information Center
Sinharay, Sandip; Johnson, Matthew S.
2017-01-01
In a pioneering research article, Wollack and colleagues suggested the "erasure detection index" (EDI) to detect test tampering. The EDI can be used with or without a continuity correction and is assumed to follow the standard normal distribution under the null hypothesis of no test tampering. When used without a continuity correction,…
Outliers: A Potential Data Problem.
ERIC Educational Resources Information Center
Douzenis, Cordelia; Rakow, Ernest A.
Outliers, extreme data values relative to others in a sample, may distort statistics that assume interval levels of measurement and normal distribution. The outlier may be a valid value or an error. Several procedures are available for identifying outliers, and each may be applied to errors of prediction from the regression lines for utility in a…
A Bayesian Beta-Mixture Model for Nonparametric IRT (BBM-IRT)
ERIC Educational Resources Information Center
Arenson, Ethan A.; Karabatsos, George
2017-01-01
Item response models typically assume that the item characteristic (step) curves follow a logistic or normal cumulative distribution function, which are strictly monotone functions of person test ability. Such assumptions can be overly-restrictive for real item response data. We propose a simple and more flexible Bayesian nonparametric IRT model…
Hydrocarbon potential of pre-Pennsylvanian rocks in Roosevelt County, New Mexico
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pitt, W.D.
The hydrocarbon potential of pre-Pennsylvanian rocks in Roosevelt County was appraised from data available in published reports, scout tickets, lithology logs, and other well data at the log libraries in Roswell and Socorro, New Mexico, and Midland, Texas. Elevations from lithology logs were used when differing from scout tickets or other sources. Thickness and data other than lithology logs were assumed to be sufficiently accurate if they fitted the control obtained by contouring. The lithology and reservoir potential of the systems of rock that subcrop beneath the Pennsylvanian System in Roosevelt County are summarized.
Bengtsson, Henrik; Hössjer, Ola
2006-03-01
Low-level processing and normalization of microarray data are among the most important steps in microarray analysis, and they have a profound impact on downstream analysis. Multiple methods have been suggested to date, but it is not clear which is best. It is therefore important to further study the different normalization methods in detail, and the nature of microarray data in general. A methodological study of affine models for gene expression data is carried out. The focus is on two-channel comparative studies, but the findings generalize to single- and multi-channel data as well. The discussion applies to spotted as well as in-situ synthesized microarray data. Existing normalization methods such as curve-fit ("lowess") normalization, parallel and perpendicular translation normalization, and quantile normalization, but also dye-swap normalization, are revisited in the light of the affine model, and their strengths and weaknesses are investigated in this context. As a direct result of this study, we propose a robust non-parametric multi-dimensional affine normalization method, which can be applied to any number of microarrays with any number of channels, either individually or all at once. A high-quality cDNA microarray data set with spike-in controls is used to demonstrate the power of the affine model and the proposed normalization method. We find that an affine model can explain non-linear intensity-dependent systematic effects in observed log-ratios. Affine normalization removes such artifacts for non-differentially expressed genes and assures that symmetry between negative and positive log-ratios is obtained, which is fundamental when identifying differentially expressed genes. In addition, affine normalization makes the empirical distributions in different channels more equal, which is the purpose of quantile normalization, and may also explain why dye-swap normalization works or fails. All methods are made available in the aroma package, a platform-independent package for R.
Workload Characterization and Performance Implications of Large-Scale Blog Servers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jeon, Myeongjae; Kim, Youngjae; Hwang, Jeaho
With the ever-increasing popularity of social network services (SNSs), an understanding of the characteristics of these services and their effects on the behavior of their host servers is critical. However, there has been a lack of research on the workload characterization of servers running SNS applications such as blog services. To fill this void, we empirically characterized real-world web server logs collected from one of the largest South Korean blog hosting sites for 12 consecutive days. The logs consist of more than 96 million HTTP requests and 4.7 TB of network traffic. Our analysis reveals the following: (i) the transfer size of non-multimedia files and blog articles can be modeled using a truncated Pareto distribution and a log-normal distribution, respectively; (ii) user access to blog articles does not show temporal locality, but is strongly biased towards articles posted with image or audio files. We additionally discuss the potential performance improvement from clustering small files on a blog page into contiguous disk blocks, which benefits from the observed file access patterns. Trace-driven simulations show that, on average, the suggested approach achieves 60.6% better system throughput and reduces the processing time for file access by 30.8% compared to the best performance of the Ext4 file system.
IMPLEMENTING A NOVEL CYCLIC CO2 FLOOD IN PALEOZOIC REEFS
DOE Office of Scientific and Technical Information (OSTI.GOV)
James R. Wood; W. Quinlan; A. Wylie
2003-07-01
Recycled CO2 will be used in this demonstration project to produce bypassed oil from the Silurian Charlton 6 pinnacle reef (Otsego County) in the Michigan Basin. Contract negotiations by our industry partner to gain access to this CO2, which would otherwise be vented to the atmosphere, are near completion. A new method of subsurface characterization, log curve amplitude slicing, is being used to map facies distributions and reservoir properties in two reefs, the Belle River Mills and Chester 18 fields. The Belle River Mills and Chester 18 fields are being used as type fields because they have excellent log-curve and core data coverage. Amplitude slicing of the normalized gamma ray curves is showing trends that may indicate significant heterogeneity and compartmentalization in these reservoirs. Digital and hard copy data continue to be compiled for the Niagaran reefs in the Michigan Basin. Technology transfer took place through technical presentations regarding the log curve amplitude slicing technique and a booth at the Midwest PTTC meeting.
Sakashita, Tetsuya; Hamada, Nobuyuki; Kawaguchi, Isao; Hara, Takamitsu; Kobayashi, Yasuhiko; Saito, Kimiaki
2014-05-01
A single cell can form a colony, and ionizing irradiation has long been known to reduce such cellular clonogenic potential. Analysis of abortive colonies unable to continue to grow should provide important information on the reproductive cell death (RCD) following irradiation. Our previous analysis with a branching process model showed that RCD in normal human fibroblasts can persist over 16 generations following irradiation with low linear energy transfer (LET) γ-rays. Here we further set out to evaluate RCD persistency in abortive colonies arising from normal human fibroblasts exposed to high-LET carbon ions (18.3 MeV/u, 108 keV/µm). We found that the abortive colony size distribution determined by biological experiments follows a linear relationship on a log-log plot, and that a Monte Carlo simulation using the RCD probability estimated from this linear relationship reproduces well the experimentally determined surviving fraction and relative biological effectiveness (RBE). We identified a short-term phase and a long-term phase for the persistent RCD following carbon-ion irradiation, similar to those previously identified following γ-irradiation. Taken together, our results suggest that subsequent secondary or tertiary colony formation would be invaluable for understanding the long-lasting RCD. Overall, our framework for analysis with a branching process model and a colony formation assay is applicable to determining cellular responses to low- and high-LET radiation, and suggests that the long-lasting RCD is a pivotal determinant of the surviving fraction and the RBE.
Daily magnesium intake and serum magnesium concentration among Japanese people.
Akizawa, Yoriko; Koizumi, Sadayuki; Itokawa, Yoshinori; Ojima, Toshiyuki; Nakamura, Yosikazu; Tamura, Tarou; Kusaka, Yukinori
2008-01-01
The vitamins and minerals that are deficient in the daily diet of a normal adult remain unknown. To address this question, we conducted a population survey focusing on the relationship between dietary magnesium intake and serum magnesium level. The subjects were 62 individuals from Fukui Prefecture who participated in the 1998 National Nutrition Survey. The survey investigated the physical status, nutritional status, and dietary data of the subjects. Holidays and special occasions were avoided, and a day when people were most likely to be on an ordinary diet was selected as the survey date. The mean (+/- standard deviation) daily magnesium intake was 322 (+/-132), 323 (+/-163), and 322 (+/-147) mg/day for men, women, and the entire group, respectively. The mean (+/- standard deviation) serum magnesium concentration was 20.69 (+/-2.83), 20.69 (+/-2.88), and 20.69 (+/-2.83) ppm for men, women, and the entire group, respectively. The distribution of serum magnesium concentration was normal. Dietary magnesium intake showed a log-normal distribution, which was therefore transformed by logarithmic conversion before examining the regression coefficients. The regression line of serum magnesium concentration (Y, ppm) on daily magnesium intake (X, mg) was Y = 4.93 log10(X) + 8.49, with a coefficient of correlation (r) of 0.29. The regression line of daily magnesium intake (Y, mg) on serum magnesium concentration (X, ppm) was Y = 14.65X + 19.31, with a coefficient of correlation of 0.28. The daily magnesium intake correlated with serum magnesium concentration, and a linear regression model between them was proposed.
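As a quick arithmetic check on the reported fit, the regression line can be evaluated directly; at the group mean intake of 322 mg/day it returns about 20.85 ppm, close to the observed mean serum concentration of 20.69 ppm. A minimal sketch:

```python
import numpy as np

# Regression reported in the abstract: serum Mg (ppm) vs daily intake (mg)
def predict_serum(intake_mg):
    return 4.93 * np.log10(intake_mg) + 8.49

for intake in (100, 322, 600):          # 322 mg/day was the group mean
    print(intake, round(predict_serum(intake), 2))
# At the mean intake of 322 mg/day the line gives ~20.85 ppm, close to the
# observed mean serum concentration of 20.69 ppm.
```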
A Bayesian Hybrid Adaptive Randomisation Design for Clinical Trials with Survival Outcomes.
Moatti, M; Chevret, S; Zohar, S; Rosenberger, W F
2016-01-01
Response-adaptive randomisation designs have been proposed to improve the efficiency of phase III randomised clinical trials and improve the outcomes of the clinical trial population. In the setting of failure time outcomes, Zhang and Rosenberger (2007) developed a response-adaptive randomisation approach that targets an optimal allocation, based on a fixed sample size. The aim of this research is to propose a response-adaptive randomisation procedure for survival trials with an interim monitoring plan, based on the following optimality criterion: for fixed variance of the estimated log hazard ratio, what allocation minimizes the expected hazard of failure? We demonstrate the utility of the design by redesigning a clinical trial on multiple myeloma. To handle continuous monitoring of data, we propose a Bayesian response-adaptive randomisation procedure, where the log hazard ratio is the effect measure of interest. Combining the prior with the normal likelihood, the posterior mean estimate of the log hazard ratio allows derivation of the optimal target allocation. We perform a simulation study to assess and compare the performance of this proposed Bayesian hybrid adaptive design to those of fixed, sequential or adaptive - either frequentist or fully Bayesian - designs. Non-informative normal priors for the log hazard ratio were used, as well as mixtures of enthusiastic and skeptical priors. Stopping rules based on the posterior distribution of the log hazard ratio were computed. The method is then illustrated by redesigning a phase III randomised clinical trial of chemotherapy in patients with multiple myeloma, with mixtures of normal priors elicited from experts. As expected, there was a reduction in the proportion of observed deaths in the adaptive vs. non-adaptive designs; this reduction was maximized using a Bayes mixture prior, with no clear-cut improvement from using a fully Bayesian procedure. The use of stopping rules allows a slight decrease in the observed proportion of deaths under the alternative hypothesis compared with the adaptive designs without stopping rules. Such Bayesian hybrid adaptive survival trials may be promising alternatives to traditional designs, reducing the duration of survival trials as well as addressing the ethical concerns for patients enrolled in the trial.
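A minimal sketch of the conjugate step at the heart of such a design: a normal prior on the log hazard ratio combined with a normal likelihood for the interim estimate gives a closed-form posterior, whose mean can then drive the target allocation. All numbers are illustrative:

```python
import numpy as np

def posterior_log_hr(prior_mean, prior_var, est_log_hr, est_var):
    """Conjugate normal update for a log hazard ratio estimate."""
    precision = 1 / prior_var + 1 / est_var
    mean = (prior_mean / prior_var + est_log_hr / est_var) / precision
    return mean, 1 / precision

# Interim data: estimated log HR and its variance (illustrative values)
est, var = np.log(0.75), 0.04
priors = {"non-informative": (0.0, 100.0),
          "skeptical":       (0.0, 0.04),
          "enthusiastic":    (np.log(0.6), 0.04)}
for name, (m0, v0) in priors.items():
    m, v = posterior_log_hr(m0, v0, est, var)
    # The allocation would then favour the arm indicated by the posterior mean HR
    print(f"{name:15s} posterior HR = {np.exp(m):.2f} (sd {np.sqrt(v):.2f})")
```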
Analytical approximations for effective relative permeability in the capillary limit
NASA Astrophysics Data System (ADS)
Rabinovich, Avinoam; Li, Boxiao; Durlofsky, Louis J.
2016-10-01
We present an analytical method for calculating two-phase effective relative permeability, krjeff, where j designates phase (here CO2 and water), under steady state and capillary-limit assumptions. These effective relative permeabilities may be applied in experimental settings and for upscaling in the context of numerical flow simulations, e.g., for CO2 storage. An exact solution for effective absolute permeability, keff, in two-dimensional log-normally distributed isotropic permeability (k) fields is the geometric mean. We show that this does not hold for krjeff since log normality is not maintained in the capillary-limit phase permeability field (Kj=k·krj) when capillary pressure, and thus the saturation field, is varied. Nevertheless, the geometric mean is still shown to be suitable for approximating krjeff when the variance of lnk is low. For high-variance cases, we apply a correction to the geometric average gas effective relative permeability using a Winsorized mean, which neglects large and small Kj values symmetrically. The analytical method is extended to anisotropically correlated log-normal permeability fields using power law averaging. In these cases, the Winsorized mean treatment is applied to the gas curves for cases described by negative power law exponents (flow across incomplete layers). The accuracy of our analytical expressions for krjeff is demonstrated through extensive numerical tests, using low-variance and high-variance permeability realizations with a range of correlation structures. We also present integral expressions for geometric-mean and power law average krjeff for the systems considered, which enable derivation of closed-form series solutions for krjeff without generating permeability realizations.
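The averaging operations named above are compact to state in code. Below is a sketch with an assumed high-variance log-normal sample rather than the paper's permeability realizations; the power-law average recovers the harmonic, geometric and arithmetic means at p = -1, 0, +1, and a Winsorized geometric mean stands in for the symmetric trimming applied to the gas curves:

```python
import numpy as np
from scipy.stats.mstats import winsorize

rng = np.random.default_rng(7)
k = np.exp(rng.normal(0.0, 1.5, 10000))   # log-normal permeability sample (assumed)

def power_avg(k, p):
    """Power-law average of k; p -> 0 recovers the geometric mean."""
    return np.exp(np.log(k).mean()) if p == 0 else np.mean(k ** p) ** (1 / p)

for p in (-1.0, -0.5, 0.0, 0.5, 1.0):     # -1: harmonic ... +1: arithmetic
    print(f"p = {p:+.1f}  keff = {power_avg(k, p):.3f}")

# A Winsorized geometric mean, clipping both tails of ln(k) symmetrically:
print(np.exp(winsorize(np.log(k), limits=(0.1, 0.1)).mean()))
```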
Chenglin, L.; Charpentier, R.R.
2010-01-01
The U.S. Geological Survey procedure for estimating the general form of a parent distribution requires that the parameters of the log-geometric distribution be calculated and analyzed for their sensitivity to different conditions. In this study, we derive the shape factor of a log-geometric distribution from the ratio of frequencies between adjacent bins. The shape factor has a log straight-line relationship with the ratio of frequencies. Additionally, equations for the ratio of the mean size to the lower size-class boundary are deduced. For a specific log-geometric distribution, this ratio is constant. We apply our analysis to simulations based on oil and gas pool distributions from four petroleum systems of Alberta, Canada, and four generated distributions. Each petroleum system in Alberta has a different shape factor. Generally, the shape factors in the four petroleum systems stabilize as the number of discovered pools increases. For a log-geometric distribution, the shape factor becomes stable when the number of discovered pools exceeds 50, and the shape factor is influenced by the exploration efficiency when the exploration efficiency is less than 1. The simulation results show that calculated shape factors increase with those of the parent distributions, and undiscovered oil and gas resources estimated through log-geometric distribution extrapolation are smaller than the actual values. © 2010 International Association for Mathematical Geology.
Kennedy, Paula L; Woodbury, Allan D
2002-01-01
In ground water flow and transport modeling, the heterogeneous nature of porous media has a considerable effect on the resulting flow and solute transport. Some method of generating the heterogeneous field from a limited dataset of uncertain measurements is required. Bayesian updating is one such method; it interpolates from an uncertain dataset using the statistics of the underlying probability distribution function. In this paper, Bayesian updating was used to determine the heterogeneous natural-log transmissivity field for a carbonate and a sandstone aquifer in southern Manitoba. The transmissivity in m²/sec followed a log-normal distribution for both aquifers, with a mean of ln T of -7.2 and -8.0 for the carbonate and sandstone aquifers, respectively. The variograms were calculated using an estimator developed by Li and Lake (1994). Fractal behavior was not evident in the variogram from either aquifer. The Bayesian updating heterogeneous field provided good results even in cases where little data were available. A large-transmissivity zone in the sandstone aquifer was created by the Bayesian procedure; this is not a reflection of any deterministic consideration, but a natural outcome of updating a prior probability distribution function with observations. The statistical model returns a result that is very reasonable, i.e., homogeneous in regions where little or no information is available to alter the initial state. No long-range correlation trends or fractal behavior of the log-transmissivity field were observed in either aquifer over a distance of about 300 km.
Linear energy transfer incorporated intensity modulated proton therapy optimization
NASA Astrophysics Data System (ADS)
Cao, Wenhua; Khabazian, Azin; Yepes, Pablo P.; Lim, Gino; Poenisch, Falk; Grosshans, David R.; Mohan, Radhe
2018-01-01
The purpose of this study was to investigate the feasibility of incorporating linear energy transfer (LET) into the optimization of intensity modulated proton therapy (IMPT) plans. Because increased LET correlates with increased biological effectiveness of protons, high LET in target volumes and low LET in critical structures and normal tissues are preferred in an IMPT plan. However, if not explicitly incorporated into the optimization criteria, different IMPT plans may yield similar physical dose distributions but greatly different LET, specifically dose-averaged LET, distributions. Conventionally, the IMPT optimization criteria (or cost function) include only dose-based objectives in which the relative biological effectiveness (RBE) is assumed to have a constant value of 1.1. In this study, we added LET-based objectives for maximizing LET in target volumes and minimizing LET in critical structures and normal tissues. Due to the fractional programming nature of the resulting model, we used a variable reformulation approach so that the optimization process is computationally equivalent to conventional IMPT optimization. In this study, five brain tumor patients who had been treated with proton therapy at our institution were selected. Two plans were created for each patient based on the proposed LET-incorporated optimization (LETOpt) and the conventional dose-based optimization (DoseOpt). The optimized plans were compared in terms of both dose (assuming a constant RBE of 1.1 as adopted in clinical practice) and LET. Both optimization approaches were able to generate comparable dose distributions. The LET-incorporated optimization achieved not only a pronounced reduction of LET values in critical organs, such as the brainstem and optic chiasm, but also increased LET in target volumes, compared to the conventional dose-based optimization. On occasion, however, there was a need to trade off the acceptability of dose and LET distributions. Our conclusion is that the inclusion of LET-dependent criteria in IMPT optimization can lead to dose distributions similar to those of conventional optimization but superior LET distributions in target volumes and normal tissues. This may have substantial advantages in improving tumor control and reducing normal tissue toxicities.
The price momentum of stock in distribution
NASA Astrophysics Data System (ADS)
Liu, Haijun; Wang, Longfei
2018-02-01
In this paper, a new momentum of a stock in distribution is proposed and applied in real investment. Firstly, assuming that a stock behaves as a multi-particle system, its share-exchange distribution and cost distribution are introduced. Secondly, an estimation of the share-exchange distribution from daily transaction data is given, using the 3σ rule of the normal distribution. Meanwhile, an iterative method is given to estimate the cost distribution. Based on the cost distribution, a new momentum is proposed for the stock system. Thirdly, an empirical test is given to compare the new momentum with others via a contrarian strategy. The result shows that the new momentum outperforms the others in many respects. Furthermore, an entropy of the stock is introduced according to its cost distribution.
Parametric vs. non-parametric statistics of low resolution electromagnetic tomography (LORETA).
Thatcher, R W; North, D; Biver, C
2005-01-01
This study compared the relative statistical sensitivity of non-parametric and parametric statistics of 3-dimensional current sources as estimated by the EEG inverse solution Low Resolution Electromagnetic Tomography (LORETA). One would expect approximately 5% false positives (classification of a normal as abnormal) at the P < .025 level of probability (two-tailed test) and approximately 1% false positives at the P < .005 level. EEG digital samples (2-second intervals sampled at 128 Hz, 1 to 2 minutes eyes closed) from 43 normal adult subjects were imported into the Key Institute's LORETA program. We then used the Key Institute's cross-spectrum and LORETA output files (*.lor) as the 2,394 gray-matter-pixel representation of 3-dimensional currents at different frequencies. The mean and standard deviation *.lor files were computed over the 43 subjects for each of the 2,394 gray matter pixels. Tests of Gaussianity and different transforms were computed in order to best approximate a normal distribution for each frequency and gray matter pixel. The relative sensitivity of parametric vs. non-parametric statistics was compared using a "leave-one-out" cross-validation method in which individual normal subjects were withdrawn and then statistically classified as either normal or abnormal based on the remaining subjects. Log10 transforms approximated a Gaussian distribution with 95% to 99% accuracy. Parametric Z score tests at P < .05 cross-validation demonstrated an average misclassification rate of approximately 4.25%, with a range over the 2,394 gray matter pixels of 0.11% to 27.66%. At P < .01, parametric Z score cross-validation false positives averaged 0.26% and ranged from 0% to 6.65%. The non-parametric Key Institute's t-max statistic at P < .05 had an average misclassification error rate of 7.64% and ranged from 0.04% to 43.37% false positives. The non-parametric t-max at P < .01 had an average misclassification rate of 6.67% and ranged from 0% to 41.34% false positives over the 2,394 gray matter pixels for any cross-validated normal subject. In conclusion, adequate approximation to a Gaussian distribution and high cross-validation accuracy can be achieved with the Key Institute's LORETA programs by using a log10 transform and parametric statistics, and parametric normative comparisons had lower false positive rates than the non-parametric tests.
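A schematic version of the leave-one-out procedure, with synthetic log-normal data standing in for the LORETA current sources; the 43 x 2,394 dimensions follow the abstract, everything else is an assumption:

```python
import numpy as np

rng = np.random.default_rng(8)
X = np.exp(rng.normal(0, 0.5, size=(43, 2394)))   # skewed "current source" data
X = np.log10(X)                                   # log10 -> approximately Gaussian

z_crit = 1.96                                     # two-tailed P < .05
false_pos = 0.0
for i in range(X.shape[0]):                       # leave-one-out cross-validation
    rest = np.delete(X, i, axis=0)
    z = (X[i] - rest.mean(0)) / rest.std(0, ddof=1)
    false_pos += np.mean(np.abs(z) > z_crit)      # normal subject flagged abnormal
print(false_pos / X.shape[0])                     # ~5% expected by construction
```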
The Italian primary school-size distribution and the city-size: a complex nexus
NASA Astrophysics Data System (ADS)
Belmonte, Alessandro; di Clemente, Riccardo; Buldyrev, Sergey V.
2014-06-01
We characterize the statistical law according to which Italian primary school sizes are distributed. We find that school size can be approximated by a log-normal distribution, with a fat lower tail that collects a large number of very small schools. The upper tail of the school-size distribution decreases exponentially, and the growth rates are distributed with a Laplace PDF. These distributions are similar to those observed for firms and are consistent with a Bose-Einstein preferential attachment process. The body of the distribution features a bimodal shape, suggesting some source of heterogeneity in school organization that we uncover by an in-depth analysis of the relation between school size and city size. We propose a novel cluster methodology and a new spatial interaction approach among schools which outline the variety of policies implemented in Italy. Different regional policies are also discussed, shedding light on the relation between policy and geographical features.
Likelihood-based confidence intervals for estimating floods with given return periods
NASA Astrophysics Data System (ADS)
Martins, Eduardo Sávio P. R.; Clarke, Robin T.
1993-06-01
This paper discusses aspects of the calculation of likelihood-based confidence intervals for T-year floods, with particular reference to (1) the two-parameter gamma distribution; (2) the Gumbel distribution; (3) the two-parameter log-normal distribution, and other distributions related to the normal by Box-Cox transformations. Calculation of the confidence limits is straightforward using the Nelder-Mead algorithm with a constraint incorporated, although care is necessary to ensure convergence either of the Nelder-Mead algorithm or of the Newton-Raphson calculation of maximum-likelihood estimates. Methods are illustrated using records from 18 gauging stations in the basin of the River Itajai-Acu, State of Santa Catarina, southern Brazil. A small and restricted simulation compared likelihood-based confidence limits with those given by use of the central limit theorem; for the same confidence probability, the likelihood-based confidence limits were wider than those from the central limit theorem, and the latter failed more frequently to contain the true quantile being estimated. The paper discusses possible applications of likelihood-based confidence intervals in other areas of hydrological analysis.
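A sketch of a likelihood-based (profile-likelihood) confidence interval for a T-year flood under the Gumbel distribution, using Nelder-Mead for the fit as in the paper; the synthetic record, return period and search brackets are assumptions:

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(9)
flows = stats.gumbel_r.rvs(loc=500, scale=120, size=60, random_state=rng)
T = 100                                    # return period [years]
p = 1 - 1 / T

def nll(theta):
    loc, scale = theta
    scale = abs(scale)                     # keep Nelder-Mead away from scale <= 0
    return -np.sum(stats.gumbel_r.logpdf(flows, loc, scale))

mle = optimize.minimize(nll, x0=(flows.mean(), flows.std()), method="Nelder-Mead")
q_hat = stats.gumbel_r.ppf(p, *mle.x)

def profile_nll(q):
    # Reparameterize so the T-year quantile q is a parameter: loc = q - scale*z
    z = -np.log(-np.log(p))                # Gumbel reduced variate
    res = optimize.minimize_scalar(
        lambda s: nll((q - s * z, s)), bounds=(1e-3, 1e4), method="bounded")
    return res.fun

# 95% profile-likelihood interval: where the deviance rises by chi2(1, 0.95)
cut = mle.fun + stats.chi2.ppf(0.95, 1) / 2
lo = optimize.brentq(lambda q: profile_nll(q) - cut, q_hat - 500, q_hat)
hi = optimize.brentq(lambda q: profile_nll(q) - cut, q_hat, q_hat + 800)
print(f"{T}-year flood: {q_hat:.0f}  95% CI ({lo:.0f}, {hi:.0f})")
```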
2012-01-01
Background: The goals of our study are to determine the most appropriate model for alcohol consumption as an exposure for burden of disease, to analyze the effect of the chosen alcohol consumption distribution on the estimation of the alcohol Population-Attributable Fractions (PAFs), and to characterize the chosen alcohol consumption distribution by exploring whether there is a global relationship within the distribution. Methods: To identify the best model, the Log-Normal, Gamma, and Weibull prevalence distributions were examined using data from 41 surveys from Gender, Alcohol and Culture: An International Study (GENACIS) and from the European Comparative Alcohol Study. To assess the effect of these distributions on the estimated alcohol PAFs, we calculated the alcohol PAF for diabetes, breast cancer, and pancreatitis using the three above-named distributions and using the more traditional approach based on categories. The relationship between the mean and the standard deviation of the Gamma distribution was estimated using data from 851 datasets for 66 countries from GENACIS and from the STEPwise approach to Surveillance from the World Health Organization. Results: The Log-Normal distribution provided a poor fit for the survey data, with the Gamma and Weibull distributions providing better fits. Additionally, our analyses showed that there were no marked differences in the alcohol PAF estimates based on the Gamma or Weibull distributions compared to PAFs based on categorical alcohol consumption estimates. The standard deviation of the alcohol distribution was highly dependent on the mean, with a one-unit increase in mean consumption associated with an increase in the standard deviation of 1.258 (95% CI: 1.223 to 1.293) (R² = 0.9207) for women and 1.171 (95% CI: 1.144 to 1.197) (R² = 0.9474) for men. Conclusions: Although the Gamma distribution and the Weibull distribution provided similar results, the Gamma distribution is recommended to model alcohol consumption from population surveys due to its fit, flexibility, and the ease with which it can be modified. The results showed that a large degree of the variance of the standard deviation of the alcohol consumption Gamma distribution was explained by the mean alcohol consumption, allowing alcohol consumption to be modeled through a Gamma distribution using only average consumption. PMID:22490226
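The practical value of the mean-SD relationship is that a full Gamma exposure distribution can be reconstructed from average consumption alone. A minimal sketch using the men's slope reported above; the 20 g/day mean and the 60 g/day threshold are illustrative assumptions:

```python
import numpy as np
from scipy import stats

def gamma_from_mean(mean, sd_per_mean):
    """Gamma parameters when the SD is proportional to the mean."""
    sd = sd_per_mean * mean
    shape = (mean / sd) ** 2          # k = mean^2 / var
    scale = sd ** 2 / mean            # theta = var / mean
    return shape, scale

# Illustrative: average consumption 20 g/day, using the men's slope of 1.171
shape, scale = gamma_from_mean(20.0, 1.171)
dist = stats.gamma(shape, scale=scale)
print(dist.mean(), dist.std())        # recovers 20 and 23.42
print(1 - dist.cdf(60))               # e.g. share drinking more than 60 g/day
```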
Gallagher, Daniel; Ebel, Eric D; Gallagher, Owen; Labarre, David; Williams, Michael S; Golden, Neal J; Pouillot, Régis; Dearfield, Kerry L; Kause, Janell
2013-04-01
This report illustrates how the uncertainty about food safety metrics may influence the selection of a performance objective (PO). To accomplish this goal, we developed a model concerning Listeria monocytogenes in ready-to-eat (RTE) deli meats. This application used a second-order Monte Carlo model that simulates L. monocytogenes concentrations through a series of steps: the food-processing establishment, transport, retail, the consumer's home and consumption. The model accounted for growth inhibitor use and retail cross contamination, and applied an FAO/WHO dose response model for evaluating the probability of illness. An appropriate level of protection (ALOP) risk metric was selected as the average risk of illness per serving across all consumed servings per annum, and the model was used to solve for the corresponding performance objective (PO) risk metric as the maximum allowable L. monocytogenes concentration (cfu/g) at the processing establishment where regulatory monitoring would occur. Given uncertainty about model inputs, an uncertainty distribution of the PO was estimated. Additionally, we considered how RTE deli meats contaminated at levels above the PO would be handled by the industry, using three alternative approaches. Points on the PO distribution represent the probability that - if the industry complies with a particular PO - the resulting risk per serving is less than or equal to the target ALOP. For example, assuming (1) a target ALOP of -6.41 log10 risk of illness per serving, (2) industry concentrations above the PO that are re-distributed throughout the remaining concentration distribution and (3) no dose response uncertainty, establishment POs of -4.98 and -4.39 log10 cfu/g would be required for 90% and 75% confidence that the target ALOP is met, respectively. The PO concentrations from this example scenario are more stringent than the current typical monitoring level of an absence in 25 g (i.e., -1.40 log10 cfu/g) or a stricter criterion of absence in 125 g (i.e., -2.1 log10 cfu/g). This example, and others, demonstrates that a PO for L. monocytogenes would be far below any current monitoring capabilities. Furthermore, this work highlights the demands placed on risk managers and risk assessors when applying uncertain risk models to the current risk metric framework. Copyright © 2013 Elsevier B.V. All rights reserved.
Murase, Kenya; Konishi, Takashi; Takeuchi, Yuki; Takata, Hiroshige; Saito, Shigeyoshi
2013-07-01
Our purpose in this study was to investigate the behavior of signal harmonics in magnetic particle imaging (MPI) by experimental and simulation studies. In the experimental studies, we made an apparatus for MPI in which both a drive magnetic field (DMF) and a selection magnetic field (SMF) were generated with a Maxwell coil pair. The MPI signals from magnetic nanoparticles (MNPs) were detected with a solenoid coil. The odd- and even-numbered harmonics were calculated by Fourier transformation with or without background subtraction. The particle size of the MNPs was measured by transmission electron microscopy (TEM), dynamic light-scattering, and X-ray diffraction methods. In the simulation studies, the magnetization and particle size distribution of MNPs were assumed to obey the Langevin theory of paramagnetism and a log-normal distribution, respectively. The odd- and even-numbered harmonics were calculated by Fourier transformation under various conditions of DMF and SMF and for three different particle sizes. The behavior of the harmonics largely depended on the size of the MNPs. When we used the particle size obtained from the TEM image, the simulation results were most similar to the experimental results. The similarity between the experimental and simulation results for the even-numbered harmonics was better than that for the odd-numbered harmonics. This was considered to be due to the fact that the odd-numbered harmonics were more sensitive to background subtraction than were the even-numbered harmonics. This study will be useful for a better understanding, optimization, and development of MPI and for designing MNPs appropriate for MPI.
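A minimal simulation of the harmonic structure, our own sketch rather than the authors' apparatus model: a Langevin magnetization driven by a sinusoidal field plus a static selection-field offset produces odd harmonics, and the offset switches on the even ones. The field amplitudes and Langevin scale factor are assumed values:

```python
import numpy as np

def langevin(x):
    """Langevin function coth(x) - 1/x, with the small-argument limit x/3."""
    out = x / 3
    big = np.abs(x) > 1e-4
    out[big] = 1 / np.tanh(x[big]) - 1 / x[big]
    return out

mu0_H_drive = 10e-3    # drive-field amplitude [T] (illustrative)
mu0_H_select = 5e-3    # static selection-field offset [T] (illustrative)
xi_per_T = 2500.0      # Langevin argument per tesla for large MNPs (assumed)

t = np.linspace(0, 1, 4096, endpoint=False)          # one drive period
H = mu0_H_drive * np.sin(2 * np.pi * t) + mu0_H_select
M = langevin(xi_per_T * H)

spec = np.abs(np.fft.rfft(M)) / t.size
print("odd  harmonics f1,f3,f5:", spec[[1, 3, 5]])
print("even harmonics f2,f4,f6:", spec[[2, 4, 6]])   # nonzero only with offset
```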
Daily estimates of soil ingestion in children.
Stanek, E J; Calabrese, E J
1995-01-01
Soil ingestion estimates play an important role in risk assessment of contaminated sites, and estimates of soil ingestion in children are of special interest. Current estimates of soil ingestion are trace-element specific and vary widely among elements. Although expressed as daily estimates, the actual estimates have been constructed by averaging soil ingestion over a study period of several days. The wide variability has resulted in uncertainty as to which method of estimating soil ingestion is best. We developed a methodology for calculating a single estimate of soil ingestion for each subject for each day. Because the daily soil ingestion estimate represents the median of eligible daily trace-element-specific soil ingestion estimates for each child, this median estimate is not trace-element specific. Summary estimates for individuals and weeks are calculated using these daily estimates. Using this methodology, the median daily soil ingestion estimate for 64 children participating in the 1989 Amherst soil ingestion study is 13 mg/day or less for 50% of the children and 138 mg/day or less for 95% of the children. Mean soil ingestion estimates (for up to an 8-day period) were 45 mg/day or less for 50% of the children, whereas 95% of the children had a mean soil ingestion of 208 mg/day or less. Daily soil ingestion estimates were subsequently used to estimate the mean and variance in soil ingestion for each child and to extrapolate a soil ingestion distribution over a year, assuming that soil ingestion followed a log-normal distribution. PMID:7768230
Makedonska, Nataliia; Hyman, Jeffrey D.; Karra, Satish; ...
2016-08-01
The apertures of natural fractures in fractured rock are highly heterogeneous. However, in-fracture aperture variability is often neglected in flow and transport modeling, and individual fractures are assumed to have a uniform aperture distribution. The relative importance of in-fracture variability in flow and transport modeling within kilometer-scale fracture networks has been under debate for a long time, since the flow in each single fracture is controlled not only by in-fracture variability but also by boundary conditions. Computational limitations have previously prohibited researchers from investigating the relative importance of in-fracture variability in flow and transport modeling within large-scale fracture networks. We address this question by incorporating internal heterogeneity of individual fractures into flow simulations within kilometer-scale three-dimensional fracture networks, where the fracture intensity, P32 (ratio between total fracture area and domain volume), is between 0.027 and 0.031 [1/m]. The recently developed discrete fracture network (DFN) simulation capability, dfnWorks, is used to generate kilometer-scale DFNs that include in-fracture aperture variability represented by a stationary log-normal stochastic field with various correlation lengths and variances. The Lagrangian transport parameters - non-reacting travel time and cumulative retention - are calculated along particle streamlines. It is observed that, due to local flow channeling, early particle travel times are more sensitive to in-fracture aperture variability than the tails of the travel time distributions, where no significant effect of the in-fracture aperture variations and spatial correlation length is observed.
Summary: Background. It is widely accepted that substances that cannot penetrate the skin will not be sensitisers. Thresholds based on relevant physicochemical parameters, such as LogKow > 1 and MW < 500, are assumed and widely accepted as self-evident truths. Objective...
NASA Technical Reports Server (NTRS)
Podwysocki, M. H.
1974-01-01
Two study areas in a cratonic platform underlain by flat-lying sedimentary rocks were analyzed to determine if a quantitative relationship exists between fracture trace patterns and their frequency distributions and subsurface structural closures which might contain petroleum. Fracture trace lengths and frequency (number of fracture traces per unit area) were analyzed by trend surface analysis and length frequency distributions also were compared to a standard Gaussian distribution. Composite rose diagrams of fracture traces were analyzed using a multivariate analysis method which grouped or clustered the rose diagrams and their respective areas on the basis of the behavior of the rays of the rose diagram. Analysis indicates that the lengths of fracture traces are log-normally distributed according to the mapping technique used. Fracture trace frequency appeared higher on the flanks of active structures and lower around passive reef structures. Fracture trace log-mean lengths were shorter over several types of structures, perhaps due to increased fracturing and subsequent erosion. Analysis of rose diagrams using a multivariate technique indicated lithology as the primary control for the lower grouping levels. Groupings at higher levels indicated that areas overlying active structures may be isolated from their neighbors by this technique while passive structures showed no differences which could be isolated.
The Statistical Nature of Fatigue Crack Propagation
1977-03-01
AFFDL-TR report: The Statistical Nature of Fatigue Crack Propagation. D. A. Virkler, B. M. Hillberry, P. K. Goel. School ... as a function of crack length was best represented by the three-parameter log-normal distribution. Six growth rate calculation methods were investigated and the ... dN, which varied moderately as a function of crack length; replicate a vs. N data were predicted. This predicted data reproduced the mean behavior but ...
Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data.
Tekwe, Carmen D; Carroll, Raymond J; Dabney, Alan R
2012-08-01
Protein abundance in quantitative proteomics is often based on observed spectral features derived from liquid chromatography mass spectrometry (LC-MS) or LC-MS/MS experiments. Peak intensities are largely non-normal in distribution. Furthermore, LC-MS-based proteomics data frequently have large proportions of missing peak intensities due to censoring mechanisms on low-abundance spectral features. Recognizing that the observed peak intensities detected with the LC-MS method are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Various standard statistical techniques, including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests, and the parametric survival model and accelerated failure time (AFT) model with log-normal, log-logistic and Weibull distributions, were used to detect differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated datasets. Survival methods generally have greater statistical power than standard differential expression methods when the proportion of missing protein level data is 5% or more. In particular, the AFT models we consider consistently achieve greater statistical power than standard testing procedures, with the discrepancy widening with increasing proportions of missingness. The testing procedures discussed in this article can all be performed using readily available software such as R. The R codes are provided as supplemental materials. ctekwe@stat.tamu.edu.
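A sketch of the censored-likelihood idea using a left-censored log-normal (Tobit-style) model, one simple relative of the AFT variants discussed above; the detection limit, effect size and sample size are assumptions:

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(10)
n = 300
group = np.repeat([0, 1], n // 2)
log_i = 2.0 + 0.5 * group + rng.normal(0, 1.0, n)   # true log-intensities
lod = 2.2                                           # detection limit (assumed)
obs = np.where(log_i > lod, log_i, np.nan)          # left-censored below the LOD

def neg_ll(theta):
    b0, b1, sig = theta
    sig = abs(sig)
    mu = b0 + b1 * group
    seen = ~np.isnan(obs)
    ll = stats.norm.logpdf(obs[seen], mu[seen], sig).sum()
    ll += stats.norm.logcdf((lod - mu[~seen]) / sig).sum()   # censored terms
    return -ll

res = optimize.minimize(neg_ll, x0=(1.0, 0.0, 1.0), method="Nelder-Mead")
print(res.x)   # roughly recovers (2.0, 0.5, 1.0) despite heavy censoring
```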
NASA Astrophysics Data System (ADS)
Abreu-Vicente, J.; Kainulainen, J.; Stutz, A.; Henning, Th.; Beuther, H.
2015-09-01
We present the first study of the relationship between the column density distribution of molecular clouds within nearby Galactic spiral arms and their evolutionary status as measured from their stellar content. We analyze a sample of 195 molecular clouds located at distances below 5.5 kpc, identified from the ATLASGAL 870 μm data. We define three evolutionary classes within this sample: starless clumps, star-forming clouds with associated young stellar objects, and clouds associated with H ii regions. We find that the N(H2) probability density functions (N-PDFs) of these three classes of objects are clearly different: the N-PDFs of starless clumps are narrowest and close to log-normal in shape, while star-forming clouds and H ii regions exhibit a power-law shape over a wide range of column densities and log-normal-like components only at low column densities. We use the N-PDFs to estimate the evolutionary time-scales of the three classes of objects based on a simple analytic model from literature. Finally, we show that the integral of the N-PDFs, the dense gas mass fraction, depends on the total mass of the regions as measured by ATLASGAL: more massive clouds contain greater relative amounts of dense gas across all evolutionary classes. Appendices are available in electronic form at http://www.aanda.org
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gigase, Yves
2007-07-01
The uncertainty on characteristics of radioactive LILW waste packages is difficult to determine and often very large. This results from a lack of knowledge of the constitution of the waste package and of the composition of the radioactive sources inside. To calculate a quantitative estimate of the uncertainty on a characteristic of a waste package, one has to combine these various uncertainties. This paper discusses an approach to this problem, based on the use of the log-normal distribution, which is both elegant and easy to use. It can provide, for example, quantitative estimates of uncertainty intervals that 'make sense'. The purpose is to develop a pragmatic approach that can be integrated into existing characterization methods. In this paper we show how our method can be applied to the scaling factor method. We also explain how it can be used when estimating other, more complex characteristics, such as the total uncertainty of a collection of waste packages. This method could have applications in radioactive waste management, in particular in decision processes where the uncertainty on the amount of activity is considered important, such as probabilistic risk assessment or the definition of criteria for acceptance or categorization. (author)
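One convenience of the log-normal choice, consistent with the approach sketched above, is that a product of independent log-normal factors is again log-normal: the log-means add and the log-variances add. A minimal sketch with assumed (hypothetical) factor uncertainties:

```python
import numpy as np

# A package characteristic modeled as a product of log-normal factors
# (e.g. scaling factor, source inventory, measurement), each given as
# (log-mean, log-standard-deviation); all values are illustrative.
factors = [(np.log(1.0), 0.30), (np.log(2.5), 0.50), (np.log(0.8), 0.20)]
mu = sum(m for m, s in factors)
sigma = np.sqrt(sum(s ** 2 for m, s in factors))

gm = np.exp(mu)                       # geometric mean (median) of the product
interval = (gm / np.exp(1.96 * sigma), gm * np.exp(1.96 * sigma))
print(f"median {gm:.2f}, 95% multiplicative interval {interval}")
```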
Hallifax, D; Houston, J B
2009-03-01
Mechanistic prediction of unbound drug clearance from human hepatic microsomes and hepatocytes correlates with in vivo clearance but is both systematically low (10 - 20 % of in vivo clearance) and highly variable, based on detailed assessments of published studies. Metabolic capacity (Vmax) of commercially available human hepatic microsomes and cryopreserved hepatocytes is log-normally distributed within wide (30 - 150-fold) ranges; Km is also log-normally distributed and effectively independent of Vmax, implying considerable variability in intrinsic clearance. Despite wide overlap, average capacity is 2 - 20-fold (dependent on P450 enzyme) greater in microsomes than hepatocytes, when both are normalised (scaled to whole liver). The in vitro ranges contrast with relatively narrow ranges of clearance among clinical studies. The high in vitro variation probably reflects unresolved phenotypical variability among liver donors and practicalities in processing of human liver into in vitro systems. A significant contribution from the latter is supported by evidence of low reproducibility (several fold) of activity in cryopreserved hepatocytes and microsomes prepared from the same cells, between separate occasions of thawing of cells from the same liver. The large uncertainty which exists in human hepatic in vitro systems appears to dominate the overall uncertainty of in vitro-in vivo extrapolation, including uncertainties within scaling, modelling and drug dependent effects. As such, any notion of quantitative prediction of clearance appears severely challenged.
Evaluation of portfolio credit risk based on survival analysis for progressive censored data
NASA Astrophysics Data System (ADS)
Jaber, Jamil J.; Ismail, Noriszura; Ramli, Siti Norafidah Mohd
2017-04-01
In credit risk management, the Basel committee provides a choice of three approaches for financial institutions to calculate the required capital: the standardized approach, the Internal Ratings-Based (IRB) approach, and the Advanced IRB approach. The IRB approach is usually preferred over the standardized approach due to its higher accuracy and lower capital charges. This paper uses several parametric models (exponential, log-normal, gamma, Weibull, log-logistic, Gompertz) to evaluate the credit risk of the corporate portfolio in Jordanian banks, based on a monthly sample collected from January 2010 to December 2015. The best model is selected using several goodness-of-fit criteria (MSE, AIC, BIC). The results indicate that the Gompertz distribution is the best parametric model for the data.
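The model-selection step can be sketched as follows. This is a hedged illustration on simulated default times, not the paper's data; it uses scipy's names for the candidate distributions (fisk is scipy's log-logistic) and ranks the maximum-likelihood fits by AIC.

```python
# Sketch: fit several parametric lifetime models and rank them by AIC.
# The simulated default-time data are an assumption for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
t = stats.gompertz.rvs(c=0.5, scale=20.0, size=400, random_state=rng)

candidates = {
    "exponential":  stats.expon,
    "log-normal":   stats.lognorm,
    "gamma":        stats.gamma,
    "weibull":      stats.weibull_min,
    "log-logistic": stats.fisk,      # scipy's name for the log-logistic
    "gompertz":     stats.gompertz,
}
for name, dist in candidates.items():
    params = dist.fit(t, floc=0)                 # fix the location at 0
    ll = dist.logpdf(t, *params).sum()           # maximized log-likelihood
    k = len(params) - 1                          # loc was fixed, not estimated
    print(f"{name:12s} AIC = {2 * k - 2 * ll:8.1f}")
```

With data actually drawn from a Gompertz model, the Gompertz fit should yield the smallest AIC, mirroring the paper's selection logic.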
A New Bond Albedo for Performing Orbital Debris Brightness to Size Transformations
NASA Technical Reports Server (NTRS)
Mulrooney, Mark K.; Matney, Mark J.
2008-01-01
We have developed a technique for estimating the intrinsic size distribution of orbital debris objects via optical measurements alone. The process is predicated on the empirically observed power-law size distribution of debris (as indicated by radar RCS measurements) and the log-normal probability distribution of optical albedos as ascertained from phase (Lambertian) and range-corrected telescopic brightness measurements. Since the observed distribution of optical brightness is the product integral of the size distribution of the parent [debris] population with the albedo probability distribution, it is a straightforward matter to transform a given distribution of optical brightness back to a size distribution by the appropriate choice of a single albedo value. This is true because the integration of a power-law with a log-normal distribution (Fredholm Integral of the First Kind) yields a Gaussian-blurred power-law distribution with identical power-law exponent. Application of a single albedo to this distribution recovers a simple power-law [in size] which is linearly offset from the original distribution by a constant whose value depends on the choice of the albedo. Significantly, there exists a unique Bond albedo which, when applied to an observed brightness distribution, yields zero offset and therefore recovers the original size distribution. For physically realistic power-laws of negative slope, the proper choice of albedo recovers the parent size distribution by compensating for the observational bias caused by the large number of small objects that appear anomalously large (bright) - and thereby skew the small population upward by rising above the detection threshold - and the lower number of large objects that appear anomalously small (dim). Based on this comprehensive analysis, a value of 0.13 should be applied to all orbital debris albedo-based brightness-to-size transformations regardless of data source. Its prima facie genesis, derived and constructed from the current RCS-to-size conversion methodology (SiBAM, the Size-Based Estimation Model) and optical data reduction standards, assures consistency in application with the prior canonical value of 0.1. Herein we present the empirical and mathematical arguments for this approach and by example apply it to a comprehensive set of photometric data acquired via NASA's Liquid Mirror Telescopes during the 2000-2001 observing season.
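The central mathematical claim, that blurring a power law with a log-normal preserves the power-law exponent, is easy to verify numerically. The Monte Carlo sketch below uses assumed parameter values (not NASA's) to draw sizes from a power law, blur them with log-normal albedos, and recover the same cumulative slope.

```python
# Sketch: log-normal albedo blur leaves the power-law exponent unchanged.
# Exponent, albedo spread, and fit range are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
q = 2.5                                          # cumulative exponent: N(>d) ~ d^-q
d_true = rng.pareto(q, size=200_000) + 1.0       # power-law sizes, d >= 1
albedo = rng.lognormal(np.log(0.13), 0.5, size=d_true.size)
a0 = 0.13                                        # albedo assumed in the inversion
d_inferred = d_true * np.sqrt(albedo / a0)       # size recovered from brightness

def cum_slope(d):
    d = np.sort(d)
    n = np.arange(d.size, 0, -1)                 # N(> d)
    mask = (d > 2) & (d < 20)                    # fit away from the blurred ends
    return np.polyfit(np.log(d[mask]), np.log(n[mask]), 1)[0]

print(cum_slope(d_true), cum_slope(d_inferred))  # both ~ -2.5
```

Only the normalization of the inferred distribution shifts with the albedo choice; picking a0 to cancel that shift is the unique-albedo argument of the abstract.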
TESTING THE PROPAGATING FLUCTUATIONS MODEL WITH A LONG, GLOBAL ACCRETION DISK SIMULATION
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogg, J Drew; Reynolds, Christopher S.
2016-07-20
The broadband variability of many accreting systems displays characteristic structures: log-normal flux distributions, root-mean-square (rms)-flux relations, and long inter-band lags. These characteristics are usually interpreted as inward propagating fluctuations of the mass accretion rate in an accretion disk driven by stochasticity of the angular momentum transport mechanism. We present the first analysis of propagating fluctuations in a long-duration, high-resolution, global three-dimensional magnetohydrodynamic (MHD) simulation of a geometrically thin (h/r ≈ 0.1) accretion disk around a black hole. While the dynamical-timescale turbulent fluctuations in the Maxwell stresses are too rapid to drive radially coherent fluctuations in the accretion rate, we find that the low-frequency quasi-periodic dynamo action introduces low-frequency fluctuations in the Maxwell stresses, which then drive the propagating fluctuations. Examining both the mass accretion rate and emission proxies, we recover log-normality, linear rms-flux relations, and radial coherence that would produce inter-band lags. Hence, we successfully relate and connect the phenomenology of propagating fluctuations to modern MHD accretion disk theory.
A Poisson Log-Normal Model for Constructing Gene Covariation Network Using RNA-seq Data.
Choi, Yoonha; Coram, Marc; Peng, Jie; Tang, Hua
2017-07-01
Constructing expression networks using transcriptomic data is an effective approach for studying gene regulation. A popular approach for constructing such a network is based on the Gaussian graphical model (GGM), in which an edge between a pair of genes indicates that the expression levels of these two genes are conditionally dependent, given the expression levels of all other genes. However, GGMs are not appropriate for non-Gaussian data, such as those generated in RNA-seq experiments. We propose a novel statistical framework that maximizes a penalized likelihood, in which the observed count data follow a Poisson log-normal distribution. To overcome the computational challenges, we use Laplace's method to approximate the likelihood and its gradients, and apply the alternating directions method of multipliers to find the penalized maximum likelihood estimates. The proposed method is evaluated and compared with GGMs using both simulated and real RNA-seq data. The proposed method shows improved performance in detecting edges that represent covarying pairs of genes, particularly for edges connecting low-abundant genes and edges around regulatory hubs.
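A minimal sketch of the Poisson log-normal data model at the heart of this framework may help (the penalized-likelihood network estimation itself is more involved): counts are Poisson draws whose log-rates are jointly Gaussian, so gene-gene dependence lives in the latent covariance while the observed counts remain integer-valued and overdispersed. All parameter values below are illustrative assumptions.

```python
# Sketch: sampling from a Poisson log-normal model for RNA-seq-like counts.
# Latent log-means and covariance are assumed values for illustration.
import numpy as np

rng = np.random.default_rng(3)
mu = np.array([2.0, 1.0, 3.0])                   # latent log-mean per gene
Sigma = np.array([[0.5, 0.3, 0.0],               # genes 1 and 2 covary;
                  [0.3, 0.5, 0.0],               # gene 3 is independent
                  [0.0, 0.0, 0.5]])
z = rng.multivariate_normal(mu, Sigma, size=1000)
counts = rng.poisson(np.exp(z))                  # observed counts
print(counts.mean(axis=0))                       # sample means
print(counts.var(axis=0))                        # variances >> means: overdispersion
```

A Gaussian graphical model fit directly to such counts conflates the Poisson sampling noise with the latent covariance; modeling the two layers separately is precisely the motivation given above.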
Introducing high performance distributed logging service for ACS
NASA Astrophysics Data System (ADS)
Avarias, Jorge A.; López, Joao S.; Maureira, Cristián; Sommer, Heiko; Chiozzi, Gianluca
2010-07-01
The ALMA Common Software (ACS) is a software framework that provides the infrastructure for the Atacama Large Millimeter Array and other projects. ACS, based on CORBA, offers basic services and common design patterns for distributed software. Every properly built system needs to be able to log status and error information. Logging in a single-computer scenario can be as easy as using fprintf statements. However, a distributed system must provide a way to centralize all logging data in a single place without overloading the network or complicating the applications. ACS provides a complete logging service infrastructure in which every log has an associated priority and timestamp, allowing filtering at different levels of the system (application, service and clients). Currently the ACS logging service uses an implementation of the CORBA Telecom Log Service in a customized way, using only a minimal subset of the features provided by the standard. The most relevant feature used by ACS is the ability to treat the logs as event data that gets distributed over the network in a publisher-subscriber paradigm. For this purpose the CORBA Notification Service, which is resource intensive, is used. On the other hand, the Data Distribution Service (DDS) provides an alternative standard for publisher-subscriber communication for real-time systems, offering better performance and featuring decentralized message processing. This document describes how the new high-performance logging service of ACS has been modeled and developed using DDS, replacing the Telecom Log Service. Benefits and drawbacks are analyzed, and a benchmark comparing the two implementations is presented.
Historical floods in flood frequency analysis: Is this game worth the candle?
NASA Astrophysics Data System (ADS)
Strupczewski, Witold G.; Kochanek, Krzysztof; Bogdanowicz, Ewa
2017-11-01
In flood frequency analysis (FFA), the profit from including historical information on the largest pre-instrumental floods depends primarily on the reliability of the information, i.e., the accuracy of the magnitude and return period of the floods. This study is focused on the possible theoretical maximum gain in accuracy of estimates of upper quantiles that can be obtained by incorporating the largest historical floods of known return periods into the FFA. We assumed a simple case: N years of systematic records of annual maximum flows and either the one largest (XM1) or two largest (XM1 and XM2) flood peak flows in a historical M-year-long period. The problem is explored by Monte Carlo simulations with the maximum likelihood (ML) method. Both correct and false distributional assumptions are considered. In the first case the two-parameter extreme value models (Gumbel, log-Gumbel, Weibull) with various coefficients of variation serve as parent distributions. In the case of an unknown parent distribution, the Weibull distribution was assumed as the estimating model and the truncated Gumbel as the parent distribution. The return periods of XM1 and XM2 are determined from the parent distribution. The results are then compared with the case when the return periods of XM1 and XM2 are defined by their plotting positions. The results are presented in terms of bias, root mean square error and the probability of overestimation of the quantile with a 100-year return period. The results indicate that the maximum profit of including pre-instrumental floods in the FFA may prove smaller than the cost of reconstructing the historical hydrological information.
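The likelihood construction such simulations rely on can be sketched directly. Treating the largest historical flood XM1 as the maximum of an M-year period adds the term log f(XM1) + (M-1) log F(XM1) to the log-likelihood of the N-year systematic record. The fragment below is a sketch with assumed Gumbel parameters, not the paper's experiment.

```python
# Sketch: ML fit of a Gumbel model combining a systematic record with one
# historical maximum. All parameters and sample sizes are assumed values.
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(4)
loc, scale, N, M = 100.0, 30.0, 50, 200
x = stats.gumbel_r.rvs(loc, scale, size=N, random_state=rng)      # systematic record
xm1 = stats.gumbel_r.rvs(loc, scale, size=M, random_state=rng).max()  # historical max

def neg_log_lik(p):
    u, s = p[0], np.exp(p[1])                          # keep the scale positive
    ll = stats.gumbel_r.logpdf(x, u, s).sum()          # N systematic years
    ll += stats.gumbel_r.logpdf(xm1, u, s)             # the historical maximum itself
    ll += (M - 1) * stats.gumbel_r.logcdf(xm1, u, s)   # M-1 years known to lie below it
    return -ll

res = optimize.minimize(neg_log_lik, x0=[np.median(x), np.log(x.std())],
                        method="Nelder-Mead")
u_hat, s_hat = res.x[0], np.exp(res.x[1])
print(u_hat, s_hat, stats.gumbel_r.ppf(0.99, u_hat, s_hat))  # 100-year quantile
```

Re-running the fit with and without the two historical terms shows how much (or how little) the upper-quantile estimate tightens, which is exactly the trade-off the abstract questions.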
Pulse height response of an optical particle counter to monodisperse aerosols
NASA Technical Reports Server (NTRS)
Wilmoth, R. G.; Grice, S. S.; Cuda, V.
1976-01-01
The pulse height response of a right angle scattering optical particle counter has been investigated using monodisperse aerosols of polystyrene latex spheres, di-octyl phthalate and methylene blue. The results confirm previous measurements for the variation of mean pulse height as a function of particle diameter and show good agreement with the relative response predicted by Mie scattering theory. Measured cumulative pulse height distributions were found to fit reasonably well to a log-normal distribution with a minimum geometric standard deviation of about 1.4 for particle diameters greater than about 2 micrometers. The geometric standard deviation was found to increase significantly with decreasing particle diameter.
NASA Astrophysics Data System (ADS)
Viswanathan, G. M.; Buldyrev, S. V.; Garger, E. K.; Kashpur, V. A.; Lucena, L. S.; Shlyakhter, A.; Stanley, H. E.; Tschiersch, J.
2000-09-01
We analyze nonstationary 137Cs atmospheric activity concentration fluctuations measured near Chernobyl after the 1986 disaster and find three new results: (i) the histogram of fluctuations is well described by a log-normal distribution; (ii) there is a pronounced spectral component with period T=1yr, and (iii) the fluctuations are long-range correlated. These findings allow us to quantify two fundamental statistical properties of the data: the probability distribution and the correlation properties of the time series. We interpret our findings as evidence that the atmospheric radionuclide resuspension processes are tightly coupled to the surrounding ecosystems and to large time scale weather patterns.
Complexity of viscous dissipation in turbulent thermal convection
NASA Astrophysics Data System (ADS)
Bhattacharya, Shashwat; Pandey, Ambrish; Kumar, Abhishek; Verma, Mahendra K.
2018-03-01
Using direct numerical simulations of turbulent thermal convection for Rayleigh numbers between 10^6 and 10^8 and unit Prandtl number, we derive scaling relations for viscous dissipation in the bulk and in the boundary layers. We show that, contrary to the general belief, the total viscous dissipation in the bulk is larger, albeit marginally, than that in the boundary layers. The bulk dissipation rate is similar to that in hydrodynamic turbulence with a log-normal distribution, but it differs from U^3/d by a factor of Ra^-0.18. Viscous dissipation in the boundary layers is rarer but more intense, with a stretched-exponential distribution.
ERIC Educational Resources Information Center
DeMars, Christine E.
2012-01-01
In structural equation modeling software, either limited-information (bivariate proportions) or full-information item parameter estimation routines could be used for the 2-parameter item response theory (IRT) model. Limited-information methods assume the continuous variable underlying an item response is normally distributed. For skewed and…
MCMC Sampling for a Multilevel Model with Nonindependent Residuals within and between Cluster Units
ERIC Educational Resources Information Center
Browne, William; Goldstein, Harvey
2010-01-01
In this article, we discuss the effect of removing the independence assumptions between the residuals in two-level random effect models. We first consider removing the independence between the Level 2 residuals and instead assume that the vector of all residuals at the cluster level follows a general multivariate normal distribution. We…
Optimal and Most Exact Confidence Intervals for Person Parameters in Item Response Theory Models
ERIC Educational Resources Information Center
Doebler, Anna; Doebler, Philipp; Holling, Heinz
2013-01-01
The common way to calculate confidence intervals for item response theory models is to assume that the standardized maximum likelihood estimator for the person parameter [theta] is normally distributed. However, this approximation is often inadequate for short and medium test lengths. As a result, the coverage probabilities fall below the given…
Accommodating Binary and Count Variables in Mediation: A Case for Conditional Indirect Effects
ERIC Educational Resources Information Center
Geldhof, G. John; Anthony, Katherine P.; Selig, James P.; Mendez-Luck, Carolyn A.
2018-01-01
The existence of several accessible sources has led to a proliferation of mediation models in the applied research literature. Most of these sources assume endogenous variables (e.g., M, and Y) have normally distributed residuals, precluding models of binary and/or count data. Although a growing body of literature has expanded mediation models to…
Ordinary Least Squares Estimation of Parameters in Exploratory Factor Analysis with Ordinal Data
ERIC Educational Resources Information Center
Lee, Chun-Ting; Zhang, Guangjian; Edwards, Michael C.
2012-01-01
Exploratory factor analysis (EFA) is often conducted with ordinal data (e.g., items with 5-point responses) in the social and behavioral sciences. These ordinal variables are often treated as if they were continuous in practice. An alternative strategy is to assume that a normally distributed continuous variable underlies each ordinal variable.…
A common mode of origin of power laws in models of market and earthquake
NASA Astrophysics Data System (ADS)
Bhattacharyya, Pratip; Chatterjee, Arnab; Chakrabarti, Bikas K.
2007-07-01
We show that there is a common mode of origin for the power laws observed in two different models: (i) the Pareto law for the distribution of money among the agents with random-saving propensities in an ideal gas-like market model and (ii) the Gutenberg-Richter law for the distribution of overlaps in a fractal-overlap model for earthquakes. We find that the power laws appear as the asymptotic forms of ever-widening log-normal distributions for the agents’ money and the overlap magnitude, respectively. The identification of the generic origin of the power laws helps in better understanding and in developing generalized views of phenomena in such diverse areas as economics and geophysics.
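The mechanism is easy to check numerically: for a log-normal density the local log-log slope is d ln f / d ln x = -1 - (ln x - mu)/sigma^2, so as sigma grows the slope stays near -1 over many decades and the distribution masquerades as a power law f(x) ~ x^(-1). A short sketch with illustrative parameters:

```python
# Sketch: local log-log slope of a log-normal pdf flattens as sigma grows.
# mu, sigma values, and the decade range are illustrative assumptions.
import numpy as np

mu = 0.0
x = np.logspace(0, 6, 7)                        # probe six decades
for sigma in (1.0, 3.0, 10.0):
    slope = -1.0 - (np.log(x) - mu) / sigma**2  # d ln f / d ln x
    print(f"sigma = {sigma:5.1f}  slopes:", np.round(slope, 2))
# For sigma = 10 the slope stays within ~0.14 of -1 across all six decades,
# i.e., the widening log-normal is empirically indistinguishable from a power law.
```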
Chaos-assisted tunneling in the presence of Anderson localization.
Doggen, Elmer V H; Georgeot, Bertrand; Lemarié, Gabriel
2017-10-01
Tunneling between two classically disconnected regular regions can be strongly affected by the presence of a chaotic sea in between. This phenomenon, known as chaos-assisted tunneling, gives rise to large fluctuations of the tunneling rate. Here we study chaos-assisted tunneling in the presence of Anderson localization effects in the chaotic sea. Our results show that the standard tunneling rate distribution is strongly modified by localization, going from the Cauchy distribution in the ergodic regime to a log-normal distribution in the strongly localized case, for both a deterministic and a disordered model. We develop a single-parameter scaling description which accurately describes the numerical data. Several possible experimental implementations using cold atoms, photonic lattices, or microwave billiards are discussed.
Universal statistics of selected values
NASA Astrophysics Data System (ADS)
Smerlak, Matteo; Youssef, Ahmed
2017-03-01
Selection, the tendency of some traits to become more frequent than others under the influence of some (natural or artificial) agency, is a key component of Darwinian evolution and countless other natural and social phenomena. Yet a general theory of selection, analogous to the Fisher-Tippett-Gnedenko theory of extreme events, is lacking. Here we introduce a probabilistic definition of selection and show that selected values are attracted to a universal family of limiting distributions which generalize the log-normal distribution. The universality classes and scaling exponents are determined by the tail thickness of the random variable under selection. Our results provide a possible explanation for skewed distributions observed in diverse contexts where selection plays a key role, from molecular biology to agriculture and sport.
NASA Technical Reports Server (NTRS)
Pitts, D. E.; Badhwar, G.
1980-01-01
The development of agricultural remote sensing systems requires knowledge of agricultural field size distributions so that the sensors, sampling frames, image interpretation schemes, registration systems, and classification systems can be properly designed. Malila et al. (1976) studied the field size distribution for wheat and all other crops in two Kansas LACIE (Large Area Crop Inventory Experiment) intensive test sites using ground observations of the crops and measurements of their field areas based on current year rectified aerial photomaps. The field area and size distributions reported in the present investigation are derived from a representative subset of a stratified random sample of LACIE sample segments. In contrast to previous work, the obtained results indicate that most field-size distributions are not log-normally distributed. The most common field size observed in this study was 10 acres for most crops studied.
Log-Log Convexity of Type-Token Growth in Zipf's Systems
NASA Astrophysics Data System (ADS)
Font-Clos, Francesc; Corral, Álvaro
2015-06-01
It is traditionally assumed that Zipf's law implies the power-law growth of the number of different elements with the total number of elements in a system—the so-called Heaps' law. We show that a careful definition of Zipf's law leads to the violation of Heaps' law in random systems, with growth curves that have a convex shape in log-log scale. These curves fulfill universal data collapse that only depends on the value of Zipf's exponent. We observe that real books behave very much in the same way as random systems, despite the presence of burstiness in word occurrence. We advance an explanation for this unexpected correspondence.
Modeling gene expression measurement error: a quasi-likelihood approach
Strimmer, Korbinian
2003-01-01
Background Using suitable error models for gene expression measurements is essential in the statistical analysis of microarray data. However, the true probabilistic model underlying gene expression intensity readings is generally not known. Instead, in currently used approaches some simple parametric model is assumed (usually a transformed normal distribution) or the empirical distribution is estimated. However, both these strategies may not be optimal for gene expression data, as the non-parametric approach ignores known structural information whereas the fully parametric models run the risk of misspecification. A further related problem is the choice of a suitable scale for the model (e.g. observed vs. log-scale). Results Here a simple semi-parametric model for gene expression measurement error is presented. In this approach, inference is based on an approximate likelihood function (the extended quasi-likelihood). Only partial knowledge about the unknown true distribution is required to construct this function. In case of gene expression this information is available in the form of the postulated (e.g. quadratic) variance structure of the data. As the quasi-likelihood behaves (almost) like a proper likelihood, it allows for the estimation of calibration and variance parameters, and it is also straightforward to obtain corresponding approximate confidence intervals. Unlike most other frameworks, it also allows analysis on any preferred scale, i.e. both on the original linear scale as well as on a transformed scale. It can also be employed in regression approaches to model systematic (e.g. array or dye) effects. Conclusions The quasi-likelihood framework provides a simple and versatile approach to analyze gene expression data that does not make any strong distributional assumptions about the underlying error model. For several simulated as well as real data sets it provides a better fit to the data than competing models. In an example it also improved the power of tests to identify differential expression.
Yadav, Ishwar Chandra; Devi, Ningombam Linthoingambi; Li, Jun; Zhang, Gan
2017-10-01
Despite the ban on polychlorinated biphenyls (PCBs) a decade ago, significant amounts of PCBs are still released from primary sources in cities, and PCBs remain ubiquitous environmental contaminants worldwide. In this study, the concentrations of PCBs in soil, the air-soil exchange of PCBs, and the soil-air partitioning coefficient (KSA) of PCBs were investigated in four major urban areas in Nepal. Overall, the concentrations of ∑30 PCBs ranged from 10 to 59.4 ng/g dry weight (dw; mean 12.2 ± 11.2 ng/g dw). The hexa-CBs (22-31%) were most dominant among the PCB homologues, followed by tetra-CBs (20-29%), hepta-CBs (12-21%), penta-CBs (15-17%) and tri-CBs (9-19%). The elevated levels of PCBs in Nepalese soil were attributed to emissions from transformer oil, lubricants, breaker oil, cutting oil and paints, and cable insulation. The slightly stronger correlation of PCBs with TOC than with BC indicated that amorphous organic matter (AOM) plays a more critical role than BC in the retention of PCBs in Nepalese soil. The fugacity fraction (ff) results indicated that the soil acts as a source of PCBs to the air through volatilization, with net transport from soil to air. The soil-air partitioning study suggests that absorption by soil organic matter controls the soil-air partitioning of PCBs. The slightly weak but positive correlations of measured log KSA with log KOA (R2 = 0.483) and log KBC-A (R2 = 0.438) suggest that both log KOA and log KBC-A can predict soil-air partitioning of PCBs only to a lesser extent.
NASA Technical Reports Server (NTRS)
Holland, Frederic A., Jr.
2004-01-01
Modern engineering design practices are tending more toward the treatment of design parameters as random variables as opposed to fixed, or deterministic, values. The probabilistic design approach attempts to account for the uncertainty in design parameters by representing them as a distribution of values rather than as a single value. The motivations for this effort include preventing excessive overdesign as well as assessing and assuring reliability, both of which are important for aerospace applications. However, the determination of the probability distribution is a fundamental problem in reliability analysis. A random variable is often defined by the parameters of the theoretical distribution function that gives the best fit to experimental data. In many cases the distribution must be assumed from very limited information or data. Often the types of information that are available or reasonably estimated are the minimum, maximum, and most likely values of the design parameter. For these situations the beta distribution model is very convenient because the parameters that define the distribution can be easily determined from these three pieces of information. Widely used in the field of operations research, the beta model is very flexible and is also useful for estimating the mean and standard deviation of a random variable given only the aforementioned three values. However, an assumption is required to determine the four parameters of the beta distribution from only these three pieces of information (some of the more common distributions, like the normal, lognormal, gamma, and Weibull distributions, have two or three parameters). The conventional method assumes that the standard deviation is a certain fraction of the range. The beta parameters are then determined by solving a set of equations simultaneously. A new method developed in-house at the NASA Glenn Research Center assumes a value for one of the beta shape parameters based on an analogy with the normal distribution (ref.1). This new approach allows for a very simple and direct algebraic solution without restricting the standard deviation. The beta parameters obtained by the new method are comparable to the conventional method (and identical when the distribution is symmetrical). However, the proposed method generally produces a less peaked distribution with a slightly larger standard deviation (up to 7 percent) than the conventional method in cases where the distribution is asymmetric or skewed. The beta distribution model has now been implemented into the Fast Probability Integration (FPI) module used in the NESSUS computer code for probabilistic analyses of structures (ref. 2).
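The flavor of the algebra, three inputs plus one extra assumption determining the four beta parameters, can be illustrated with the widely used PERT convention, which effectively fixes the sum of the shape parameters. This is shown as a stand-in for the idea, not as either the conventional or the NASA in-house method described above; all numeric inputs are assumptions.

```python
# Sketch: beta shape parameters from (min, most-likely, max) via the PERT
# convention (lam = 4, so alpha + beta = 6). Inputs are assumed values.
import numpy as np
from scipy import stats

def pert_beta(a, m, b, lam=4.0):
    """Beta(alpha, beta) on [a, b] with mode m; lam is the PERT weight."""
    alpha = 1.0 + lam * (m - a) / (b - a)
    beta = 1.0 + lam * (b - m) / (b - a)
    return alpha, beta

a, m, b = 10.0, 12.0, 20.0                       # min / most likely / max
alpha, beta = pert_beta(a, m, b)
dist = stats.beta(alpha, beta, loc=a, scale=b - a)
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}, "
      f"mean = {dist.mean():.2f}, sd = {dist.std():.2f}")
```

One can verify that the mode of the resulting beta equals m; swapping in a different assumption for the shape-parameter constraint changes the peakedness and standard deviation, which is exactly the trade-off the abstract discusses.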
Predicting the probability of slip in gait: methodology and distribution study.
Gragg, Jared; Yang, James
2016-01-01
The likelihood of a slip is related to the available and required friction for a certain activity, here gait. Classical slip and fall analysis presumed that a walking surface was safe if the difference between the mean available and required friction coefficients exceeded a certain threshold. Previous research was dedicated to reformulating the classical slip and fall theory to include the stochastic variation of the available and required friction when predicting the probability of slip in gait. However, when predicting the probability of a slip, previous researchers have either ignored the variation in the required friction or assumed the available and required friction to be normally distributed. Also, there are no published results that actually give the probability of slip for various combinations of required and available frictions. This study proposes a modification to the equation for predicting the probability of slip, reducing the previous equation from a double-integral to a more convenient single-integral form. Also, a simple numerical integration technique is provided to predict the probability of slip in gait: the trapezoidal method. The effect of the random variable distributions on the probability of slip is also studied. It is shown that both the required and available friction distributions cannot automatically be assumed as being normally distributed. The proposed methods allow for any combination of distributions for the available and required friction, and numerical results are compared to analytical solutions for an error analysis. The trapezoidal method is shown to be highly accurate and efficient. The probability of slip is also shown to be sensitive to the input distributions of the required and available friction. Lastly, a critical value for the probability of slip is proposed based on the number of steps taken by an average person in a single day.
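The single-integral form and its trapezoidal evaluation can be sketched directly: P(slip) = P(available < required) = ∫ f_required(u) · F_available(u) du. The friction distributions below are illustrative assumptions, not the paper's gait data, and deliberately mix a log-normal with a normal to show that no normality assumption is needed.

```python
# Sketch: probability of slip as a single integral, evaluated by the
# trapezoidal rule. Friction distributions are assumed for illustration.
import numpy as np
from scipy import stats

available = stats.lognorm(s=0.15, scale=0.50)    # available friction coefficient
required = stats.norm(loc=0.35, scale=0.05)      # required friction coefficient

u = np.linspace(0.0, 1.5, 2001)                  # integration grid
integrand = required.pdf(u) * available.cdf(u)   # f_req(u) * F_avail(u)
p_slip = float(np.sum((integrand[1:] + integrand[:-1]) * np.diff(u)) / 2.0)
print(f"P(slip) = {p_slip:.4f}")

# Monte Carlo cross-check of the same probability
rng = np.random.default_rng(5)
mc = (available.rvs(200_000, random_state=rng)
      < required.rvs(200_000, random_state=rng)).mean()
print(f"Monte Carlo: {mc:.4f}")
```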
Relationships between log N-log S and celestial distribution of gamma-ray bursts
NASA Technical Reports Server (NTRS)
Nishimura, J.; Yamagami, T.
1985-01-01
The apparent conflict between log N-log S curve and isotropic celestial distribution of the gamma ray bursts is discussed. A possible selection effect due to the time profile of each burst is examined. It is shown that the contradiction is due to this selection effect of the gamma ray bursts.
Multiwavelength Studies of Rotating Radio Transients
NASA Astrophysics Data System (ADS)
Miller, Joshua J.
Seven years ago, a new class of pulsars called the Rotating Radio Transients (RRATs) was discovered with the Parkes radio telescope in Australia (McLaughlin et al., 2006). These neutron stars are characterized by strong radio bursts at repeatable dispersion measures, but are not detectable using standard periodicity-search algorithms. We now know of roughly 100 of these objects, discovered in new surveys and re-analysis of archival survey data. They generally have longer periods than those of the normal pulsar population, and several have high magnetic fields, similar to those of other neutron star populations like the X-ray-bright magnetars. However, some of the RRATs have spin-down properties very similar to those of normal pulsars, making it difficult to determine the cause of their unusual emission and possible evolutionary relationships between them and other classes of neutron stars. We have calculated single-pulse flux densities for eight RRAT sources observed using the Parkes radio telescope. Like normal pulsars, the pulse amplitude distributions are well described by log-normal probability distribution functions, though two show evidence for an additional power-law tail. Spectral indices are calculated for the seven RRATs which were detected at multiple frequencies. These RRATs have a mean spectral index of
Economic values under inappropriate normal distribution assumptions.
Sadeghi-Sefidmazgi, A; Nejati-Javaremi, A; Moradi-Shahrbabak, M; Miraei-Ashtiani, S R; Amer, P R
2012-08-01
The objectives of this study were to quantify the errors in economic values (EVs) for traits affected by cost or price thresholds when skewed or kurtotic distributions of varying degree are assumed to be normal, and when data with a normal distribution are subject to censoring. EVs were estimated for a continuous trait with dichotomous economic implications because of a price premium or penalty arising from a threshold ranging between -4 and 4 standard deviations from the mean. To evaluate the impacts of skewness and of positive and negative excess kurtosis, the standard skew-normal, Pearson, and raised-cosine distributions were used, respectively. For the various evaluable levels of skewness and kurtosis, the results showed that EVs can be underestimated or overestimated by more than 100% when price-determining thresholds fall within a range from the mean that might be expected in practice. Estimates of EVs were very sensitive to censoring or missing data. In contrast to practical genetic evaluation, economic evaluation is very sensitive to lack of normality and missing data. Although in some special situations the presence of multiple thresholds may attenuate the combined effect of errors at each threshold point, in practical situations there is a tendency for a few key thresholds to dominate the EV, and there are many situations where errors could be compounded across multiple thresholds. In the development of breeding objectives for non-normal continuous traits influenced by value thresholds, it is necessary to select a transformation that will resolve problems of non-normality or to consider alternative methods that are less sensitive to non-normality.
Dou, Z; Chen, J; Jiang, Z; Song, W L; Xu, J; Wu, Z Y
2017-11-10
Objective: To understand the distribution of population viral load (PVL) data in HIV-infected men who have sex with men (MSM), fit a distribution function and explore appropriate estimating parameters of PVL. Methods: The detection limit of viral load (VL) was ≤50 copies/ml. Box-Cox transformation and normal distribution tests were used to describe the general distribution characteristics of the original and transformed PVL data, then a stable distribution function was fitted with a test of goodness of fit. Results: The original PVL data fitted a skewed distribution with a variation coefficient of 622.24%, and had a multimodal distribution after Box-Cox transformation with an optimal parameter (λ) of -0.11. The distribution of PVL data over the detection limit was skewed and heavy-tailed when transformed by Box-Cox with optimal λ = 0. By fitting the distribution function of the transformed data over the detection limit, it matched the stable distribution (SD) function (α = 1.70, β = -1.00, γ = 0.78, δ = 4.03). Conclusions: The original PVL data included censored values below the detection limit, and the data over the detection limit were abnormally distributed with a large degree of variation. When the proportion of censored data is large, it is inappropriate to replace the censored values with half the detection limit. The log-transformed data over the detection limit fitted the SD. The median (M) and inter-quartile range (IQR) of the log-transformed data can be used to describe the central tendency and dispersion of the data over the detection limit.
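The transformation step can be sketched with scipy's Box-Cox tools; the simulated viral loads and the 50 copies/ml limit below are illustrative stand-ins for the study's PVL data, not its actual measurements.

```python
# Sketch: Box-Cox lambda estimation on above-limit viral loads, plus the
# log-scale median/IQR summary. Simulated data are an assumption.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
vl = rng.lognormal(mean=8.0, sigma=2.0, size=800)   # viral loads, copies/ml
detectable = vl > 50                                # detection limit: 50 copies/ml

transformed, lam = stats.boxcox(vl[detectable])     # ML estimate of lambda
logs = np.log10(vl[detectable])                     # lambda = 0 corresponds to logs
q1, med, q3 = np.percentile(logs, [25, 50, 75])
print(f"optimal lambda = {lam:.2f}")
print(f"log10 PVL: M = {med:.2f}, IQR = {q3 - q1:.2f}")
```

Note that the fit uses only the detectable values; the abstract's warning is precisely that substituting half the detection limit for the censored values distorts such summaries.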
Comparison of Survival Models for Analyzing Prognostic Factors in Gastric Cancer Patients
Habibi, Danial; Rafiei, Mohammad; Chehrei, Ali; Shayan, Zahra; Tafaqodi, Soheil
2018-03-27
Objective: There are a number of models for determining risk factors for survival of patients with gastric cancer. This study was conducted to select the model showing the best fit with the available data. Methods: Cox regression and parametric models (exponential, Weibull, Gompertz, log-normal, log-logistic and generalized gamma) were utilized in unadjusted and adjusted forms to detect factors influencing mortality of patients. Comparisons were made with the Akaike Information Criterion (AIC) using STATA 13 and R 3.1.3 software. Results: The results of this study indicated that all parametric models outperform the Cox regression model. The log-normal, log-logistic and generalized gamma models provided the best performance in terms of AIC values (179.2, 179.4 and 181.1, respectively). On unadjusted analysis, the results of the Cox regression and parametric models indicated stage, grade, largest diameter of metastatic nest, largest diameter of LM, number of involved lymph nodes and the largest ratio of metastatic nests to lymph nodes to be variables influencing the survival of patients with gastric cancer. On adjusted analysis, according to the best model (log-normal), grade was found to be the significant variable. Conclusion: The results suggested that all parametric models outperform the Cox model. The log-normal model provides the best fit and is a good substitute for Cox regression.
Bellanger, Martine; Pichery, Céline; Aerts, Dominique; Berglund, Marika; Castaño, Argelia; Cejchanová, Mája; Crettaz, Pierre; Davidson, Fred; Esteban, Marta; Fischer, Marc E; Gurzau, Anca Elena; Halzlova, Katarina; Katsonouri, Andromachi; Knudsen, Lisbeth E; Kolossa-Gehring, Marike; Koppen, Gudrun; Ligocka, Danuta; Miklavčič, Ana; Reis, M Fátima; Rudnai, Peter; Tratnik, Janja Snoj; Weihe, Pál; Budtz-Jørgensen, Esben; Grandjean, Philippe
2013-01-07
Due to global mercury pollution and the adverse health effects of prenatal exposure to methylmercury (MeHg), an assessment of the economic benefits of prevented developmental neurotoxicity is necessary for any cost-benefit analysis. Distributions of hair-Hg concentrations among women of reproductive age were obtained from the DEMOCOPHES project (1,875 subjects in 17 countries) and literature data (6,820 subjects from 8 countries). The exposures were assumed to follow log-normal distributions. Neurotoxicity effects were estimated from a linear dose-response function with a slope of 0.465 Intelligence Quotient (IQ) point reduction per μg/g increase in the maternal hair-Hg concentration during pregnancy, assuming no deficits below a hair-Hg limit of 0.58 μg/g thought to be safe. A logarithmic IQ response was used in sensitivity analyses. The estimated benefit of an IQ point was based on lifetime income, adjusted for purchasing power parity. The hair-mercury concentrations were highest in Southern Europe and lowest in Eastern Europe. The results suggest that, within the EU, more than 1.8 million children are born every year with MeHg exposures above the limit of 0.58 μg/g, and about 200,000 births exceed a higher limit of 2.5 μg/g proposed by the World Health Organization (WHO). The total annual benefits of exposure prevention within the EU were estimated at more than 600,000 IQ points per year, corresponding to a total economic benefit between €8,000 million and €9,000 million per year. About four-fold higher values were obtained when using the logarithmic response function, while adjustment for productivity resulted in slightly lower total benefits. These calculations do not include the less tangible advantages of protecting brain development against neurotoxicity or any other adverse effects. These estimates document that efforts to combat mercury pollution and to reduce MeHg exposures will have very substantial economic benefits in Europe, mainly in southern countries. Some data may not be entirely representative, some countries were not covered, and mercury pollution is anticipated to change; all of this suggests a need for extended biomonitoring of human MeHg exposure.
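The exposure-to-benefit arithmetic can be sketched as follows. Every population number here (geometric mean, geometric standard deviation, birth count) is an assumption for illustration; only the 0.465 IQ-point slope and the 0.58 and 2.5 μg/g limits come from the abstract.

```python
# Sketch: log-normal exposure distribution -> exceedance shares and IQ benefit.
# GM, GSD, and the birth count are assumed, not the study's country data.
import numpy as np
from scipy import stats

gm, gsd = 0.8, 2.5                      # hair-Hg geometric mean / GSD (assumed)
births = 5_000_000                      # annual births in the population (assumed)
exposure = stats.lognorm(s=np.log(gsd), scale=gm)

print("share above 0.58 ug/g:", exposure.sf(0.58))
print("share above 2.50 ug/g:", exposure.sf(2.50))

# expected IQ loss per child: 0.465 * E[max(hair-Hg - 0.58, 0)]
rng = np.random.default_rng(7)
hg = exposure.rvs(1_000_000, random_state=rng)
iq_loss = 0.465 * np.clip(hg - 0.58, 0.0, None).mean()
print(f"IQ points lost per year: {iq_loss * births:,.0f}")
```

Summing such per-population estimates, weighted by lifetime income per IQ point, is the structure of the benefit calculation described above.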
Using nonlinear quantile regression to estimate the self-thinning boundary curve
Quang V. Cao; Thomas J. Dean
2015-01-01
The relationship between tree size (quadratic mean diameter) and tree density (number of trees per unit area) has been a topic of research and discussion for many decades. Starting with Reineke in 1933, the maximum size-density relationship, on a log-log scale, has been assumed to be linear. Several techniques, including linear quantile regression, have been employed...
Geophysical evaluation of sandstone aquifers in the Reconcavo-Tucano Basin, Bahia -- Brazil
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lima, O.A.L. de
1993-11-01
The upper clastic sediments in the Reconcavo-Tucano basin comprise a multilayer aquifer system of Jurassic age. Its groundwater is normally fresh down to depths of more than 1,000 m. Locally, however, there are zones producing high-salinity or sulfur geothermal water. Analysis of electrical logs of more than 150 wells enabled the identification of the most typical sedimentary structures and the gross geometries of the sandstone units in selected areas of the basin. Based on this information, the thick sands are interpreted as coalescent point bars and the shales as flood-plain deposits of a large fluvial environment. The resistivity logs and core laboratory data are combined to develop empirical equations relating aquifer porosity and permeability to log-derived parameters such as formation factor and cementation exponent. Temperature logs of 15 wells were useful to quantify the water leakage through semiconfining shales. The groundwater quality was inferred from spontaneous potential (SP) log deflections under control of chemical analysis of water samples. An empirical chart is developed that relates the SP-derived water resistivity to the true water resistivity within the formations. The patterns of salinity variation with depth inferred from SP logs were helpful in identifying subsurface flows along major fault zones, where extensive mixing of water is taking place. A total of 49 vertical Schlumberger resistivity soundings aid in defining aquifer structures and in extrapolating the log-derived results. Transition zones between fresh and saline waters have also been detected based on a combination of logging and surface sounding data. Ionic filtering by water leakage across regional shales, local convection and mixing along major faults, and hydrodynamic dispersion away from lateral permeability contrasts are the main mechanisms controlling the observed distributions of salinity and temperature within the basin.
Assessing cadmium exposure risks of vegetables with plant uptake factor and soil property.
Yang, Yang; Chang, Andrew C; Wang, Meie; Chen, Weiping; Peng, Chi
2018-07-01
Plant uptake factors (PUFs) are of great importance in assessing human cadmium (Cd) exposure risk, although they are often treated in a generic way. We collected 1077 pairs of vegetable-soil samples from production fields to characterize Cd PUFs and demonstrated their utility in assessing Cd exposure risks to consumers of locally grown vegetables. The Cd PUFs varied with plant species and with the pH and organic matter content of soils. Once the PUFs were normalized against these soil parameters, their distributions were log-normal in nature. In this manner, the PUFs were represented by definable probability distributions instead of a deterministic figure. The Cd exposure risks were then assessed using the normalized PUFs based on a Monte Carlo simulation algorithm. Factors affecting the extent of Cd exposure were isolated through sensitivity analyses. The normalized PUFs could illustrate the outcomes for uncontaminated and slightly contaminated soils. Among the vegetables, lettuce was potentially hazardous for residents due to its high Cd accumulation but low Zn concentration. To protect 95% of the lettuce production from causing excessive Cd exposure risks, the pH of soils needed to be 5.9 or above.
Mesh size selectivity of the gillnet in East China Sea
NASA Astrophysics Data System (ADS)
Li, L. Z.; Tang, J. H.; Xiong, Y.; Huang, H. L.; Wu, L.; Shi, J. J.; Gao, Y. S.; Wu, F. Q.
2017-07-01
A production test using several gillnets with various mesh sizes was carried out to discover the selectivity of gillnets in the East China Sea. The results showed that the composition of the catch species was jointly affected by panel height and mesh size. The 10-m nets caught more bycatch species than the 6-m nets. For the target species, the effect of panel height on juvenile fish was ambiguous, but the number of juvenile fish declined quickly with increasing mesh size. According to the model deviance (D) and Akaike's information criterion, the bi-normal model provided the best fit for small yellow croaker (Larimichthys polyactis), with relative retentions of 0.2 and 1, respectively. For Chelidonichthys spinosus, the log-normal was the best model; the right tilt of the selectivity curve was obvious and coincided well with the original data. The contact population of small yellow croaker showed a bi-normal distribution, with body lengths ranging from 95 to 215 mm. The contact population of C. spinosus showed a normal distribution, with body lengths ranging from 95 to 205 mm. These results can provide references for coastal fishery management.
Collective purchase behavior toward retail price changes
NASA Astrophysics Data System (ADS)
Ueno, Hiromichi; Watanabe, Tsutomu; Takayasu, Hideki; Takayasu, Misako
2011-02-01
By analyzing a huge amount of point-of-sale data collected from Japanese supermarkets, we find power-law relationships between price and sales numbers. The estimated values of the exponents of these power laws depend on the category of products; however, they are independent of the stores, implying the existence of universal human purchase behavior. The fluctuations of sales numbers around these power laws are generally approximated by log-normal distributions, implying that there are hidden random parameters which might proportionally affect purchase activity.
Daily Magnesium Intake and Serum Magnesium Concentration among Japanese People
Akizawa, Yoriko; Koizumi, Sadayuki; Itokawa, Yoshinori; Ojima, Toshiyuki; Nakamura, Yosikazu; Tamura, Tarou; Kusaka, Yukinori
2008-01-01
Background The vitamins and minerals that are deficient in the daily diet of a normal adult remain unknown. To answer this question, we conducted a population survey focusing on the relationship between dietary magnesium intake and serum magnesium level. Methods The subjects were 62 individuals from Fukui Prefecture who participated in the 1998 National Nutrition Survey. The survey investigated the physical status, nutritional status, and dietary data of the subjects. Holidays and special occasions were avoided, and a day when people are most likely to be on an ordinary diet was selected as the survey date. Results The mean (±standard deviation) daily magnesium intake was 322 (±132), 323 (±163), and 322 (±147) mg/day for men, women, and the entire group, respectively. The mean (±standard deviation) serum magnesium concentration was 20.69 (±2.83), 20.69 (±2.88), and 20.69 (±2.83) ppm for men, women, and the entire group, respectively. The distribution of serum magnesium concentration was normal. Dietary magnesium intake showed a log-normal distribution, which was then transformed by logarithmic conversion for examining the regression coefficients. The slope of the regression line between the serum magnesium concentration (Y ppm) and daily magnesium intake (X mg) was determined using the formula Y = 4.93 (log10X) + 8.49. The coefficient of correlation (r) was 0.29. A regression line (Y = 14.65X + 19.31) was observed between the daily intake of magnesium (Y mg) and serum magnesium concentration (X ppm). The coefficient of correlation was 0.28. Conclusion The daily magnesium intake correlated with serum magnesium concentration, and a linear regression model between them was proposed.
Testing the shape of distributions of weather data
NASA Astrophysics Data System (ADS)
Baccon, Ana L. P.; Lunardi, José T.
2016-08-01
The characterization of the statistical distributions of observed weather data is of crucial importance both for the construction and for the validation of weather models, such as weather generators (WGs). An important class of WGs (e.g., the Richardson-type generators) reduces the time series of each variable to a time series of its residual elements, and the residuals are often assumed to be normally distributed. In this work we propose an approach to investigate whether the shape assumed for the distribution of residuals is consistent with the observed data of a given site. Specifically, this procedure tests whether the same distribution shape for the residual noise is maintained over time. The proposed approach is an adaptation to climate time series of a procedure first introduced to test the shapes of distributions of growth rates of business firms aggregated in large panels of short time series. We illustrate the procedure by applying it to the residual time series of maximum temperature at a given location, and investigate the empirical consistency of two assumptions, namely (i) the common assumption that the distribution of the residuals is Gaussian, and (ii) that the residual noise has a time-invariant shape which coincides with the empirical distribution of the residual noise of the whole time series pooled together.
Jimsphere wind and turbulence exceedance statistic
NASA Technical Reports Server (NTRS)
Adelfang, S. I.; Court, A.
1972-01-01
Exceedance statistics of winds and gusts observed over Cape Kennedy with Jimsphere balloon sensors are described. Gust profiles containing positive and negative departures from smoothed profiles, in the wavelength ranges 100-2500, 100-1900, 100-860, and 100-460 meters, were computed from 1578 profiles with four 41-weight digital high-pass filters. Extreme values of the square root of gust speed are normally distributed. Monthly and annual exceedance probability distributions of normalized rms gust speeds in three altitude bands (2-7, 6-11, and 9-14 km) are log-normal. The rms gust speeds are largest in the 100-2500 m wavelength band between 9 and 14 km in late winter and early spring. A study of monthly and annual exceedance probabilities and the number of occurrences per kilometer of level crossings with positive slope indicates significant variability with season, altitude, and filter configuration. A decile sampling scheme is tested and an optimum approach is suggested for drawing a relatively small random sample that represents the characteristic extreme wind speeds and shears of a large parent population of Jimsphere wind profiles.
NASA Technical Reports Server (NTRS)
Goldhirsh, Julius; Gebo, Norman; Rowland, John
1988-01-01
This effort describes cumulative rain-rate distributions for a network of nine tipping-bucket rain gauge systems located in the mid-Atlantic coast region in the vicinity of the NASA Wallops Flight Facility, Wallops Island, Virginia. The rain gauges are situated within a gridded region measuring 47 km east-west by 70 km north-south. Distributions are presented for the individual site measurements and the network average for the one-year period from June 1, 1986 through May 31, 1987. A previous six-year average distribution derived from measurements at one of the sites is also presented. Comparisons are given between the network average, the CCIR (International Radio Consultative Committee) climatic-zone distribution, and the CCIR functional model distribution, the latter of which approximates a log-normal at lower rain rates and a gamma function at higher rates.
Parameter Recovery for the 1-P HGLLM with Non-Normally Distributed Level-3 Residuals
ERIC Educational Resources Information Center
Kara, Yusuf; Kamata, Akihito
2017-01-01
A multilevel Rasch model using a hierarchical generalized linear model is one approach to multilevel item response theory (IRT) modeling and is referred to as a one-parameter hierarchical generalized linear logistic model (1-P HGLLM). Although it has the flexibility to model nested structure of data with covariates, the model assumes the normality…
Strange Data: When the Numbers Just Aren't Normal.
Jupiter, Daniel C
2015-01-01
Many statistical tests assume that the populations from which we draw our data samples roughly follow a given probability distribution. Here, I review what these assumptions mean, why they are important, and how to deal with situations where the assumptions are not met.
Lithium in Stellar Atmospheres: Observations and Theory
NASA Astrophysics Data System (ADS)
Lyubimkov, L. S.
2016-09-01
Of all the light elements, lithium is the most sensitive indicator of stellar evolution. This review discusses current data on the abundance of lithium in the atmospheres of A-, F-, G-, and K-stars of different types, as well as the consistency of these data with theoretical predictions. The variety of observed Li abundances is illustrated by the following objects in different stages of evolution: (1) Old stars in the galactic halo, which have a lithium abundance logɛ(Li) = 2.2 (the "lithium plateau") that appears to be 0.5 dex lower than the primordial abundance predicted by cosmological models. (2) Young stars in the galactic disk, which have been used to estimate the contemporary initial lithium abundance logɛ(Li) = 3.2±0.1 for stars on the main sequence. Possible sources of lithium enrichment in the interstellar medium during the evolution of the galaxy are discussed. (3) Evolving FGK dwarfs in the galactic disk, which have lower logɛ(Li) for lower effective temperature Teff and mass M. The "lithium dip" near Teff ~ 6600 K in the distribution of logɛ(Li) with respect to Teff in old clusters is discussed. (4) FGK giants and supergiants, of which most have no lithium at all. This phenomenon is consistent with rotating-star model calculations. (5) Lithium-rich cold giants with logɛ(Li) ≥ 2.0, which form a small, enigmatic group. Theoretical models with rotation can explain the existence of these stars only in the case of low initial rotation velocities V0 < 50 km/s. In all other cases it is necessary to assume recent synthesis of lithium (capture of a giant planet is an alternative). (6) Magnetic Ap-stars, where lithium is concentrated in spots located at the magnetic poles. There the lithium abundance reaches logɛ(Li) = 6. Discrepancies between observations and theory are noted for almost all the stars discussed in this review.
SU-E-T-664: Radiobiological Modeling of Prophylactic Cranial Irradiation in Mice
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smith, D; Debeb, B; Woodward, W
Purpose: Prophylactic cranial irradiation (PCI) is a clinical technique used to reduce the incidence of brain metastasis and improve overall survival in select patients with ALL and SCLC, and we have shown the potential of PCI in select breast cancer patients through a mouse model (manuscript in preparation). We developed a computational model using our experimental results to demonstrate the advantage of treating brain micro-metastases early. Methods: MATLAB was used to develop the computational model of brain metastasis and PCI in mice. The number of metastases per mouse and the volume of metastases from four- and eight-week endpoints were fit to normal and log-normal distributions, respectively. Model input parameters were optimized so that model output would match the experimental number of metastases per mouse. A limiting dilution assay was performed to validate the model. The effect of radiation at different time points was computationally evaluated through the endpoints of incidence, number of metastases, and tumor burden. Results: The correlation between the experimental number of metastases per mouse and the Gaussian fit was 87% and 66% at the two endpoints. The experimental volumes and the log-normal fit had correlations of 99% and 97%. In the optimized model, the correlation between the number of metastases per mouse and the Gaussian fit was 96% and 98%. The log-normal volume fit and the model agree 100%. The model was validated by a limiting dilution assay, where the correlation was 100%. The model demonstrates that cells are very sensitive to radiation at early time points, and delaying treatment introduces a threshold dose at which point the incidence and number of metastases decline. Conclusion: We have developed a computational model of brain metastasis and PCI in mice that is highly correlated with our experimental data. The model shows that early treatment of subclinical disease is highly advantageous.
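The two distributional fits can be sketched in a few lines, here in Python rather than the authors' MATLAB and with simulated stand-ins for the mouse data (all numbers assumed): metastasis counts per mouse are fit by a normal distribution, metastasis volumes by a log-normal.

```python
# Sketch: normal fit to per-mouse metastasis counts, log-normal fit to volumes.
# Simulated data replace the experimental measurements for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
counts = rng.normal(12.0, 4.0, size=40).round()    # metastases per mouse (assumed)
volumes = rng.lognormal(-2.0, 1.0, size=300)       # metastasis volumes, mm^3 (assumed)

mu_c, sd_c = stats.norm.fit(counts)
shape, loc, scale = stats.lognorm.fit(volumes, floc=0)
print(f"counts:  mean = {mu_c:.1f}, sd = {sd_c:.1f}")
print(f"volumes: log-mean = {np.log(scale):.2f}, log-sd = {shape:.2f}")
```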
Edwards, David P.; Larsen, Trond H.; Docherty, Teegan D. S.; Ansell, Felicity A.; Hsu, Wayne W.; Derhé, Mia A.; Hamer, Keith C.; Wilcove, David S.
2011-01-01
Southeast Asia is a hotspot of imperilled biodiversity, owing to extensive logging and forest conversion to oil palm agriculture. The degraded forests that remain after multiple rounds of intensive logging are often assumed to be of little conservation value; consequently, there has been no concerted effort to prevent them from being converted to oil palm. However, no study has quantified the biodiversity of repeatedly logged forests. We compare the species richness and composition of birds and dung beetles within unlogged (primary), once-logged and twice-logged forests in Sabah, Borneo. Logging had little effect on the overall richness of birds. Dung beetle richness declined following once-logging but did not decline further after twice-logging. The species composition of bird and dung beetle communities was altered, particularly after the second logging rotation, but globally imperilled bird species (IUCN Red List) did not decline further after twice-logging. Remarkably, over 75 per cent of bird and dung beetle species found in unlogged forest persisted within twice-logged forest. Although twice-logged forests have less biological value than primary and once-logged forests, they clearly provide important habitat for numerous bird and dung beetle species. Preventing these degraded forests from being converted to oil palm should be a priority of policy-makers and conservationists. PMID:20685713
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gunter, Dan; Lee, Jason; Stoufer, Martin
2003-03-28
The NetLogger Toolkit is designed to monitor, under actual operating conditions, the behavior of all the elements of the application-to-application communication path in order to determine exactly where time is spent within a complex system. Using NetLogger, distributed application components are modified to produce timestamped logs of "interesting" events at all the critical points of the distributed system. Events from each component are correlated, which allows one to characterize the performance of all aspects of the system and network in detail. The NetLogger Toolkit itself consists of four components: an API and library of functions to simplify the generation of application-level event logs, a set of tools for collecting and sorting log files, an event archive system, and a tool for visualization and analysis of the log files. In order to instrument an application to produce event logs, the application developer inserts calls to the NetLogger API at all the critical points in the code, then links the application with the NetLogger library. All the tools in the NetLogger Toolkit share a common log format, and assume the existence of accurate and synchronized system clocks. NetLogger messages can be logged using an easy-to-read text-based format based on the IETF-proposed ULM format, or a binary format that can still be used through the same API but that is several times faster and smaller, with performance comparable to or better than binary message formats such as MPI, XDR, SDDF-Binary, and PBIO. The NetLogger binary format is both highly efficient and self-describing, and thus optimized for the dynamic message construction and parsing of application instrumentation. NetLogger includes an "activation" API that allows NetLogger logging to be turned on, off, or modified by changing an external file. This is useful for activating logging in daemons/services (e.g., the GridFTP server). The NetLogger reliability API provides the ability to specify backup logging locations and to periodically try to reconnect a broken TCP pipe. A typical use for this is to store data on local disk while the network is down. An event archiver can log one or more incoming NetLogger streams to a local disk file (netlogd) or to a MySQL database (netarchd). We have found exploratory, visual analysis of the log event data to be the most useful means of determining the causes of performance anomalies. The NetLogger Visualization tool, nlv, has been developed to provide a flexible and interactive graphical representation of system-level and application-level events.
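As an illustration of the instrumentation style described above, the sketch below emits ULM-style timestamped event records from "critical points" in a program. It is not the actual NetLogger API; the field layout is an assumption based on the format description:

```python
# Illustrative only -- not the real NetLogger library. A ULM-style text log
# line is a sequence of FIELD=value pairs with a high-resolution timestamp.
import socket, sys, time

def log_event(stream, event, level="Usage", **fields):
    """Write one ULM-style timestamped event record to `stream`."""
    ts = time.time()
    date = time.strftime("%Y%m%d%H%M%S", time.gmtime(ts))
    frac = f"{ts % 1:.6f}"[2:]                    # microsecond fraction
    pairs = " ".join(f"{k.upper()}={v}" for k, v in fields.items())
    stream.write(f"DATE={date}.{frac} HOST={socket.gethostname()} "
                 f"PROG={sys.argv[0]} LVL={level} NL.EVNT={event} {pairs}\n")

log_event(sys.stdout, "SEND.START", size=4096)    # instrument critical points
log_event(sys.stdout, "SEND.END", size=4096)
```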
Vicente J. Monleon
2009-01-01
Currently, Forest Inventory and Analysis estimation procedures use Smalian's formula to compute coarse woody debris (CWD) volume and assume that logs lie horizontally on the ground. In this paper, the impact of those assumptions on volume and biomass estimates is assessed using 7 years of Oregon's Phase 2 data. Estimates of log volume computed using Smalian...
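For context, Smalian's end-area formula referenced above computes the volume of a log of length L from the cross-sectional areas of its two ends:

```latex
% Smalian's end-area formula; A_1, A_2 are the end cross-sectional areas
% and d_i the corresponding end diameters.
V = \frac{A_1 + A_2}{2}\, L, \qquad A_i = \frac{\pi d_i^{2}}{4}
```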
Statistical characterization of a large geochemical database and effect of sample size
Zhang, C.; Manheim, F.T.; Hinde, J.; Grossman, J.N.
2005-01-01
The authors investigated statistical distributions for concentrations of chemical elements from the National Geochemical Survey (NGS) database of the U.S. Geological Survey. At the time of this study, the NGS data set encompassed 48,544 stream sediment and soil samples from the conterminous United States analyzed by ICP-AES following a 4-acid near-total digestion. This report includes 27 elements: Al, Ca, Fe, K, Mg, Na, P, Ti, Ba, Ce, Co, Cr, Cu, Ga, La, Li, Mn, Nb, Nd, Ni, Pb, Sc, Sr, Th, V, Y and Zn. The goal and challenge for the statistical overview was to delineate chemical distributions in a complex, heterogeneous data set spanning a large geographic range (the conterminous United States), and many different geological provinces and rock types. After declustering to create a uniform spatial sample distribution with 16,511 samples, histograms and quantile-quantile (Q-Q) plots were employed to delineate subpopulations that have coherent chemical and mineral affinities. Probability groupings are discerned by changes in slope (kinks) on the plots. Major rock-forming elements, e.g., Al, Ca, K and Na, tend to display linear segments on normal Q-Q plots. These segments can commonly be linked to petrologic or mineralogical associations. For example, linear segments on K and Na plots reflect dilution of clay minerals by quartz sand (low in K and Na). Minor and trace element relationships are best displayed on lognormal Q-Q plots. These sensitively reflect discrete relationships in subpopulations within the wide range of the data. For example, small but distinctly log-linear subpopulations for Pb, Cu, Zn and Ag are interpreted to represent ore-grade enrichment of naturally occurring minerals such as sulfides. None of the 27 chemical elements could pass the test for either normal or lognormal distribution on the declustered data set. Part of the reason relates to the presence of mixtures of subpopulations and outliers. Random samples of the data set with successively smaller numbers of data points showed that few elements passed standard statistical tests for normality or log-normality until sample size decreased to a few hundred data points. Large sample size enhances the power of statistical tests, and leads to rejection of most statistical hypotheses for real data sets. For large sample sizes (e.g., n > 1000), graphical methods such as histogram, stem-and-leaf, and probability plots are recommended for rough judgement of the probability distribution if needed. © 2005 Elsevier Ltd. All rights reserved.
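The sample-size effect described above is easy to demonstrate. A hedged sketch, assuming a mildly skewed log-normal population in place of the NGS data:

```python
# Draws from a mildly skewed population fail a normality test at large n
# but often "pass" at small n, purely through loss of statistical power.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
population = rng.lognormal(mean=0.0, sigma=0.3, size=100_000)

for n in (16_000, 1_000, 200, 50):
    sample = rng.choice(population, size=n, replace=False)
    stat, p = stats.normaltest(sample)      # D'Agostino-Pearson test
    print(f"n={n:6d}  p={p:.3g}  reject normality: {p < 0.05}")
```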
A Box-Cox normal model for response times.
Klein Entink, R H; van der Linden, W J; Fox, J-P
2009-11-01
The log-transform has been a convenient choice in response time modelling on test items. However, motivated by a dataset of the Medical College Admission Test where the lognormal model violated the normality assumption, the possibilities of the broader class of Box-Cox transformations for response time modelling are investigated. After an introduction and an outline of a broader framework for analysing responses and response times simultaneously, the performance of a Box-Cox normal model for describing response times is investigated using simulation studies and a real data example. A transformation-invariant implementation of the deviance information criterion (DIC) is developed that allows for comparing model fit between models with different transformation parameters. Showing an enhanced description of the shape of the response time distributions, its application in an educational measurement context is discussed at length.
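A minimal sketch of the Box-Cox step on simulated positive response times; scipy estimates the transformation parameter by maximum likelihood, and λ = 0 recovers the log-transform discussed above:

```python
import numpy as np
from scipy import stats

rt = np.random.default_rng(1).lognormal(mean=0.5, sigma=0.4, size=500)
transformed, lam = stats.boxcox(rt)   # (x**lam - 1)/lam; log(x) when lam = 0
print(f"estimated lambda = {lam:.3f}")
print("normality p-value after transform:", stats.normaltest(transformed)[1])
```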
Robust Methods for Moderation Analysis with a Two-Level Regression Model.
Yang, Miao; Yuan, Ke-Hai
2016-01-01
Moderation analysis has many applications in social sciences. Most widely used estimation methods for moderation analysis assume that errors are normally distributed and homoscedastic. When these assumptions are not met, the results from a classical moderation analysis can be misleading. For more reliable moderation analysis, this article proposes two robust methods with a two-level regression model when the predictors do not contain measurement error. One method is based on maximum likelihood with Student's t distribution and the other is based on M-estimators with Huber-type weights. An algorithm for obtaining the robust estimators is developed. Consistent estimates of standard errors of the robust estimators are provided. The robust approaches are compared against normal-distribution-based maximum likelihood (NML) with respect to power and accuracy of parameter estimates through a simulation study. Results show that the robust approaches outperform NML under various distributional conditions. Application of the robust methods is illustrated through a real data example. An R program is developed and documented to facilitate the application of the robust methods.
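One of the two robust flavours named above, M-estimation with Huber-type weights, can be sketched with statsmodels on simulated moderation data. This does not reproduce the authors' two-level model or algorithm; the data and coefficients are assumptions for illustration:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 300
x1, x2 = rng.normal(size=n), rng.normal(size=n)
# Moderation: the x1*x2 interaction carries the moderating effect;
# heavy-tailed t errors violate the normality assumption.
y = 1 + 0.5 * x1 + 0.3 * x2 + 0.4 * x1 * x2 + rng.standard_t(df=3, size=n)

X = sm.add_constant(np.column_stack([x1, x2, x1 * x2]))
fit = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()
print(fit.params)        # robust estimates of the moderation coefficients
```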
Measuring Resistance to Change at the Within-Session Level
ERIC Educational Resources Information Center
Tonneau, Francois; Rios, Americo; Cabrera, Felipe
2006-01-01
Resistance to change is often studied by measuring response rate in various components of a multiple schedule. Response rate in each component is normalized (that is, divided by its baseline level) and then log-transformed. Differential resistance to change is demonstrated if the normalized, log-transformed response rate in one component decreases…
Phenomenology of wall-bounded Newtonian turbulence.
L'vov, Victor S; Pomyalov, Anna; Procaccia, Itamar; Zilitinkevich, Sergej S
2006-01-01
We construct a simple analytic model for wall-bounded turbulence, containing only four adjustable parameters. Two of these parameters are responsible for the viscous dissipation of the components of the Reynolds stress tensor. The other two parameters control the nonlinear relaxation of these objects. The model offers an analytic description of the profiles of the mean velocity and the correlation functions of velocity fluctuations in the entire boundary region, from the viscous sublayer, through the buffer layer, and further into the log-law turbulent region. In particular, the model predicts a very simple distribution of the turbulent kinetic energy in the log-law region between the velocity components: the streamwise component contains half of the total energy whereas the wall-normal and cross-stream components contain a quarter each. In addition, the model predicts a very simple relation between the von Kármán slope k and the turbulent velocity in the log-law region v+ (in wall units): v+ = 6k. These predictions are in excellent agreement with direct numerical simulation data and with recent laboratory experiments.
Zielińska, Anna; Oleszczuk, Patryk
2015-09-01
The present study investigated the sorption of phenanthrene (PHE) and pyrene (PYR) by sewage sludges and sewage sludge-derived biochars. The organic carbon normalized distribution coefficient (log K(OC) for C(w) = 0.01 S(w)) for the sewage sludges ranged from 5.62 L kg(-1) to 5.64 L kg(-1) for PHE and from 5.72 L kg(-1) to 5.75 L kg(-1) for PYR. The conversion of sewage sludges into biochar significantly increased their sorption capacity. The value of log K(OC) for the biochars ranged from 5.54 L kg(-1) to 6.23 L kg(-1) for PHE and from 5.95 L kg(-1) to 6.52 L kg(-1) for PYR depending on temperature of pyrolysis. The dominant process was monolayer adsorption in the micropores and/or multilayer surface adsorption (in the mesopores), which was indicated by the significant correlations between log K(OC) and surface properties of biochars. PYR was sorbed better on the tested materials than PHE. Copyright © 2015 Elsevier Ltd. All rights reserved.
Asymptotic confidence intervals for the Pearson correlation via skewness and kurtosis.
Bishara, Anthony J; Li, Jiexiang; Nash, Thomas
2018-02-01
When bivariate normality is violated, the default confidence interval of the Pearson correlation can be inaccurate. Two new methods were developed based on the asymptotic sampling distribution of Fisher's z' under the general case where bivariate normality need not be assumed. In Monte Carlo simulations, the most successful of these methods relied on the (Vale & Maurelli, 1983, Psychometrika, 48, 465) family to approximate a distribution via the marginal skewness and kurtosis of the sample data. In Simulation 1, this method provided more accurate confidence intervals of the correlation in non-normal data, at least as compared to no adjustment of the Fisher z' interval, or to adjustment via the sample joint moments. In Simulation 2, this approximate distribution method performed favourably relative to common non-parametric bootstrap methods, but its performance was mixed relative to an observed imposed bootstrap and two other robust methods (PM1 and HC4). No method was completely satisfactory. An advantage of the approximate distribution method, though, is that it can be implemented even without access to raw data if sample skewness and kurtosis are reported, making the method particularly useful for meta-analysis. Supporting information includes R code. © 2017 The British Psychological Society.
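For reference, the default (unadjusted) Fisher z' interval that these methods improve upon, valid under bivariate normality, is a few lines of code:

```python
import numpy as np
from scipy import stats

def fisher_ci(r, n, conf=0.95):
    """Default Fisher z' confidence interval for a Pearson correlation."""
    z = np.arctanh(r)                    # Fisher z' transform of r
    se = 1.0 / np.sqrt(n - 3)            # asymptotic SE under normality
    half = stats.norm.ppf(0.5 + conf / 2) * se
    return float(np.tanh(z - half)), float(np.tanh(z + half))

print(fisher_ci(r=0.45, n=60))           # roughly (0.22, 0.63)
```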
Spatial organization of surface nanobubbles and its implications in their formation process.
Lhuissier, Henri; Lohse, Detlef; Zhang, Xuehua
2014-02-21
We study the size and spatial distribution of surface nanobubbles formed by the solvent exchange method to gain insight into the mechanism of their formation. The analysis of Atomic Force Microscopy (AFM) images of nanobubbles formed on a hydrophobic surface reveals that the nanobubbles are not randomly located, which we attribute to the role of the history of nucleation during the formation. Moreover, the size of each nanobubble is found to be strongly correlated with the area of the bubble-depleted zone around it. The precise correlation suggests that the nanobubbles grow by diffusion of the gas from the bulk rather than by diffusion of the gas adsorbed on the surface. Lastly, the size distribution of the nanobubbles is found to be well described by a log-normal distribution.
NASA Astrophysics Data System (ADS)
Diddens, D.; Brodeck, M.; Heuer, A.
2011-09-01
Within polymer blends composed of two species with largely different glass transition temperatures like PEO/PMMA, the dynamics of the fast PEO component is severely affected by the rather immobile PMMA, reflected by a breakdown of the typical Rouse scaling. The phenomenological random Rouse model (RRM), in which each monomer has an individual mobility obeying a broad log-normal distribution, has been applied to these blends. Using a newly developed method, we extract the distribution of friction coefficients from MD simulations of a PEO/PMMA blend, thereby testing the RRM explicitly. In our simulations we observe that the distribution is much narrower than expected from the RRM. Here, rather, the presence of additional forward-backward correlations of intermolecular origin is responsible for the anomalous PEO behavior.
Woolridge, Helen; Williams, John; Cronin, Anna; Evans, Nicola; Steventon, Glyn B
2004-01-01
The use of caffeine as a probe for CYP1A2 phenotyping has been extensively investigated over the last 25 years. Numerous metabolic ratios have been employed and various biological fluids analysed for caffeine and its metabolites. These investigations have used non-smoking, smoking and numerous disease populations to investigate the role of CYP1A2 in possible disease aetiology and for induction and inhibition studies in vivo using dietary, environmental and pharmaceutical compounds. This investigation found that the 17X/137X CYP1A2 metabolic ratio in a 5 h saliva sample and 0-5 h urine collection was not normally distributed in either the non-smoking or the smoking population. The urinary and salivary CYP1A2 metabolic ratio was log-normally distributed in the non-smoking population, but the smoking population showed a bi- (or tri-)modal distribution on log transformation of both the urinary and salivary CYP1A2 metabolic ratios. The CYP1A2 metabolic ratios were significantly higher in the smoking population compared to the non-smoking population when both the urinary and salivary CYP1A2 metabolic ratios were analysed. These results indicate that urinary flow rate was not a factor in the variation in CYP1A2 phenotype in the non-smoking and smoking populations studied here. The increased CYP1A2 activity in the smoking population was probably due to induction of the CYP1A2 gene via the Ah receptor causing an increase in the concentration of CYP1A2 protein.
Twitter-Based Analysis of the Dynamics of Collective Attention to Political Parties
Eom, Young-Ho; Puliga, Michelangelo; Smailović, Jasmina; Mozetič, Igor; Caldarelli, Guido
2015-01-01
Large-scale data from social media have a significant potential to describe complex phenomena in the real world and to anticipate collective behaviors such as information spreading and social trends. One specific case study is the collective attention to the actions of political parties. Not surprisingly, researchers and stakeholders have tried to correlate parties' presence on social media with their performances in elections. Despite the many efforts, results are still inconclusive, since this kind of data is often very noisy and significant signals could be covered by (largely unknown) statistical fluctuations. In this paper we consider the number of tweets (tweet volume) of a party as a proxy of collective attention to the party, identify the dynamics of the volume, and show that this quantity has some information on the election outcome. We find that the distribution of the tweet volume for each party follows a log-normal distribution with a positive autocorrelation of the volume over short terms, which indicates that the volume shows the large fluctuations characteristic of a log-normal distribution, yet with a short-term tendency. Furthermore, by measuring the ratio of two consecutive daily tweet volumes, we find that the evolution of the daily volume of a party can be described by means of a geometric Brownian motion (i.e., the logarithm of the volume moves randomly with a trend). Finally, we determine the optimal period of averaging tweet volume for reducing fluctuations and extracting short-term tendencies. We conclude that the tweet volume is a good indicator of parties' success in the elections when considered over an optimal time window. Our study identifies the statistical nature of collective attention to political issues and sheds light on how to model the dynamics of collective attention in social media. PMID:26161795
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Hualin, E-mail: hualin.zhang@northwestern.edu; Donnelly, Eric D.; Strauss, Jonathan B.
Purpose: To evaluate high-dose-rate (HDR) vaginal cuff brachytherapy (VCBT) in the treatment of endometrial cancer in a cylindrical target volume with either a varied or a constant cancer cell distribution using the linear quadratic (LQ) model. Methods: A Monte Carlo (MC) technique was used to calculate the 3D dose distribution of HDR VCBT over a variety of cylinder diameters and treatment lengths. A treatment planning system (TPS) was used to make plans for the various cylinder diameters, treatment lengths, and prescriptions using the clinical protocol. The dwell times obtained from the TPS were fed into MC. The LQ model was used to evaluate the therapeutic outcome of two brachytherapy regimens prescribed either at 0.5 cm depth (5.5 Gy × 4 fractions) or at the vaginal mucosal surface (8.8 Gy × 4 fractions) for the treatment of endometrial cancer. An experimentally determined endometrial cancer cell distribution, which varied with depth and resembled a half-Gaussian distribution, was used in radiobiology modeling. The equivalent uniform dose (EUD) to cancer cells was calculated for each treatment scenario. The therapeutic ratio (TR) was defined by comparing VCBT with a uniform-dose radiotherapy plan in terms of normal cell survival at the same level of cancer cell killing. Calculations of clinical impact were run twice assuming two different types of cancer cell density distributions in the cylindrical target volume: (1) a half-Gaussian or (2) a uniform distribution. Results: EUDs were weakly dependent on cylinder size, treatment length, and the prescription depth, but strongly dependent on the cancer cell distribution. TRs were strongly dependent on the cylinder size, treatment length, type of cancer cell distribution, and the sensitivity of normal tissue. With a half-Gaussian distribution of cancer cells, which are most concentrated at the vaginal mucosa, the EUDs were between 6.9 Gy × 4 and 7.8 Gy × 4, and the TRs were in the range from (5.0)^4 to (13.4)^4 for radiosensitive normal tissue, depending on the cylinder size, treatment length, prescription depth, and dose. However, for a uniform cancer cell distribution, the EUDs were between 6.3 Gy × 4 and 7.1 Gy × 4, and the TRs were found to be between (1.4)^4 and (1.7)^4. For uniformly interspersed cancer and radio-resistant normal cells, the TRs were less than 1. The two VCBT prescription regimens were found to be equivalent in terms of EUDs and TRs. Conclusions: HDR VCBT strongly favors a cylindrical target volume with a cancer cell distribution following its dosimetric trend. Assuming a half-Gaussian distribution of cancer cells, HDR VCBT provides a considerable radiobiological advantage over external beam radiotherapy (EBRT) in terms of sparing more normal tissue while maintaining the same level of cancer cell killing. But for a uniform cancer cell distribution and radio-resistant normal tissue, the radiobiology outcome of HDR VCBT does not show an advantage over EBRT. This study strongly suggests that radiation therapy design should consider the cancer cell distribution inside the target volume in addition to the shape of the target.
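A hedged sketch of the EUD idea under the LQ model: the EUD is the uniform dose yielding the same mean cell survival as a heterogeneous dose distribution. The α, β and dose values below are illustrative assumptions, not the study's parameters:

```python
import numpy as np

alpha, beta = 0.35, 0.035   # Gy^-1, Gy^-2 per fraction (assumed values)
doses = np.array([8.8, 7.9, 6.8, 5.9, 5.2])  # Gy with depth (illustrative)

# LQ survival in each region, then the population-average survival
surv = np.exp(-(alpha * doses + beta * doses**2)).mean()

# Solve beta*EUD^2 + alpha*EUD + ln(mean survival) = 0 for the positive root
eud = (-alpha + np.sqrt(alpha**2 - 4 * beta * np.log(surv))) / (2 * beta)
print(f"EUD per fraction = {eud:.2f} Gy")
```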
Bénet, Thomas; Voirin, Nicolas; Nicolle, Marie-Christine; Picot, Stephane; Michallet, Mauricette; Vanhems, Philippe
2013-02-01
The duration of the incubation of invasive aspergillosis (IA) remains unknown. The objective of this investigation was to estimate the time interval between aplasia onset and that of IA symptoms in acute myeloid leukemia (AML) patients. A single-centre prospective survey (2004-2009) included all patients with AML and probable/proven IA. Parametric survival models were fitted to the distribution of the time intervals between aplasia onset and IA. Overall, 53 patients had IA after aplasia, with the median observed time interval between the two being 15 days. Based on log-normal distribution, the median estimated IA incubation period was 14.6 days (95% CI; 12.8-16.5 days).
Ferragut, Erik M.; Laska, Jason A.; Bridges, Robert A.
2016-06-07
A system is described for receiving a stream of events and scoring the events based on anomalousness and maliciousness (or other classification). The system can include a plurality of anomaly detectors that together implement an algorithm to identify low-probability events and detect atypical traffic patterns. The anomaly detector provides for comparability of disparate sources of data (e.g., network flow data and firewall logs). Additionally, the anomaly detector allows for regulatability, meaning that the algorithm can be user-configurable to adjust the number of false alerts. The anomaly detector can be used for a variety of probability density functions, including normal Gaussian distributions, irregular distributions, as well as functions associated with continuous or discrete variables.
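One common way to make scores from disparate sources comparable, as the system above describes, is to score each event by its negative log-probability under a fitted density. The Gaussian detector below is an illustrative assumption, not the patented algorithm:

```python
import numpy as np
from scipy import stats

baseline = np.random.default_rng(7).normal(100.0, 15.0, size=5000)
mu, sd = baseline.mean(), baseline.std()

def anomaly_score(x, threshold):
    score = -stats.norm.logpdf(x, mu, sd)  # rarer events get higher scores
    return score, score > threshold       # threshold regulates false alerts

print(anomaly_score(x=180.0, threshold=10.0))
```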
Far-infrared properties of cluster galaxies
NASA Technical Reports Server (NTRS)
Bicay, M. D.; Giovanelli, R.
1987-01-01
Far-infrared properties are derived for a sample of over 200 galaxies in seven clusters: A262, Cancer, A1367, A1656 (Coma), A2147, A2151 (Hercules), and Pegasus. The IR-selected sample consists almost entirely of IR normal galaxies, with Log of L(FIR) = 9.79 solar luminosities, Log of L(FIR)/L(B) = 0.79, and Log of S(100 microns)/S(60 microns) = 0.42. None of the sample galaxies has Log of L(FIR) greater than 11.0 solar luminosities, and only one has a FIR-to-blue luminosity ratio greater than 10. No significant differences are found in the FIR properties of HI-deficient and HI-normal cluster galaxies.
The Italian primary school-size distribution and the city-size: a complex nexus
Belmonte, Alessandro; Di Clemente, Riccardo; Buldyrev, Sergey V.
2014-01-01
We characterize the statistical law according to which Italian primary school sizes are distributed. We find that school size can be approximated by a log-normal distribution, with a fat lower tail that collects a large number of very small schools. The upper tail of the school-size distribution decreases exponentially and the growth rates are distributed with a Laplace PDF. These distributions are similar to those observed for firms and are consistent with a Bose-Einstein preferential attachment process. The body of the distribution features a bimodal shape suggesting some source of heterogeneity in the school organization that we uncover by an in-depth analysis of the relation between school size and city size. We propose a novel cluster methodology and a new spatial interaction approach among schools which outline the variety of policies implemented in Italy. Different regional policies are also discussed, shedding light on the relation between policy and geographical features. PMID:24954714
NASA Astrophysics Data System (ADS)
Farahi, Arya; Evrard, August E.; McCarthy, Ian; Barnes, David J.; Kay, Scott T.
2018-05-01
Using tens of thousands of halos realized in the BAHAMAS and MACSIS simulations produced with a consistent astrophysics treatment that includes AGN feedback, we validate a multi-property statistical model for the stellar and hot gas mass behavior in halos hosting groups and clusters of galaxies. The large sample size allows us to extract fine-scale mass-property relations (MPRs) by performing local linear regression (LLR) on individual halo stellar mass (Mstar) and hot gas mass (Mgas) as a function of total halo mass (Mhalo). We find that: 1) both the local slope and variance of the MPRs run with mass (primarily) and redshift (secondarily); 2) the conditional likelihood, p(Mstar, Mgas | Mhalo, z), is accurately described by a multivariate, log-normal distribution; and 3) the covariance of Mstar and Mgas at fixed Mhalo is generally negative, reflecting a partially closed baryon box model for high-mass halos. We validate the analytical population model of Evrard et al. (2014), finding sub-percent accuracy in the log-mean halo mass selected at fixed property, ⟨ln Mhalo|Mgas⟩ or ⟨ln Mhalo|Mstar⟩, when scale-dependent MPR parameters are employed. This work highlights the potential importance of allowing for running in the slope and scatter of MPRs when modeling cluster counts for cosmological studies. We tabulate LLR fit parameters as a function of halo mass at z = 0, 0.5 and 1 for two popular mass conventions.
Analyzing coastal environments by means of functional data analysis
NASA Astrophysics Data System (ADS)
Sierra, Carlos; Flor-Blanco, Germán; Ordoñez, Celestino; Flor, Germán; Gallego, José R.
2017-07-01
Here we used Functional Data Analysis (FDA) to examine particle-size distributions (PSDs) in a beach/shallow marine sedimentary environment in Gijón Bay (NW Spain). The work involved both Functional Principal Components Analysis (FPCA) and Functional Cluster Analysis (FCA). The grain size of the sand samples was characterized by means of laser dispersion spectroscopy. Within this framework, FPCA was used as a dimension reduction technique to explore and uncover patterns in grain-size frequency curves. This procedure proved useful to describe variability in the structure of the data set. Moreover, an alternative approach, FCA, was applied to identify clusters and to interpret their spatial distribution. Results obtained with this latter technique were compared with those obtained by means of two vector approaches that combine PCA with CA (Cluster Analysis). The first method, the point density function (PDF), was employed after fitting a log-normal distribution to each PSD and summarizing each of the density functions by its mean, sorting, skewness and kurtosis. The second applied a centered log-ratio (clr) transform to the original data. PCA was then applied to the transformed data, and finally CA to the retained principal component scores. The study revealed functional data analysis, specifically FPCA and FCA, as a suitable alternative with considerable advantages over traditional vector analysis techniques in sedimentary geology studies.
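A minimal sketch of the centred log-ratio transform applied to a single PSD; the bin proportions are hypothetical:

```python
import numpy as np

psd = np.array([0.05, 0.20, 0.45, 0.25, 0.05])  # bin proportions, sum to 1

def clr(x):
    g = np.exp(np.mean(np.log(x)))              # geometric mean of the parts
    return np.log(x / g)

print(clr(psd))  # PCA/clustering is then run on these transformed values
```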
NASA Technical Reports Server (NTRS)
Herskovits, E. H.; Itoh, R.; Melhem, E. R.
2001-01-01
OBJECTIVE: The objective of our study was to determine the effects of MR sequence (fluid-attenuated inversion-recovery [FLAIR], proton density-weighted, and T2-weighted) and of lesion location on sensitivity and specificity of lesion detection. MATERIALS AND METHODS: We generated FLAIR, proton density-weighted, and T2-weighted brain images with 3-mm lesions using published parameters for acute multiple sclerosis plaques. Each image contained from zero to five lesions that were distributed among cortical-subcortical, periventricular, and deep white matter regions; on either side; and anterior or posterior in position. We presented images of 540 lesions, distributed among 2592 image regions, to six neuroradiologists. We constructed a contingency table for image regions with lesions and another for image regions without lesions (normal). Each table included the following: the reviewer's number (1-6); the MR sequence; the side, position, and region of the lesion; and the reviewer's response (lesion present or absent [normal]). We performed chi-square and log-linear analyses. RESULTS: The FLAIR sequence yielded the highest true-positive rates (p < 0.001) and the highest true-negative rates (p < 0.001). Regions also differed in reviewers' true-positive rates (p < 0.001) and true-negative rates (p = 0.002). The true-positive rate model generated by log-linear analysis contained an additional sequence-location interaction. The true-negative rate model generated by log-linear analysis confirmed these associations, but no higher-order interactions were added. CONCLUSION: We developed software with which we can generate brain images of a wide range of pulse sequences and that allows us to specify the location, size, shape, and intrinsic characteristics of simulated lesions. We found that the use of FLAIR sequences increases detection accuracy for cortical-subcortical and periventricular lesions over that associated with proton density- and T2-weighted sequences.
Particle Morphology and Size Results from the Smoke Aerosol Measurement Experiment-2
NASA Technical Reports Server (NTRS)
Urban, David L.; Ruff, Gary A.; Greenberg, Paul S.; Fischer, David; Meyer, Marit; Mulholland, George; Yuan, Zeng-Guang; Bryg, Victoria; Cleary, Thomas; Yang, Jiann
2012-01-01
Results are presented from the Reflight of the Smoke Aerosol Measurement Experiment (SAME-2), which was conducted during Expedition 24 (July-September 2010). The reflight experiment built upon the results of the original flight during Expedition 15 by adding diagnostic measurements and expanding the test matrix. Five different materials representative of those found in spacecraft (Teflon, Kapton, cotton, silicone rubber and Pyrell) were heated to temperatures below the ignition point with conditions controlled to provide repeatable sample surface temperatures and air flow. The air flow past the sample during the heating period ranged from quiescent to 8 cm/s. The smoke was initially collected in an aging chamber to simulate the transport time from the smoke source to the detector. This effective transport time was varied by holding the smoke in the aging chamber for times ranging from 11 to 1800 s. Smoke particle samples were collected on Transmission Electron Microscope (TEM) grids for post-flight analysis. The TEM grids were analyzed to observe the particle morphology and size parameters. The diagnostics included a prototype two-moment smoke detector and three different measures of moments of the particle size distribution. These moment diagnostics were used to determine the particle number concentration (zeroth moment), the diameter concentration (first moment), and the mass concentration (third moment). These statistics were combined to determine the diameter of average mass and the count mean diameter; by assuming a log-normal distribution, the geometric mean diameter and the geometric standard deviation can also be calculated. Overall, the majority of the average smoke particle sizes were found to be in the 200 nm to 400 nm range, with the quiescent cases producing substantially larger particles in some runs.
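The moment-to-parameter step can be sketched with the standard Hatch-Choate relations for a log-normal size distribution; the moment values below are illustrative assumptions, not SAME-2 data:

```python
import numpy as np

# Assumed moment concentrations: number (M0), diameter (M1), volume-type (M3)
M0, M1, M3 = 1.0e5, 3.0e-2, 4.0e-15   # per cm^3; m per cm^3; m^3 per cm^3

d_mean = M1 / M0                       # count mean diameter
d_avg_mass = (M3 / M0) ** (1 / 3)      # diameter of average mass

# For a log-normal: d_mean = GMD*exp(0.5*ln(GSD)^2),
#                   d_avg_mass = GMD*exp(1.5*ln(GSD)^2),
# so their ratio isolates ln(GSD)^2.
ln2_sg = np.log(d_avg_mass / d_mean)
gsd = np.exp(np.sqrt(ln2_sg))
gmd = d_mean * np.exp(-0.5 * ln2_sg)
print(f"GMD = {gmd*1e9:.0f} nm, GSD = {gsd:.2f}")
```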
NASA Astrophysics Data System (ADS)
Naiman, Jill P.; Pillepich, Annalisa; Springel, Volker; Ramirez-Ruiz, Enrico; Torrey, Paul; Vogelsberger, Mark; Pakmor, Rüdiger; Nelson, Dylan; Marinacci, Federico; Hernquist, Lars; Weinberger, Rainer; Genel, Shy
2018-06-01
The distribution of elements in galaxies provides a wealth of information about their production sites and their subsequent mixing into the interstellar medium. Here we investigate the elemental distributions of stars in the IllustrisTNG simulations. We analyse the abundance ratios of magnesium and europium in Milky Way-like galaxies from the TNG100 simulation (stellar masses log(M⋆/M⊙) ∼ 9.7-11.2). Comparison of observed magnesium and europium for individual stars in the Milky Way with the stellar abundances in our more than 850 Milky Way-like galaxies provides stringent constraints on our chemical evolutionary methods. Here, we use the magnesium-to-iron ratio as a proxy for the effects of our SNII (core-collapse supernovae) and SNIa (Type Ia supernovae) metal return prescription and as a comparison to a variety of galactic observations. The europium-to-iron ratio tracks the rare ejecta from neutron star-neutron star mergers, the assumed primary site of europium production in our models, and is a sensitive probe of the effects of metal diffusion within the gas in our simulations. We find that europium abundances in Milky Way-like galaxies show no correlation with assembly history, present-day galactic properties, and average galactic stellar population age. We reproduce the europium-to-iron spread at low metallicities observed in the Milky Way, and find it is sensitive to gas properties during redshifts z ≈ 2-4. We show that while the overall normalization of [Eu/Fe] is susceptible to resolution and post-processing assumptions, the relatively large spread of [Eu/Fe] at low [Fe/H] when compared to that at high [Fe/H] is quite robust.
Rms-flux relation and fast optical variability simulations of the nova-like system MV Lyr
NASA Astrophysics Data System (ADS)
Dobrotka, A.; Mineshige, S.; Ness, J.-U.
2015-03-01
The stochastic variability (flickering) of the nova-like system (subclass of cataclysmic variable) MV Lyr yields a complicated power density spectrum with four break frequencies. Scaringi et al. analysed high-cadence Kepler data of MV Lyr, taken almost continuously over 600 d, giving the unique opportunity to study multicomponent Power Density Spectra (PDS) over a wide frequency range. We modelled this variability with our statistical model based on disc angular momentum transport via discrete turbulent bodies with an exponential distribution of the dimension scale. Two different models were used, a full disc (developed from the white dwarf to the outer radius of ∼10^10 cm) and a radially thin disc (a ring at a distance of ∼10^10 cm from the white dwarf) that imitates an outer disc rim. We succeed in explaining the two lowest observed break frequencies assuming typical values for a disc radius of 0.5 and 0.9 times the primary Roche lobe and an α parameter of 0.1-0.4. The highest observed break frequency was also modelled, but with a rather small accretion disc with a radius of 0.3 times the primary Roche lobe and a high α value of 0.9 consistent with previous findings by Scaringi. Furthermore, the simulated light curves exhibit the typical linear rms-flux relation and the typical log-normal flux distribution. As the turbulent process is generating fluctuations in mass accretion that propagate through the disc, this confirms the general knowledge that the typical rms-flux relation is mainly generated by these fluctuations.
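The link between a log-normal flux distribution and a linear rms-flux relation can be illustrated by exponentiating Gaussian red noise, a standard construction for multiplicatively propagating fluctuations; this is a sketch, not the authors' turbulent-body model:

```python
import numpy as np

rng = np.random.default_rng(3)
n, a = 200_000, 0.99
g = np.empty(n)
g[0] = 0.0
for t in range(1, n):                  # AR(1) red noise in the log
    g[t] = a * g[t - 1] + rng.normal(scale=0.05)

flux = np.exp(g)                       # log-normally distributed flux
segs = flux.reshape(-1, 500)           # mean and rms per 500-point segment
means, rmss = segs.mean(axis=1), segs.std(axis=1)
print("rms-flux correlation:", np.corrcoef(means, rmss)[0, 1])  # near 1
```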
Sieve analysis using the number of infecting pathogens.
Follmann, Dean; Huang, Chiung-Yu
2017-12-14
Assessment of vaccine efficacy as a function of the similarity of the infecting pathogen to the vaccine is an important scientific goal. Characterization of pathogen strains for which vaccine efficacy is low can increase understanding of the vaccine's mechanism of action and offer targets for vaccine improvement. Traditional sieve analysis estimates differential vaccine efficacy using a single identifiable pathogen for each subject. The similarity between this single entity and the vaccine immunogen is quantified, for example, by exact match or number of mismatched amino acids. With new technology, we can now obtain the actual count of genetically distinct pathogens that infect an individual. Let F be the number of distinct features of a species of pathogen. We assume a log-linear model for the expected number of infecting pathogens with feature "f," f=1,…,F. The model can be used directly in studies with passive surveillance of infections where the count of each type of pathogen is recorded at the end of some interval, or active surveillance where the time of infection is known. For active surveillance, we additionally assume that a proportional intensity model applies to the time of potentially infectious exposures and derive product and weighted estimating equation (WEE) estimators for the regression parameters in the log-linear model. The WEE estimator explicitly allows for waning vaccine efficacy and time-varying distributions of pathogens. We give conditions where sieve parameters have a per-exposure interpretation under passive surveillance. We evaluate the methods by simulation and analyze a phase III trial of a malaria vaccine. © 2017, The International Biometric Society.
A Bootstrap Algorithm for Mixture Models and Interval Data in Inter-Comparisons
2001-07-01
parametric bootstrap. The present algorithm will be applied to a thermometric inter-comparison, where data cannot be assumed to be normally distributed. The experimental methods used in each laboratory often imply that the statistical assumptions are not satisfied, as for example in several thermometric experiments, where three probabilistic models (normal, uniform, triangular) can represent several common stochastic variabilities.
Leveraging the Cloud for Integrated Network Experimentation
2014-03-01
kernel settings, or any of the low-level subcomponents. 3. Scalable Solutions: Businesses can build scalable solutions for their clients, ranging from … values. These values can assume several distributions that include normal, Pareto, uniform, exponential and Poisson, among others [21]. Additionally, … in communication, the web client establishes a connection to the server before traffic begins to flow. Web servers do not initiate connections to clients.
Huang, Cheng-Yen; Hsieh, Ming-Ching; Zhou, Qinwei
2017-04-01
Monoclonal antibodies have become the fastest-growing protein therapeutics in recent years. The stability and heterogeneity pertaining to their physical and chemical structures remain a big challenge. Tryptophan fluorescence has been proven to be a versatile tool to monitor protein tertiary structure. By modeling the tryptophan fluorescence emission envelope with log-normal distribution curves, a quantitative measure can be obtained for the routine characterization of monoclonal antibody overall tertiary structure. Furthermore, the log-normal deconvolution results can be presented as a two-dimensional plot of tryptophan emission bandwidth vs. emission maximum to enhance the resolution when comparing samples or as a function of applied perturbations. We demonstrate this by studying four different monoclonal antibodies, which show clear distinctions on the emission bandwidth-maximum plot despite their similarity in overall amino acid sequences and tertiary structures. This strategy is also used to demonstrate the tertiary structure comparability between different lots manufactured for one of the monoclonal antibodies (mAb2). In addition, in the unfolding transition studies of mAb2 as a function of guanidine hydrochloride concentration, the evolution of the tertiary structure can be clearly traced in the emission bandwidth-maximum plot.
NASA Technical Reports Server (NTRS)
Reschke, Millard F.; Somers, Jeffrey T.; Feiveson, Alan H.; Leigh, R. John; Wood, Scott J.; Paloski, William H.; Kornilova, Ludmila
2006-01-01
We studied the ability to hold the eyes in eccentric horizontal or vertical gaze angles in 68 normal humans, age range 19-56. Subjects attempted to sustain visual fixation of a briefly flashed target located 30° in the horizontal plane and 15° in the vertical plane in a dark environment. Conventionally, the ability to hold eccentric gaze is estimated by fitting centripetal eye drifts by exponential curves and calculating the time constant (t_c) of these slow phases of gaze-evoked nystagmus. Although the distribution of time-constant measurements (t_c) in our normal subjects was extremely skewed due to occasional test runs that exhibited near-perfect stability (large t_c values), we found that log10(t_c) was approximately normally distributed within classes of target direction. Therefore, statistical estimation and inference on the effect of target direction was performed on values of z ≡ log10(t_c). Subjects showed considerable variation in their eye-drift performance over repeated trials; nonetheless, statistically significant differences emerged: values of t_c were significantly higher for gaze elicited to targets in the horizontal plane than for the vertical plane (P < 10^-5), suggesting eccentric gaze holding is more stable in the horizontal than in the vertical plane. Furthermore, centrifugal eye drifts were observed in 13.3, 16.0 and 55.6% of cases for horizontal, upgaze and downgaze tests, respectively. Fifth-percentile values of the time constant were estimated to be 10.2 sec, 3.3 sec and 3.8 sec for horizontal, upward and downward gaze, respectively. The difference between horizontal and vertical gaze holding may be ascribed to separate components of the velocity-to-position neural integrator for eye movements, and to differences in orbital mechanics. Our statistical method for representing the range of normal eccentric gaze stability can be readily applied in a clinical setting to patients who were exposed to environments that may have modified their central integrators and thus require monitoring. Patients with gaze-evoked nystagmus can be flagged by comparison with the normative criteria established above.
Retention for Stoploss reinsurance to minimize VaR in compound Poisson-Lognormal distribution
NASA Astrophysics Data System (ADS)
Soleh, Achmad Zanbar; Noviyanti, Lienda; Nurrahmawati, Irma
2015-12-01
Automobile insurance is one of the emerging general insurance products in Indonesia. Fluctuations in total premium revenues and total claim expenses lead to a risk that the insurance company may be unable to pay consumers' claims, so reinsurance is needed. Reinsurance is a risk-transfer mechanism from the insurance company to another company, called the reinsurer; one type of reinsurance is Stoploss. Because the reinsurer charges a premium to the insurance company, it is important to determine the retention, i.e., the total claims to be retained solely by the insurance company. Thus, retention is determined using Value at Risk (VaR), which minimizes the total risk of the insurance company in the presence of Stoploss reinsurance. Retention depends only on the distribution of total claims and the reinsurance loading factor. We use the compound Poisson distribution and the log-normal distribution to illustrate the retention value in a collective risk model.
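A Monte Carlo sketch of the underlying aggregate-claims model, with all parameter values assumed for illustration: claim counts are Poisson, individual claim sizes are log-normal, and VaR is read off as a quantile of the aggregate:

```python
import numpy as np

rng = np.random.default_rng(11)
lam, mu, sigma, n_sim = 50.0, 8.0, 0.9, 20_000  # assumed claim-rate/log-params

counts = rng.poisson(lam, size=n_sim)
S = np.array([rng.lognormal(mu, sigma, size=k).sum() for k in counts])

var_95 = np.quantile(S, 0.95)      # VaR of aggregate claims at the 95% level
print(f"VaR_0.95 of aggregate claims: {var_95:,.0f}")
```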
Do wealth distributions follow power laws? Evidence from ‘rich lists’
NASA Astrophysics Data System (ADS)
Brzezinski, Michal
2014-07-01
We use data on the wealth of the richest persons taken from the 'rich lists' provided by business magazines like Forbes to verify if the upper tails of wealth distributions follow, as often claimed, a power-law behaviour. The data sets used cover the world's richest persons over 1996-2012, the richest Americans over 1988-2012, the richest Chinese over 2006-2012, and the richest Russians over 2004-2011. Using a recently introduced comprehensive empirical methodology for detecting power laws, which allows for testing the goodness of fit as well as for comparing the power-law model with rival distributions, we find that a power-law model is consistent with data only in 35% of the analysed data sets. Moreover, even if wealth data are consistent with the power-law model, they are usually also consistent with some rivals like the log-normal or stretched exponential distributions.
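The comparison methodology can be sketched with the `powerlaw` Python package (Alstott et al.), here on synthetic log-normal "wealth" data standing in for the rich-list values:

```python
import numpy as np
import powerlaw

wealth = np.random.default_rng(5).lognormal(mean=22, sigma=1.2, size=400)

fit = powerlaw.Fit(wealth)                    # xmin chosen by KS distance
R, p = fit.distribution_compare('power_law', 'lognormal')
print(fit.power_law.alpha, fit.power_law.xmin)
print(f"LR vs log-normal: R={R:.2f}, p={p:.3f}")  # R < 0 favours log-normal
```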
Statistics of backscatter radar return from vegetation
NASA Technical Reports Server (NTRS)
Karam, M. A.; Chen, K. S.; Fung, A. K.
1992-01-01
The statistical characteristics of radar return from vegetation targets are investigated through a simulation study based upon the first-order scattered field. For simulation purposes, the vegetation targets are modeled as a layer of randomly oriented and spaced finite cylinders, needles, or discs, or a combination of them. The finite cylinder is used to represent a branch or a trunk, the needle for a stem or a coniferous leaf, and the disc for a deciduous leaf. For a plane wave illuminating a vegetation canopy, simulation results show that the signal returned from a layer of disc- or needle-shaped leaves follows the Gamma distribution, and that the signal returned from a layer of branches resembles the log normal distribution. The Gamma distribution also represents the signal returned from a layer of a mixture of branches and leaves regardless of the leaf shapes. Results also indicate that the polarization state does not have a significant impact on the signal distribution.
The application of the sinusoidal model to lung cancer patient respiratory motion
DOE Office of Scientific and Technical Information (OSTI.GOV)
George, R.; Vedam, S.S.; Chung, T.D.
2005-09-15
Accurate modeling of the respiratory cycle is important to account for the effect of organ motion on dose calculation for lung cancer patients. The aim of this study is to evaluate the accuracy of a respiratory model for lung cancer patients. Lujan et al. [Med. Phys. 26(5), 715-720 (1999)] proposed a model, which became widely used, to describe organ motion due to respiration. This model assumes that the parameters do not vary between and within breathing cycles. In this study, first, the correlation of respiratory motion traces with the model f(t) as a function of the parameter n (n = 1, 2, 3) was undertaken for each breathing cycle from 331 four-minute respiratory traces acquired from 24 lung cancer patients using three breathing types: free breathing, audio instruction, and audio-visual biofeedback. Because cos^2 and cos^4 had similar correlation coefficients, and cos^2 and cos^1 have a trigonometric relationship, for simplicity, the cos^1 value was consequently used for further analysis, in which the variations in mean position (z_0), amplitude of motion (b) and period (τ) with and without biofeedback or instructions were investigated. For all breathing types, the parameter values, mean position (z_0), amplitude of motion (b), and period (τ), exhibited significant cycle-to-cycle variations. Audio-visual biofeedback showed the least variations for all three parameters (z_0, b, and τ). It was found that mean position (z_0) could be approximated with a normal distribution, and the amplitude of motion (b) and period (τ) could be approximated with log-normal distributions. The overall probability density function (pdf) of f(t) for each of the three breathing types was fitted with three models: normal, bimodal, and the pdf of a simple harmonic oscillator. It was found that the normal and the bimodal models represented the overall respiratory motion pdfs with correlation values from 0.95 to 0.99, whereas the range of the simple harmonic oscillator pdf correlation values was 0.71 to 0.81. This study demonstrates that the pdfs of mean position (z_0), amplitude of motion (b), and period (τ) can be used for sampling to obtain more realistic respiratory traces. The overall standard deviations of respiratory motion were 0.48, 0.57, and 0.55 cm for free breathing, audio instruction, and audio-visual biofeedback, respectively.
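For reference, the fitted family is the breathing model of Lujan et al. (1999), which in the notation above reads (the exact parameterization, including the phase term, is an assumption from the cited paper):

```latex
% z_0 = mean position, b = amplitude, tau = period, phi = starting phase;
% n controls the fraction of the cycle spent near exhale.
f(t) = z_0 - b \cos^{2n}\!\left(\frac{\pi t}{\tau} - \phi\right)
```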
Turner, Rebecca M; Jackson, Dan; Wei, Yinghui; Thompson, Simon G; Higgins, Julian P T
2015-01-01
Numerous meta-analyses in healthcare research combine results from only a small number of studies, for which the variance representing between-study heterogeneity is estimated imprecisely. A Bayesian approach to estimation allows external evidence on the expected magnitude of heterogeneity to be incorporated. The aim of this paper is to provide tools that improve the accessibility of Bayesian meta-analysis. We present two methods for implementing Bayesian meta-analysis, using numerical integration and importance sampling techniques. Based on 14 886 binary outcome meta-analyses in the Cochrane Database of Systematic Reviews, we derive a novel set of predictive distributions for the degree of heterogeneity expected in 80 settings depending on the outcomes assessed and comparisons made. These can be used as prior distributions for heterogeneity in future meta-analyses. The two methods are implemented in R, for which code is provided. Both methods produce equivalent results to standard but more complex Markov chain Monte Carlo approaches. The priors are derived as log-normal distributions for the between-study variance, applicable to meta-analyses of binary outcomes on the log odds-ratio scale. The methods are applied to two example meta-analyses, incorporating the relevant predictive distributions as prior distributions for between-study heterogeneity. We have provided resources to facilitate Bayesian meta-analysis, in a form accessible to applied researchers, which allow relevant prior information on the degree of heterogeneity to be incorporated. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:25475839
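A grid-approximation sketch of the numerical-integration approach, with a log-normal prior on the between-study variance; the effect sizes, standard errors, and prior parameters below are illustrative assumptions, not the derived Cochrane predictive distributions:

```python
import numpy as np
from scipy import stats

y = np.array([0.2, -0.1, 0.4, 0.3])     # study log odds-ratios (example)
s = np.array([0.15, 0.20, 0.25, 0.18])  # their standard errors (example)

mu_grid = np.linspace(-1, 1, 201)
tau2_grid = np.linspace(1e-4, 1.0, 400)

logpost = np.empty((mu_grid.size, tau2_grid.size))
for i, mu in enumerate(mu_grid):
    for j, t2 in enumerate(tau2_grid):
        loglik = stats.norm.logpdf(y, mu, np.sqrt(s**2 + t2)).sum()
        # assumed log-normal prior on tau^2 (log-mean -2, log-sd 1)
        logprior = stats.lognorm.logpdf(t2, s=1.0, scale=np.exp(-2.0))
        logpost[i, j] = loglik + logprior

post = np.exp(logpost - logpost.max())
post /= post.sum()
print("posterior mean effect:", (post.sum(axis=1) * mu_grid).sum())
```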
Are CO Observations of Interstellar Clouds Tracing the H2?
NASA Astrophysics Data System (ADS)
Federrath, Christoph; Glover, S. C. O.; Klessen, R. S.; Mac Low, M.
2010-01-01
Interstellar clouds are commonly observed through the emission of rotational transitions from carbon monoxide (CO). However, the abundance ratio of CO to molecular hydrogen (H2), which is the most abundant molecule in molecular clouds, is only about 10^-4. This raises the important question of whether the observed CO emission is actually tracing the bulk of the gas in these clouds, and whether it can be used to derive quantities like the total mass of the cloud, the gas density distribution function, the fractal dimension, and the velocity dispersion-size relation. To evaluate the usability and accuracy of CO as a tracer for H2 gas, we generate synthetic observations of hydrodynamical models that include a detailed chemical network to follow the formation and photo-dissociation of H2 and CO. These three-dimensional models of turbulent interstellar cloud formation self-consistently follow the coupled thermal, dynamical and chemical evolution of 32 species, with a particular focus on H2 and CO (Glover et al. 2009). We find that CO primarily traces the dense gas in the clouds, however, with a significant scatter due to turbulent mixing and self-shielding of H2 and CO. The H2 probability distribution function (PDF) is well-described by a log-normal distribution. In contrast, the CO column density PDF has a strongly non-Gaussian low-density wing, not at all consistent with a log-normal distribution. Centroid velocity statistics show that CO is more intermittent than H2, leading to an overestimate of the velocity scaling exponent in the velocity dispersion-size relation. With our systematic comparison of H2 and CO data from the numerical models, we hope to provide a statistical formula to correct for the bias of CO observations. CF acknowledges financial support from a Kade Fellowship of the American Museum of Natural History.
Log-Linear Models for Gene Association
Hu, Jianhua; Joshi, Adarsh; Johnson, Valen E.
2009-01-01
We describe a class of log-linear models for the detection of interactions in high-dimensional genomic data. This class of models leads to a Bayesian model selection algorithm that can be applied to data that have been reduced to contingency tables using ranks of observations within subjects, and discretization of these ranks within gene/network components. Many normalization issues associated with the analysis of genomic data are thereby avoided. A prior density based on Ewens’ sampling distribution is used to restrict the number of interacting components assigned high posterior probability, and the calculation of posterior model probabilities is expedited by approximations based on the likelihood ratio statistic. Simulation studies are used to evaluate the efficiency of the resulting algorithm for known interaction structures. Finally, the algorithm is validated in a microarray study for which it was possible to obtain biological confirmation of detected interactions. PMID:19655032
Scaling of global input-output networks
NASA Astrophysics Data System (ADS)
Liang, Sai; Qi, Zhengling; Qu, Shen; Zhu, Ji; Chiu, Anthony S. F.; Jia, Xiaoping; Xu, Ming
2016-06-01
Examining the scaling patterns of networks can help us understand how structural features relate to network behavior. Input-output networks consist of industries as nodes and inter-industrial exchanges of products as links. Previous studies consider only a limited set of measures for node strengths and link weights, and ignore the impact of dataset choice. In this study, we consider a comprehensive set of indicators that are important in economic analysis, and examine the impact of dataset choice by studying input-output networks both in individual countries and for the entire world. Results show that Burr, log-logistic, log-normal, and Weibull distributions better describe the scaling patterns of global input-output networks. We also find that dataset choice has limited impact on the observed scaling patterns. Our findings can help assess the quality of economic statistics, estimate missing data in economic statistics, and identify key nodes and links in input-output networks to support economic policymaking.
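Comparing the candidate families is routine with standard tools. The sketch below fits each family to mock node-strength data by maximum likelihood and ranks the fits by AIC; the data are synthetic placeholders, and scipy's burr12 and fisk are used as stand-ins for the Burr and log-logistic families (the paper does not specify these parametrisations).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Mock node strengths (e.g., total industry output); illustrative only
strengths = stats.lognorm.rvs(s=1.2, scale=50.0, size=500, random_state=rng)

candidates = {
    "Burr":         stats.burr12,       # Burr Type XII, one plausible choice
    "Log-Logistic": stats.fisk,         # scipy's name for the log-logistic
    "Log-normal":   stats.lognorm,
    "Weibull":      stats.weibull_min,
}

for name, dist in candidates.items():
    # Pin the location at zero; fits can be sensitive to starting values
    params = dist.fit(strengths, floc=0)
    ll = np.sum(dist.logpdf(strengths, *params))
    k = len(params) - 1                  # free parameters (loc is fixed)
    print(f"{name:>12}: AIC = {2 * k - 2 * ll:.1f}")
```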
Demonstration of a Low Cost, High-Speed Fiber Optic Transceiver
2002-09-01
[Only table fragments of this report survive extraction: a comparison of cable options listing diameter (0.125, 3 and 7 mm), weight of a 5 m cable (0.008, 0.1 and 0.51 kg) and reliability (MTTF, hrs), the latter based on a 1E7-hour MTTF from a Honeywell preliminary data sheet and on 12 VCSELs with a log-normal lifetime distribution, σ = 0.225; plus a reference to deliverable A009 on Form DD 1423-1, "Optical Link for Radar Digital Processor", by Andrew Davidson, Terri L. Dooley, Grant R. Emmel, Robert A. Marsland, and (list truncated).]
Binary data corruption due to a Brownian agent
NASA Astrophysics Data System (ADS)
Newman, T. J.; Triampo, Wannapong
1999-05-01
We introduce a model of binary data corruption induced by a Brownian agent (active random walker) on a d-dimensional lattice. A continuum formulation allows the exact calculation of several quantities related to the density of corrupted bits ρ, for example, the mean of ρ and the density-density correlation function. Excellent agreement is found with the results from numerical simulations. We also calculate the probability distribution of ρ in d=1, which is found to be log-normal, indicating that the system is governed by extreme fluctuations.
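A minimal simulation conveys the flavour of the d = 1 result. The sketch below assumes the simplest reading of the model, in which the agent flips the bit at every site it steps on, so a site is corrupted when it has been visited an odd number of times; if the density ρ were log-normal, log ρ should be far less skewed than ρ itself. Lattice size, walk length and the number of realisations are arbitrary choices, not the paper's parameters.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
T, runs = 5000, 3000          # walk length and number of realisations
L = 2 * T + 1                 # lattice large enough that the walker never exits
rho = np.empty(runs)

for r in range(runs):
    # Unbiased random walk started at the centre of the lattice
    path = T + rng.choice((-1, 1), size=T).cumsum()
    visits = np.bincount(path, minlength=L)
    # A bit is corrupted if flipped an odd number of times
    rho[r] = (visits % 2 == 1).mean()

log_rho = np.log(rho[rho > 0])
# For a log-normal rho, log(rho) is Gaussian: compare skewness before/after
print(f"skew(rho) = {stats.skew(rho):+.2f}, skew(log rho) = {stats.skew(log_rho):+.2f}")
```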
Deadwood biomass: an underestimated carbon stock in degraded tropical forests?
NASA Astrophysics Data System (ADS)
Pfeifer, Marion; Lefebvre, Veronique; Turner, Edgar; Cusack, Jeremy; Khoo, MinSheng; Chey, Vun K.; Peni, Maria; Ewers, Robert M.
2015-04-01
Despite a large increase in the area of selectively logged tropical forest worldwide, the carbon stored in deadwood across a tropical forest degradation gradient at the landscape scale remains poorly documented. Many carbon stock studies have either focused exclusively on live standing biomass or have been carried out in primary forests unaffected by logging, despite the fact that coarse woody debris (deadwood ≥10 cm in diameter) can contain a significant portion of a forest’s carbon stock. We used a field-based assessment to quantify how the relative contribution of deadwood to total above-ground carbon stock changes across a disturbance gradient, from unlogged old-growth forest, to severely degraded twice-logged forest, to oil palm plantation. We surveyed 193 vegetation plots (25 × 25 m each), equating to a survey area of >12 ha of tropical humid forest, located within the Stability of Altered Forest Ecosystems Project area in Sabah, Malaysia. Our results indicate that significant amounts of carbon are stored in deadwood across forest stands. Live tree carbon storage decreased exponentially with increasing forest degradation 7-10 years after logging, while deadwood accounted for >50% of above-ground carbon stocks in salvage-logged forest stands, more than twice the proportion commonly assumed in the literature. This carbon will be released as decomposition proceeds. Given the high rates of deforestation and degradation presently occurring in Southeast Asia, our findings have important implications for the calculation of current carbon stocks and sources resulting from human modification of tropical forests. Assuming similar patterns prevail throughout the tropics, our data point to a significant challenge for calculating global carbon fluxes, as selectively logged forests now represent more than one third of all standing tropical humid forests worldwide.
Calculating tissue shear modulus and pressure by 2D log-elastographic methods
NASA Astrophysics Data System (ADS)
McLaughlin, Joyce R.; Zhang, Ning; Manduca, Armando
2010-08-01
Shear modulus imaging, often called elastography, enables detection and characterization of tissue abnormalities. In this paper the data are two displacement components obtained from successive MR or ultrasound data sets acquired while the tissue is excited mechanically. A 2D plane strain elastic model is assumed to govern the 2D displacement, u. The shear modulus, μ, is unknown. Whether or not the first Lamé parameter, λ, is known, the pressure p = λ∇·u that appears in the plane strain model cannot be measured; it is unreliably computed from measured data and can be shown to be an order-one quantity in units of kPa. So here we present a 2D log-elastographic inverse algorithm that (1) simultaneously reconstructs the shear modulus, μ, and the pressure, p, which together satisfy a first-order partial differential equation system, with the goal of imaging μ; (2) controls potential exponential growth in the numerical error; and (3) reliably reconstructs the quantity p in the inverse algorithm, as compared to the same quantity computed with a forward algorithm. This work generalizes the log-elastographic algorithm in Lin et al (2009 Inverse Problems 25), which uses one displacement component, is derived assuming that the component satisfies the wave equation, and is tested on synthetic data computed with the wave equation model. The 2D log-elastographic algorithm is tested on 2D synthetic data and on 2D in vivo data from the Mayo Clinic. We also exhibit examples showing that the 2D log-elastographic algorithm improves the quality of the recovered images as compared to the log-elastographic and direct inversion algorithms.
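While the paper's algorithm is two-dimensional, the core idea of marching the logarithm of the shear modulus can be illustrated in a one-dimensional toy problem. For time-harmonic motion, (μU')' = -ρω²U, so a measured displacement U turns the modulus into the solution of a first-order ODE in x; integrating L = log μ rather than μ keeps the reconstruction positive and turns multiplicative error growth into additive error in L. Everything below (geometry, parameters, the Gaussian stiffness inclusion) is an illustrative assumption, not the authors' setup or their 2D algorithm.

```python
import numpy as np

rho, w = 1.0, 1.0             # density and angular frequency (toy values)
n = 4001
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
mu_true = 1.0 + 0.5 * np.exp(-((x - 0.5) / 0.1) ** 2)   # stiff inclusion

# Forward problem: integrate U' = s/mu, s' = -rho w^2 U, where s = mu U'
U = np.empty(n); s = np.empty(n)
U[0], s[0] = 0.0, mu_true[0] * 1.0        # U(0) = 0, U'(0) = 1
for j in range(n - 1):
    U[j + 1] = U[j] + dx * s[j] / mu_true[j]
    s[j + 1] = s[j] - dx * rho * w**2 * U[j]

# "Measured" derivatives by central differences
Ux = np.gradient(U, dx)
Uxx = np.gradient(Ux, dx)

# Inverse problem: expanding (mu U')' = -rho w^2 U gives
#   L' = (-rho w^2 U e^{-L} - U'') / U',  L = log(mu),
# marched from a known boundary value of the modulus.
L = np.empty(n)
L[0] = np.log(mu_true[0])
for j in range(n - 1):
    dL = (-rho * w**2 * U[j] * np.exp(-L[j]) - Uxx[j]) / Ux[j]
    L[j + 1] = L[j] + dx * dL

print(f"max |mu_rec - mu_true| = {np.max(np.abs(np.exp(L) - mu_true)):.3e}")
```

The frequency is chosen low enough that U' stays away from zero on the domain; near zeros of U' the marching direction must be handled with upwinding, which is one of the issues the 2D algorithm addresses.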