NASA Technical Reports Server (NTRS)
Chao, Luen-Yuan; Shetty, Dinesh K.
1992-01-01
Statistical analysis and correlation between pore-size distribution and fracture strength distribution using the theory of extreme-value statistics is presented for a sintered silicon nitride. The pore-size distribution on a polished surface of this material was characterized using an automatic optical image analyzer. The distribution measured on the two-dimensional plane surface was transformed to a population (volume) distribution using the Schwartz-Saltykov diameter method. The population pore-size distribution and the distribution of the pore size at the fracture origin were correlated by extreme-value statistics. Fracture strength distribution was then predicted from the extreme-value pore-size distribution, using a linear elastic fracture mechanics model of an annular crack around a pore and the fracture toughness of the ceramic. The predicted strength distribution was in good agreement with strength measurements in bending. In particular, the extreme-value statistics analysis explained the nonlinear trend in the linearized Weibull plot of measured strengths without postulating a lower-bound strength.
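The chain of reasoning in this abstract (population pore-size distribution, extreme-value statistics for the largest pore, and a fracture-mechanics link to strength) can be illustrated with a short simulation. The sketch below is not the authors' model: the lognormal pore-size population, the number of pores per specimen, the geometry factor Y, and the toughness value are illustrative assumptions, and the flaw is crudely taken to have the size of the largest pore rather than an annular crack of calibrated depth.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative inputs (assumed, not from the paper).
K_Ic = 6.0e6        # fracture toughness, Pa*sqrt(m) (~6 MPa sqrt(m) for silicon nitride)
Y = 2.0 / np.pi     # assumed crack geometry factor
n_pores = 500       # pores per effective stressed volume
n_spec = 10_000     # simulated specimens

# Hypothetical population (volume) pore-size distribution: lognormal radii in metres.
pore_radii = rng.lognormal(mean=np.log(5e-6), sigma=0.5, size=(n_spec, n_pores))

# Extreme-value step: strength is controlled by the largest pore in each specimen.
a_max = pore_radii.max(axis=1)

# LEFM link: sigma_f = K_Ic / (Y * sqrt(pi * a)), with a ~ largest pore radius.
strength = K_Ic / (Y * np.sqrt(np.pi * a_max))

# Coordinates of a linearized Weibull plot: ln(sigma_f) vs ln(-ln(1 - F)).
s = np.sort(strength)
F = (np.arange(1, n_spec + 1) - 0.5) / n_spec
weibull_x, weibull_y = np.log(s), np.log(-np.log(1.0 - F))
print("median simulated strength: %.0f MPa" % (np.median(strength) / 1e6))
```

Because the largest-pore size follows an extreme-value law rather than the population law, the resulting Weibull plot is generally curved, which is the nonlinear trend the abstract refers to.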
NASA Technical Reports Server (NTRS)
Goldhirsh, J.
1982-01-01
The first absolute rain fade distribution method described establishes absolute fade statistics at a given site by means of a sampled radar data base. The second method extrapolates absolute fade statistics from one location to another, given simultaneously measured fade and rain rate statistics at the former. Both methods employ similar conditional fade statistic concepts and long term rain rate distributions. Probability deviations in the 2-19% range, with an 11% average, were obtained upon comparison of measured and predicted levels at given attenuations. The extrapolation of fade distributions to other locations at 28 GHz showed very good agreement with measured data at three sites located in the continental temperate region.
Nonparametric Bayesian predictive distributions for future order statistics
Richard A. Johnson; James W. Evans; David W. Green
1999-01-01
We derive the predictive distribution for a specified order statistic, determined from a future random sample, under a Dirichlet process prior. Two variants of the approach are treated and some limiting cases studied. A practical application to monitoring the strength of lumber is discussed including choices of prior expectation and comparisons made to a Bayesian...
Austin, Peter C; Steyerberg, Ewout W
2012-06-20
When outcomes are binary, the c-statistic (equivalent to the area under the Receiver Operating Characteristic curve) is a standard measure of the predictive accuracy of a logistic regression model. An analytical expression was derived under the assumption that a continuous explanatory variable follows a normal distribution in those with and without the condition. We then conducted an extensive set of Monte Carlo simulations to examine whether the expressions derived under the assumption of binormality allowed for accurate prediction of the empirical c-statistic when the explanatory variable followed a normal distribution in the combined sample of those with and without the condition. We also examined the accuracy of the predicted c-statistic when the explanatory variable followed a gamma, log-normal or uniform distribution in the combined sample of those with and without the condition. Under the assumption of binormality with equality of variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the product of the standard deviation of the normal components (reflecting more heterogeneity) and the log-odds ratio (reflecting larger effects). Under the assumption of binormality with unequal variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the standardized difference of the explanatory variable in those with and without the condition. In our Monte Carlo simulations, we found that these expressions allowed for reasonably accurate prediction of the empirical c-statistic when the distribution of the explanatory variable was normal, gamma, log-normal, and uniform in the entire sample of those with and without the condition. The discriminative ability of a continuous explanatory variable cannot be judged by its odds ratio alone, but always needs to be considered in relation to the heterogeneity of the population.
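As a rough sketch of the binormal result described here, the analytic c-statistic under equal variances is the standard normal CDF of the product of the common standard deviation and the log-odds ratio divided by sqrt(2); this can be checked against an empirical c-statistic from simulated data. All numerical values below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm, mannwhitneyu

rng = np.random.default_rng(1)

# Binormal setup with equal variances (illustrative values).
mu0, mu1, sigma = 0.0, 1.0, 2.0           # explanatory variable in controls / cases
beta = (mu1 - mu0) / sigma**2             # log-odds ratio implied by binormality

# Analytic c-statistic: Phi(sigma * beta / sqrt(2)) = Phi((mu1 - mu0) / (sigma * sqrt(2))).
c_analytic = norm.cdf(sigma * beta / np.sqrt(2))

# Empirical c-statistic via the Mann-Whitney U statistic (probability a case exceeds a control).
n = 20_000
x0 = rng.normal(mu0, sigma, n)
x1 = rng.normal(mu1, sigma, n)
u = mannwhitneyu(x1, x0, alternative="greater").statistic
c_empirical = u / (n * n)

print(f"analytic c = {c_analytic:.3f}, empirical c = {c_empirical:.3f}")
```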
Limitations of the Porter-Thomas distribution
NASA Astrophysics Data System (ADS)
Weidenmüller, Hans A.
2017-12-01
Data on the distribution of reduced partial neutron widths and on the distribution of total gamma decay widths disagree with the Porter-Thomas distribution (PTD) for reduced partial widths or with predictions of the statistical model. We recall why the disagreement is important: The PTD is a direct consequence of the orthogonal invariance of the Gaussian Orthogonal Ensemble (GOE) of random matrices. The disagreement is reviewed. Two possible causes for violation of orthogonal invariance of the GOE are discussed, and their consequences explored. The disagreement of the distribution of total gamma decay widths with theoretical predictions cannot be blamed on the statistical model.
Statistical distribution of mechanical properties for three graphite-epoxy material systems
NASA Technical Reports Server (NTRS)
Reese, C.; Sorem, J., Jr.
1981-01-01
Graphite-epoxy composites are playing an increasing role as viable alternative materials in structural applications, necessitating thorough investigation into the predictability and reproducibility of their material strength properties. This investigation was concerned with tension, compression, and short beam shear coupon testing of large samples from three different material suppliers to determine their statistical strength behavior. Statistical results indicate that a two-parameter Weibull distribution model provides better overall characterization of material behavior for the graphite-epoxy systems tested than does the standard Normal distribution model that is employed for most design work. While either a Weibull or Normal distribution model provides adequate predictions for average strength values, the Weibull model provides better characterization in the lower tail region where the predictions are of maximum design interest. The two sets of the same material were found to have essentially the same material properties, indicating that repeatability can be achieved.
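A minimal sketch of the kind of comparison described, fitting a two-parameter Weibull and a normal distribution to coupon strengths and contrasting their lower-tail predictions; the strength data here are simulated placeholders rather than the tested graphite-epoxy systems.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical coupon strengths (MPa); in practice use measured tension/compression data.
strengths = stats.weibull_min.rvs(c=15.0, scale=600.0, size=60, random_state=rng)

# Two-parameter Weibull fit (location fixed at zero) and normal fit.
shape, loc, scale = stats.weibull_min.fit(strengths, floc=0.0)
mu, sd = stats.norm.fit(strengths)

# Compare predicted strength at a low design probability (lower tail, e.g. 1st percentile).
p = 0.01
print("Weibull 1%% strength: %.1f MPa" % stats.weibull_min.ppf(p, shape, loc=0.0, scale=scale))
print("Normal  1%% strength: %.1f MPa" % stats.norm.ppf(p, mu, sd))
```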
The Effect of General Statistical Fiber Misalignment on Predicted Damage Initiation in Composites
NASA Technical Reports Server (NTRS)
Bednarcyk, Brett A.; Aboudi, Jacob; Arnold, Steven M.
2014-01-01
A micromechanical method is employed to predict the behavior of unidirectional composites in which the fiber orientation can possess various statistical misalignment distributions. The method relies on the probability-weighted averaging of the appropriate concentration tensor, which is established by the micromechanical procedure. This approach provides access to the local field quantities throughout the constituents, from which initiation of damage in the composite can be predicted. In contrast, a typical macromechanical procedure can determine the effective composite elastic properties in the presence of statistical fiber misalignment, but cannot provide the local fields. Fully random fiber distribution is presented as a special case using the proposed micromechanical method. Results are given that illustrate the effects of various amounts of fiber misalignment in terms of the standard deviations of in-plane and out-of-plane misalignment angles, where normal distributions have been employed. Damage initiation envelopes, local fields, effective moduli, and strengths are predicted for polymer and ceramic matrix composites with given normal distributions of misalignment angles, as well as fully random fiber orientation.
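The paper's probability-weighted averaging operates on concentration tensors from the micromechanical model; as a much simpler scalar analogue, the sketch below averages the classical off-axis modulus of a unidirectional ply over an assumed normal misalignment distribution. The lamina properties and misalignment standard deviation are hypothetical.

```python
import numpy as np

# Hypothetical lamina properties (GPa) for a carbon/epoxy ply.
E1, E2, G12, nu12 = 140.0, 10.0, 5.0, 0.3

def off_axis_modulus(theta):
    """Classical off-axis Young's modulus E_x(theta) of a unidirectional ply."""
    c, s = np.cos(theta), np.sin(theta)
    inv_Ex = c**4 / E1 + s**4 / E2 + (1.0 / G12 - 2.0 * nu12 / E1) * c**2 * s**2
    return 1.0 / inv_Ex

# Normal misalignment distribution (standard deviation in degrees, assumed).
std_deg = 3.0
theta = np.radians(np.random.default_rng(3).normal(0.0, std_deg, 100_000))

# Probability-weighted (Monte Carlo) average over the misalignment distribution.
E_eff = off_axis_modulus(theta).mean()
print(f"aligned E1 = {E1:.1f} GPa, misaligned average E_x = {E_eff:.1f} GPa")
```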
Tomáš Václavík; Ross K. Meentemeyer
2009-01-01
Species distribution models (SDMs) based on statistical relationships between occurrence data and underlying environmental conditions are increasingly used to predict spatial patterns of biological invasions and prioritize locations for early detection and control of invasion outbreaks. However, invasive species distribution models (iSDMs) face special challenges...
Statistical prediction with Kanerva's sparse distributed memory
NASA Technical Reports Server (NTRS)
Rogers, David
1989-01-01
A new viewpoint of the processing performed by Kanerva's sparse distributed memory (SDM) is presented. In conditions of near- or over-capacity, where the associative-memory behavior of the model breaks down, the processing performed by the model can be interpreted as that of a statistical predictor. Mathematical results are presented which serve as the framework for a new statistical viewpoint of sparse distributed memory and for which the standard formulation of SDM is a special case. This viewpoint suggests possible enhancements to the SDM model, including a procedure for improving the predictiveness of the system based on Holland's work with genetic algorithms, and a method for improving the capacity of SDM even when used as an associative memory.
NASA Astrophysics Data System (ADS)
Garrido, Marta Isabel; Teng, Chee Leong James; Taylor, Jeremy Alexander; Rowe, Elise Genevieve; Mattingley, Jason Brett
2016-06-01
The ability to learn about regularities in the environment and to make predictions about future events is fundamental for adaptive behaviour. We have previously shown that people can implicitly encode statistical regularities and detect violations therein, as reflected in neuronal responses to unpredictable events that carry a unique prediction error signature. In the real world, however, learning about regularities will often occur in the context of competing cognitive demands. Here we asked whether learning of statistical regularities is modulated by concurrent cognitive load. We compared electroencephalographic metrics associated with responses to pure-tone sounds with frequencies sampled from narrow or wide Gaussian distributions. We showed that outliers evoked a larger response than those in the centre of the stimulus distribution (i.e., an effect of surprise) and that this difference was greater for physically identical outliers in the narrow than in the broad distribution. These results demonstrate an early neurophysiological marker of the brain's ability to implicitly encode complex statistical structure in the environment. Moreover, we manipulated concurrent cognitive load by having participants perform a visual working memory task while listening to these streams of sounds. We again observed greater prediction error responses in the narrower distribution under both low and high cognitive load. Furthermore, there was no reliable reduction in prediction error magnitude under high relative to low cognitive load. Our findings suggest that statistical learning is not a capacity-limited process, and that it proceeds automatically even when cognitive resources are taxed by concurrent demands.
A hybrid model for predicting carbon monoxide from vehicular exhausts in urban environments
NASA Astrophysics Data System (ADS)
Gokhale, Sharad; Khare, Mukesh
Several deterministic air quality models evaluate and predict the frequently occurring pollutant concentrations well but, in general, are incapable of predicting the 'extreme' concentrations. In contrast, statistical distribution models overcome this limitation of the deterministic models and predict the 'extreme' concentrations. However, environmental damage is caused both by the extremes and by the sustained average concentrations of pollutants. Hence, a model should predict not only the 'extreme' ranges but also the 'middle' ranges of pollutant concentrations, i.e. the entire range. Hybrid modelling is one technique that estimates/predicts the 'entire range' of the distribution of pollutant concentrations by combining deterministic models with suitable statistical distribution models (Jakeman et al., 1988). In the present paper, a hybrid model has been developed to predict the carbon monoxide (CO) concentration distributions at one of the traffic intersections, Income Tax Office (ITO), in Delhi, where the traffic is heterogeneous in nature, consisting of light vehicles, heavy vehicles, three-wheelers (auto rickshaws) and two-wheelers (scooters, motorcycles, etc.), and the meteorology is 'tropical'. The model combines the general finite line source model (GFLSM) as its deterministic component and the log-logistic distribution (LLD) model as its statistical component. The hybrid (GFLSM-LLD) model is then applied at the ITO intersection. The results show that the hybrid model predictions match the observed CO concentration data within the 5-99 percentile range. The model is further validated at a different street location, the Sirifort roadway. The validation results show that the model predicts CO concentrations fairly well (d = 0.91) in the 10-95 percentile range. A regulatory compliance analysis is also developed to estimate the probability that hourly CO concentrations exceed the National Ambient Air Quality Standards (NAAQS) of India.
Analysis of thrips distribution: application of spatial statistics and Kriging
John Aleong; Bruce L. Parker; Margaret Skinner; Diantha Howard
1991-01-01
Kriging is a statistical technique that provides predictions for spatially and temporally correlated data. Observations of thrips distribution and density in Vermont soils are made in both space and time. Traditional statistical analysis of such data assumes that the counts taken over space and time are independent, which is not necessarily true. Therefore, to analyze...
Multiplicative Modeling of Children's Growth and Its Statistical Properties
NASA Astrophysics Data System (ADS)
Kuninaka, Hiroto; Matsushita, Mitsugu
2014-03-01
We develop a numerical growth model that can predict the statistical properties of the height distribution of Japanese children. Our previous studies have clarified that the height distribution of schoolchildren shows a transition from the lognormal distribution to the normal distribution during puberty. In this study, we demonstrate by simulation that the transition occurs owing to the variability of the onset of puberty.
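The general mechanism, multiplicative growth producing a lognormal distribution and a later additive stage pushing it toward normality, can be illustrated with a toy simulation; this is not the authors' model, and all growth parameters below are invented for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 50_000

# Childhood: many small multiplicative growth factors -> lognormal heights (CLT in log space).
heights = 50.0 * np.exp(rng.normal(0.0, 0.02, size=(n, 100)).sum(axis=1))

# Pubertal/adult stage: additive increments pull the distribution toward a normal shape.
heights_adult = heights + rng.normal(60.0, 5.0, size=n)

for label, x in [("childhood (multiplicative)", heights), ("adult (plus additive)", heights_adult)]:
    print(f"{label}: mean = {x.mean():.1f} cm, skewness = {stats.skew(x):.3f}")
```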
Hao, Chen; LiJun, Chen; Albright, Thomas P.
2007-01-01
Invasive exotic species pose a growing threat to the economy, public health, and ecological integrity of nations worldwide. Explaining and predicting the spatial distribution of invasive exotic species is of great importance to prevention and early warning efforts. We are investigating the potential distribution of invasive exotic species, the environmental factors that influence these distributions, and the ability to predict them using statistical and information-theoretic approaches. For some species, detailed presence/absence occurrence data are available, allowing the use of a variety of standard statistical techniques. However, for most species, absence data are not available. Presented with the challenge of developing a model based on presence-only information, we developed an improved logistic regression approach using Information Theory and Frequency Statistics to produce a relative suitability map. In this paper we generated a variety of predicted distributions of ragweed (Ambrosia artemisiifolia L.) from logistic regression models applied to herbarium specimen location data and a suite of GIS layers including climatic, topographic, and land cover information. Our logistic regression model was selected using Akaike's Information Criterion (AIC) from a suite of ecologically reasonable predictor variables. Based on the results, we provide a new Frequency Statistical method to compartmentalize habitat suitability in the native range. Finally, we used the model and the compartmentalized criterion developed in the native range to "project" a potential distribution onto the exotic range and build habitat-suitability maps. © Science in China Press 2007.
Using classification tree analysis to predict oak wilt distribution in Minnesota and Texas
Marla c. Downing; Vernon L. Thomas; Jennifer Juzwik; David N. Appel; Robin M. Reich; Kim Camilli
2008-01-01
We developed a methodology and compared results for predicting the potential distribution of Ceratocystis fagacearum (causal agent of oak wilt), in both Anoka County, MN, and Fort Hood, TX. The Potential Distribution of Oak Wilt (PDOW) utilizes a binary classification tree statistical technique that incorporates: geographical information systems (GIS...
NASA Astrophysics Data System (ADS)
Pernot, Pascal; Savin, Andreas
2018-06-01
Benchmarking studies in computational chemistry use reference datasets to assess the accuracy of a method through error statistics. The commonly used error statistics, such as the mean signed and mean unsigned errors, do not inform end-users on the expected amplitude of prediction errors attached to these methods. We show that, because the distributions of model errors are neither normal nor zero-centered, these error statistics cannot be used to infer prediction error probabilities. To overcome this limitation, we advocate for the use of more informative statistics, based on the empirical cumulative distribution function of unsigned errors, namely, (1) the probability for a new calculation to have an absolute error below a chosen threshold and (2) the maximal amplitude of errors one can expect with a chosen high confidence level. Those statistics are also shown to be well suited for benchmarking and ranking studies. Moreover, the standard error on all benchmarking statistics depends on the size of the reference dataset. Systematic publication of these standard errors would be very helpful to assess the statistical reliability of benchmarking conclusions.
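The two statistics advocated here are straightforward to compute from a vector of benchmark errors; the sketch below uses a fabricated, skewed error sample and a bootstrap to show how their standard errors depend on the size of the reference set. The threshold and confidence level are arbitrary choices.

```python
import numpy as np

# Hypothetical benchmark errors (e.g. kcal/mol): neither normal nor zero-centred.
rng = np.random.default_rng(5)
errors = rng.gamma(shape=2.0, scale=0.8, size=200) - 0.5
abs_err = np.abs(errors)

# (1) Probability that a new calculation has an absolute error below a chosen threshold.
threshold = 1.0
p_below = (abs_err < threshold).mean()

# (2) Maximal error amplitude expected with high confidence (95th percentile of |error|).
q95 = np.quantile(abs_err, 0.95)

# Standard errors of both statistics via bootstrap (the size of the reference set matters).
boot = rng.choice(abs_err, size=(2000, abs_err.size), replace=True)
se_p = (boot < threshold).mean(axis=1).std()
se_q = np.quantile(boot, 0.95, axis=1).std()
print(f"P(|err|<{threshold}) = {p_below:.2f} +/- {se_p:.2f}; Q95 = {q95:.2f} +/- {se_q:.2f}")
```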
Filter Tuning Using the Chi-Squared Statistic
NASA Technical Reports Server (NTRS)
Lilly-Salkowski, Tyler B.
2017-01-01
This paper examines the use of the Chi-square statistic as a means of evaluating filter performance. The goal of the process is to characterize the filter performance in the metric of covariance realism. The Chi-squared statistic is the value calculated to determine the realism of a covariance based on the prediction accuracy and the covariance values at a given point in time. Once calculated, it is the distribution of this statistic that provides insight on the accuracy of the covariance. The process of tuning an Extended Kalman Filter (EKF) for Aqua and Aura support is described, including examination of the measurement errors of available observation types, and methods of dealing with potentially volatile atmospheric drag modeling. Predictive accuracy and the distribution of the Chi-squared statistic, calculated from EKF solutions, are assessed.
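A hedged sketch of the covariance-realism idea: for each comparison epoch, form the Mahalanobis statistic eps = r^T P^-1 r from the prediction error r and the predicted covariance P; if the covariance is realistic, eps should follow a chi-squared distribution with degrees of freedom equal to the state dimension. The covariance values and the 20% optimism injected below are assumptions for illustration, not Aqua/Aura filter output.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
dof = 3                      # e.g. three position-error components
n = 500                      # number of prediction/truth comparison points

# Hypothetical predicted covariance and simulated "true" errors that are 20% larger in sigma.
P = np.diag([100.0, 64.0, 25.0])            # predicted covariance (m^2), assumed
true_cov = 1.2**2 * P                       # filter is optimistic by 20%
r = rng.multivariate_normal(np.zeros(dof), true_cov, size=n)

# Mahalanobis / chi-squared statistic for each epoch.
Pinv = np.linalg.inv(P)
eps = np.einsum("ij,jk,ik->i", r, Pinv, r)

# Realism check: compare the empirical distribution of eps with chi2(dof).
print("mean eps = %.2f (expected %.1f if the covariance is realistic)" % (eps.mean(), dof))
ks = stats.kstest(eps, "chi2", args=(dof,))
print("KS test vs chi2(%d): p = %.3g" % (dof, ks.pvalue))
```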
Three Strategies for the Critical Use of Statistical Methods in Psychological Research
ERIC Educational Resources Information Center
Campitelli, Guillermo; Macbeth, Guillermo; Ospina, Raydonal; Marmolejo-Ramos, Fernando
2017-01-01
We present three strategies to replace the null hypothesis statistical significance testing approach in psychological research: (1) visual representation of cognitive processes and predictions, (2) visual representation of data distributions and choice of the appropriate distribution for analysis, and (3) model comparison. The three strategies…
Snell, Kym Ie; Ensor, Joie; Debray, Thomas Pa; Moons, Karel Gm; Riley, Richard D
2017-01-01
If individual participant data are available from multiple studies or clusters, then a prediction model can be externally validated multiple times. This allows the model's discrimination and calibration performance to be examined across different settings. Random-effects meta-analysis can then be used to quantify overall (average) performance and heterogeneity in performance. This typically assumes a normal distribution of 'true' performance across studies. We conducted a simulation study to examine this normality assumption for various performance measures relating to a logistic regression prediction model. We simulated data across multiple studies with varying degrees of variability in baseline risk or predictor effects and then evaluated the shape of the between-study distribution in the C-statistic, calibration slope, calibration-in-the-large, and E/O statistic, and possible transformations thereof. We found that a normal between-study distribution was usually reasonable for the calibration slope and calibration-in-the-large; however, the distributions of the C-statistic and E/O were often skewed across studies, particularly in settings with large variability in the predictor effects. Normality was vastly improved when using the logit transformation for the C-statistic and the log transformation for E/O, and therefore we recommend these scales to be used for meta-analysis. An illustrated example is given using a random-effects meta-analysis of the performance of QRISK2 across 25 general practices.
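A compact sketch of the recommended workflow: transform per-study c-statistics to the logit scale, run a DerSimonian-Laird random-effects meta-analysis, and back-transform the pooled value. The per-study c-statistics and standard errors below are hypothetical, and the logit-scale variances come from a simple delta-method approximation.

```python
import numpy as np

# Hypothetical per-study c-statistics and their standard errors (e.g. from external validations).
c = np.array([0.72, 0.68, 0.80, 0.75, 0.66])
se_c = np.array([0.02, 0.03, 0.02, 0.04, 0.03])

# Logit transform; delta method for the variance on the logit scale.
theta = np.log(c / (1 - c))
var = (se_c / (c * (1 - c)))**2

# DerSimonian-Laird random-effects meta-analysis.
w = 1.0 / var
theta_fixed = np.sum(w * theta) / np.sum(w)
Q = np.sum(w * (theta - theta_fixed)**2)
k = len(c)
tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
w_star = 1.0 / (var + tau2)
theta_re = np.sum(w_star * theta) / np.sum(w_star)

# Back-transform the pooled logit-c to the c-statistic scale.
c_pooled = 1.0 / (1.0 + np.exp(-theta_re))
print(f"pooled c-statistic = {c_pooled:.3f}, tau^2 (logit scale) = {tau2:.4f}")
```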
Quantifying predictability in a model with statistical features of the atmosphere
Kleeman, Richard; Majda, Andrew J.; Timofeyev, Ilya
2002-01-01
The Galerkin truncated inviscid Burgers equation has recently been shown by the authors to be a simple model with many degrees of freedom, with many statistical properties similar to those occurring in dynamical systems relevant to the atmosphere. These properties include long time-correlated, large-scale modes of low-frequency variability and short time-correlated "weather modes" at smaller scales. The correlation scaling in the model extends over several decades and may be explained by a simple theory. Here a thorough analysis of the nature of predictability in the idealized system is developed by using a theoretical framework developed by one of the authors (R.K.). This analysis is based on a relative entropy functional that has been shown elsewhere by one of the authors to measure the utility of statistical predictions precisely. The analysis is facilitated by the fact that most relevant probability distributions are approximately Gaussian if the initial conditions are assumed to be so. Rather surprisingly this holds for both the equilibrium (climatological) and nonequilibrium (prediction) distributions. We find that in most cases the absolute difference in the first moments of these two distributions (the "signal" component) is the main determinant of predictive utility variations. Contrary to conventional belief in the ensemble prediction area, the dispersion of prediction ensembles is generally of secondary importance in accounting for variations in utility associated with different initial conditions. This conclusion has potentially important implications for practical weather prediction, where traditionally most attention has focused on dispersion and its variability. PMID:12429863
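For Gaussian prediction and climatology distributions, the relative entropy used here as a measure of predictive utility separates into a dispersion term and a signal term (the difference in means). The univariate sketch below illustrates that decomposition with made-up forecast and climatological moments.

```python
import numpy as np

def relative_entropy_gaussian(mu_p, var_p, mu_q, var_q):
    """KL divergence D(p||q) between two univariate Gaussians, split into
    dispersion and signal components (p = prediction, q = climatology)."""
    dispersion = 0.5 * (np.log(var_q / var_p) + var_p / var_q - 1.0)
    signal = 0.5 * (mu_p - mu_q) ** 2 / var_q
    return dispersion + signal, signal, dispersion

# Hypothetical climatological and forecast moments.
mu_clim, var_clim = 0.0, 1.0
mu_fcst, var_fcst = 0.8, 0.7          # ensemble mean shifted, spread reduced

total, signal, dispersion = relative_entropy_gaussian(mu_fcst, var_fcst, mu_clim, var_clim)
print(f"utility = {total:.3f} nats (signal {signal:.3f}, dispersion {dispersion:.3f})")
```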
Heavy residues from very mass asymmetric heavy ion reactions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hanold, Karl Alan
1994-08-01
The isotopic production cross sections and momenta of all residues with nuclear charge (Z) greater than 39 from the reaction of 26, 40, and 50 MeV/nucleon 129Xe + Be, C, and Al were measured. The isotopic cross sections, the momentum distribution for each isotope, and the cross section as a function of nuclear charge and momentum are presented here. The new cross sections are consistent with previous measurements of the cross sections from similar reaction systems. The shape of the cross section distribution, when considered as a function of Z and velocity, was found to be qualitatively consistent with that expected from an incomplete fusion reaction mechanism. An incomplete fusion model coupled to a statistical decay model is able to reproduce many features of these reactions: the shapes of the elemental cross section distributions, the emission velocity distributions for the intermediate mass fragments, and the Z versus velocity distributions. This model gives a less satisfactory prediction of the momentum distribution for each isotope. A very different model based on the Boltzmann-Nordheim-Vlasov equation, which was also coupled to a statistical decay model, reproduces many features of these reactions: the shapes of the elemental cross section distributions, the intermediate mass fragment emission velocity distributions, and the Z versus momentum distributions. Both model calculations overestimate the average mass for each element by two mass units and underestimate the isotopic and isobaric widths of the experimental distributions. It is shown that the predicted average mass for each element can be brought into agreement with the data by small, but systematic, variation of the particle emission barriers used in the statistical model. The predicted isotopic and isobaric widths of the cross section distributions cannot be brought into agreement with the experimental data using reasonable parameters for the statistical model.
The effects of estimation of censoring, truncation, transformation and partial data vectors
NASA Technical Reports Server (NTRS)
Hartley, H. O.; Smith, W. B.
1972-01-01
The purpose of this research was to attack statistical problems concerning the estimation of distributions for purposes of predicting and measuring assembly performance as it appears in biological and physical situations. Various statistical procedures were proposed to address problems of this sort, that is, to produce the statistical distributions of the outcomes of biological and physical situations which employ characteristics measured on constituent parts. The techniques are described.
On Predictability of System Anomalies in Real World
2011-08-01
distributed system SETI@home [44]. Different from the above work, this work focuses on quantifying the predictability of real-world system anomalies. V...J.-M. Vincent, and D. Anderson, "Mining for statistical models of availability in large-scale distributed systems: An empirical study of SETI@home," in Proc. of MASCOTS, Sept. 2009.
Discriminative Random Field Models for Subsurface Contamination Uncertainty Quantification
NASA Astrophysics Data System (ADS)
Arshadi, M.; Abriola, L. M.; Miller, E. L.; De Paolis Kaluza, C.
2017-12-01
Application of flow and transport simulators for prediction of the release, entrapment, and persistence of dense non-aqueous phase liquids (DNAPLs) and associated contaminant plumes is a computationally intensive process that requires specification of a large number of material properties and hydrologic/chemical parameters. Given its computational burden, this direct simulation approach is particularly ill-suited for quantifying both the expected performance and uncertainty associated with candidate remediation strategies under real field conditions. Prediction uncertainties primarily arise from limited information about contaminant mass distributions, as well as the spatial distribution of subsurface hydrologic properties. Application of direct simulation to quantify uncertainty would, thus, typically require simulating multiphase flow and transport for a large number of permeability and release scenarios to collect statistics associated with remedial effectiveness, a computationally prohibitive process. The primary objective of this work is to develop and demonstrate a methodology that employs measured field data to produce equi-probable stochastic representations of a subsurface source zone that capture the spatial distribution and uncertainty associated with key features that control remediation performance (i.e., permeability and contamination mass). Here we employ probabilistic models known as discriminative random fields (DRFs) to synthesize stochastic realizations of initial mass distributions consistent with known, and typically limited, site characterization data. Using a limited number of full scale simulations as training data, a statistical model is developed for predicting the distribution of contaminant mass (e.g., DNAPL saturation and aqueous concentration) across a heterogeneous domain. Monte-Carlo sampling methods are then employed, in conjunction with the trained statistical model, to generate realizations conditioned on measured borehole data. Performance of the statistical model is illustrated through comparisons of generated realizations with the 'true' numerical simulations. Finally, we demonstrate how these realizations can be used to determine statistically optimal locations for further interrogation of the subsurface.
Jet Measurements for Development of Jet Noise Prediction Tools
NASA Technical Reports Server (NTRS)
Bridges, James E.
2006-01-01
The primary focus of my presentation is the development of the jet noise prediction code JeNo with most examples coming from the experimental work that drove the theoretical development and validation. JeNo is a statistical jet noise prediction code, based upon the Lilley acoustic analogy. Our approach uses time-average 2-D or 3-D mean and turbulent statistics of the flow as input. The output is source distributions and spectral directivity.
Multi-year slant path rain fade statistics at 28.56 and 19.04 GHz for Wallops Island, Virginia
NASA Technical Reports Server (NTRS)
Goldhirsh, J.
1979-01-01
Multiyear rain fade statistics at 28.56 GHz and 19.04 GHz were compiled for the region of Wallops Island, Virginia, covering the time periods 1 April 1977 through 31 March 1978 and 1 September 1978 through 31 August 1979. The 28.56 GHz attenuations were derived by monitoring the beacon signals from the COMSTAR geosynchronous satellite D2 during the first year and satellite D3 during the second year. Although 19.04 GHz beacons exist aboard these satellites, statistics at this frequency were predicted using the 28 GHz fade data, the measured rain rate distribution, and effective path length concepts. The prediction method was tested against radar-derived fade distributions and excellent agreement was noted. For example, the rms deviations between the predicted and test distributions were less than or equal to 0.2 dB, or 4%, at 19.04 GHz. The average ratio between the 28.56 GHz and 19.04 GHz fades was also derived for equal percentages of time, resulting in a factor of 2.1 with a standard deviation of 0.05.
Willits, Jon A.; Seidenberg, Mark S.; Saffran, Jenny R.
2014-01-01
What makes some words easy for infants to recognize, and other words difficult? We addressed this issue in the context of prior results suggesting that infants have difficulty recognizing verbs relative to nouns. In this work, we highlight the role played by the distributional contexts in which nouns and verbs occur. Distributional statistics predict that English nouns should generally be easier to recognize than verbs in fluent speech. However, there are situations in which distributional statistics provide similar support for verbs. The statistics for verbs that occur with the English morpheme –ing, for example, should facilitate verb recognition. In two experiments with 7.5- and 9.5-month-old infants, we tested the importance of distributional statistics for word recognition by varying the frequency of the contextual frames in which verbs occur. The results support the conclusion that distributional statistics are utilized by infant language learners and contribute to noun–verb differences in word recognition. PMID:24908342
Statistical procedures for evaluating daily and monthly hydrologic model predictions
Coffey, M.E.; Workman, S.R.; Taraba, J.L.; Fogle, A.W.
2004-01-01
The overall study objective was to evaluate the applicability of different qualitative and quantitative methods for comparing daily and monthly SWAT computer model hydrologic streamflow predictions to observed data, and to recommend statistical methods for use in future model evaluations. Statistical methods were tested using daily streamflows and monthly equivalent runoff depths. The statistical techniques included linear regression, Nash-Sutcliffe efficiency, nonparametric tests, t-test, objective functions, autocorrelation, and cross-correlation. None of the methods specifically applied to the non-normal distribution and dependence between data points for the daily predicted and observed data. Of the tested methods, median objective functions, sign test, autocorrelation, and cross-correlation were most applicable for the daily data. The robust coefficient of determination (CD*) and robust modeling efficiency (EF*) objective functions were the preferred methods for daily model results due to the ease of comparing these values with a fixed ideal reference value of one. Predicted and observed monthly totals were more normally distributed, and there was less dependence between individual monthly totals than was observed for the corresponding predicted and observed daily values. More statistical methods were available for comparing SWAT model-predicted and observed monthly totals. The 1995 monthly SWAT model predictions and observed data had a regression R2 of 0.70, a Nash-Sutcliffe efficiency of 0.41, and the t-test failed to reject the equal data means hypothesis. The Nash-Sutcliffe coefficient and the R2 coefficient were the preferred methods for monthly results due to the ability to compare these coefficients to a set ideal value of one.
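As a small illustration of the preferred monthly measures, the sketch below computes the Nash-Sutcliffe efficiency and R2 for a hypothetical set of observed and simulated monthly runoff depths (the numbers are placeholders, not SWAT output).

```python
import numpy as np

def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations (1 is a perfect fit)."""
    observed, predicted = np.asarray(observed, float), np.asarray(predicted, float)
    return 1.0 - np.sum((observed - predicted) ** 2) / np.sum((observed - observed.mean()) ** 2)

# Hypothetical monthly runoff depths (mm), for illustration only.
obs = np.array([30.1, 45.2, 60.0, 20.4, 10.9, 5.2, 3.8, 4.4, 12.7, 25.3, 33.0, 40.6])
sim = np.array([28.0, 50.1, 55.3, 25.0,  9.5, 6.0, 4.1, 5.0, 10.2, 22.8, 35.5, 43.0])

ns = nash_sutcliffe(obs, sim)
r2 = np.corrcoef(obs, sim)[0, 1] ** 2
print(f"Nash-Sutcliffe efficiency = {ns:.2f}, R2 = {r2:.2f}")
```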
Fienen, Michael N.; Selbig, William R.
2012-01-01
A new sample collection system was developed to improve the representation of sediment entrained in urban storm water by integrating water quality samples from the entire water column. The depth-integrated sampler arm (DISA) was able to mitigate sediment stratification bias in storm water, thereby improving the characterization of suspended-sediment concentration and particle size distribution at three independent study locations. Use of the DISA decreased variability, which improved statistical regression to predict particle size distribution using surrogate environmental parameters, such as precipitation depth and intensity. The performance of this statistical modeling technique was compared to results using traditional fixed-point sampling methods and was found to perform better. When environmental parameters can be used to predict particle size distributions, environmental managers have more options when characterizing concentrations, loads, and particle size distributions in urban runoff.
Constraints on the near-Earth asteroid obliquity distribution from the Yarkovsky effect
NASA Astrophysics Data System (ADS)
Tardioli, C.; Farnocchia, D.; Rozitis, B.; Cotto-Figueroa, D.; Chesley, S. R.; Statler, T. S.; Vasile, M.
2017-12-01
Aims: From light curve and radar data we know the spin axis of only 43 near-Earth asteroids. In this paper we attempt to constrain the spin axis obliquity distribution of near-Earth asteroids by leveraging the Yarkovsky effect and its dependence on an asteroid's obliquity. Methods: By modeling the physical parameters driving the Yarkovsky effect, we solve an inverse problem where we test different simple parametric obliquity distributions. Each distribution results in a predicted Yarkovsky effect distribution that we compare with a χ2 test to a dataset of 125 Yarkovsky estimates. Results: We find different obliquity distributions that are statistically satisfactory. In particular, among the considered models, the best-fit solution is a quadratic function, which only depends on two parameters, favors extreme obliquities consistent with the expected outcomes from the YORP effect, has a 2:1 ratio between retrograde and direct rotators, which is in agreement with theoretical predictions, and is statistically consistent with the distribution of known spin axes of near-Earth asteroids.
Optimal Predictions in Everyday Cognition: The Wisdom of Individuals or Crowds?
ERIC Educational Resources Information Center
Mozer, Michael C.; Pashler, Harold; Homaei, Hadjar
2008-01-01
Griffiths and Tenenbaum (2006) asked individuals to make predictions about the duration or extent of everyday events (e.g., cake baking times), and reported that predictions were optimal, employing Bayesian inference based on veridical prior distributions. Although the predictions conformed strikingly to statistics of the world, they reflect…
Lingner, Thomas; Kataya, Amr R. A.; Reumann, Sigrun
2012-01-01
We recently developed the first algorithms specifically for plants to predict proteins carrying peroxisome targeting signals type 1 (PTS1) from genome sequences.1 As validated experimentally, the prediction methods are able to correctly predict unknown peroxisomal Arabidopsis proteins and to infer novel PTS1 tripeptides. The high prediction performance is primarily determined by the large number and sequence diversity of the underlying positive example sequences, which were mainly derived from EST databases. However, a few constructs remained cytosolic in experimental validation studies, indicating sequencing errors in some ESTs. To identify erroneous sequences, we validated subcellular targeting of additional positive example sequences in the present study. Moreover, we analyzed the distribution of prediction scores separately for each orthologous group of PTS1 proteins, which generally resembled normal distributions with group-specific mean values. The cytosolic sequences commonly represented outliers of low prediction scores and were located at the very tail of a fitted normal distribution. Three statistical methods for identifying outliers were compared in terms of sensitivity and specificity. Their combined application allows elimination of erroneous ESTs from positive example data sets. This new post-validation method will further improve the prediction accuracy of both PTS1 and PTS2 protein prediction models for plants, fungi, and mammals. PMID:22415050
Fukaya, Keiichi; Kawamori, Ai; Osada, Yutaka; Kitazawa, Masumi; Ishiguro, Makio
2017-09-20
Women's basal body temperature (BBT) shows a periodic pattern associated with the menstrual cycle. Although this fact suggests that daily BBT time series can be useful for estimating the underlying phase state as well as for predicting the length of the current menstrual cycle, little attention has been paid to modeling BBT time series. In this study, we propose a state-space model that involves the menstrual phase as a latent state variable to explain the daily fluctuation of BBT and the menstruation cycle length. Conditional distributions of the phase are obtained by using sequential Bayesian filtering techniques. A predictive distribution of the next menstruation day can be derived based on this conditional distribution and the model, leading to a novel statistical framework that provides a sequentially updated prediction for the upcoming menstruation day. We applied this framework to a real data set of women's BBT and menstruation days and compared the prediction accuracy of the proposed method with that of previous methods, showing that the proposed method generally provides a better prediction. Because BBT can be obtained with relatively small cost and effort, the proposed method can be useful for women's health management. Potential extensions of this framework as the basis of modeling and predicting events that are associated with the menstrual cycles are discussed. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
Statistical analysis of multivariate atmospheric variables. [cloud cover
NASA Technical Reports Server (NTRS)
Tubbs, J. D.
1979-01-01
Topics covered include: (1) estimation in discrete multivariate distributions; (2) a procedure to predict cloud cover frequencies in the bivariate case; (3) a program to compute conditional bivariate normal parameters; (4) the transformation of nonnormal multivariate to near-normal; (5) test of fit for the extreme value distribution based upon the generalized minimum chi-square; (6) test of fit for continuous distributions based upon the generalized minimum chi-square; (7) effect of correlated observations on confidence sets based upon chi-square statistics; and (8) generation of random variates from specified distributions.
NASA Technical Reports Server (NTRS)
Matney, Mark
2011-01-01
A number of statistical tools have been developed over the years for assessing the risk of reentering objects to human populations. These tools make use of the characteristics (e.g., mass, material, shape, size) of debris that are predicted by aerothermal models to survive reentry. The statistical tools use this information to compute the probability that one or more of the surviving debris might hit a person on the ground and cause one or more casualties. The statistical portion of the analysis relies on a number of assumptions about how the debris footprint and the human population are distributed in latitude and longitude, and how to use that information to arrive at realistic risk numbers. Because this information is used in making policy and engineering decisions, it is important that these assumptions be tested using empirical data. This study uses the latest database of known uncontrolled reentry locations measured by the United States Department of Defense. The predicted ground footprint distributions of these objects are based on the theory that their orbits behave basically like simple Kepler orbits. However, there are a number of factors in the final stages of reentry - including the effects of gravitational harmonics, the effects of the Earth's equatorial bulge on the atmosphere, and the rotation of the Earth and atmosphere - that could cause them to diverge from simple Kepler orbit behavior and possibly change the probability of reentering over a given location. In this paper, the measured latitude and longitude distributions of these objects are directly compared with the predicted distributions, providing a fundamental empirical test of the model assumptions.
Statistical Analysis of the Exchange Rate of Bitcoin.
Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate.
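A sketch of the fitting-and-ranking exercise described: fit candidate distributions to log-returns by maximum likelihood and compare log-likelihood or AIC. The returns below are simulated placeholders, only two of the fifteen families are shown, and the generalized hyperbolic family (scipy.stats.genhyperbolic, SciPy >= 1.8) could be added to the same loop if available.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Placeholder log-returns; in practice compute np.diff(np.log(price_series)).
log_returns = stats.t.rvs(df=3, loc=0.0, scale=0.02, size=1500, random_state=rng)

candidates = {
    "normal": stats.norm,
    "student-t": stats.t,
    # "generalized hyperbolic": stats.genhyperbolic,  # SciPy >= 1.8; fit is slower
}

for name, dist in candidates.items():
    params = dist.fit(log_returns)                     # maximum-likelihood fit
    loglik = np.sum(dist.logpdf(log_returns, *params))
    aic = 2 * len(params) - 2 * loglik
    print(f"{name:12s}: log-likelihood = {loglik:.1f}, AIC = {aic:.1f}")
```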
Louis R Iverson; Anantha M. Prasad; Mark W. Schwartz; Mark W. Schwartz
2005-01-01
We predict current distribution and abundance for tree species present in eastern North America, and subsequently estimate potential suitable habitat for those species under a changed climate with 2 x CO2. We used a series of statistical models (i.e., Regression Tree Analysis (RTA), Multivariate Adaptive Regression Splines (MARS), Bagging Trees (...
NASA Astrophysics Data System (ADS)
Miled, Karim; Limam, Oualid; Sab, Karam
2012-06-01
To predict aggregates' size distribution effect on the concrete compressive strength, a probabilistic mechanical model is proposed. Within this model, a Voronoi tessellation of a set of non-overlapping and rigid spherical aggregates is used to describe the concrete microstructure. Moreover, aggregates' diameters are defined as statistical variables and their size distribution function is identified to the experimental sieve curve. Then, an inter-aggregate failure criterion is proposed to describe the compressive-shear crushing of the hardened cement paste when concrete is subjected to uniaxial compression. Using a homogenization approach based on statistical homogenization and on geometrical simplifications, an analytical formula predicting the concrete compressive strength is obtained. This formula highlights the effects of cement paste strength and aggregates' size distribution and volume fraction on the concrete compressive strength. According to the proposed model, increasing the concrete strength for the same cement paste and the same aggregates' volume fraction is obtained by decreasing both aggregates' maximum size and the percentage of coarse aggregates. Finally, the validity of the model has been discussed through a comparison with experimental results (15 concrete compressive strengths ranging between 46 and 106 MPa) taken from literature and showing a good agreement with the model predictions.
NASA Astrophysics Data System (ADS)
Eckert, Nicolas; Schläppy, Romain; Jomelli, Vincent; Naaim, Mohamed
2013-04-01
A crucial step in proposing relevant long-term mitigation measures in long-term avalanche forecasting is the accurate definition of high return period avalanches. Recently, "statistical-dynamical" approaches combining a numerical model with stochastic operators describing the variability of its inputs and outputs have emerged. Their main interest is to take into account the topographic dependency of snow avalanche runout distances, and to constrain the correlation structure between the model's variables by physical rules, so as to simulate the different marginal distributions of interest (pressure, flow depth, etc.) with reasonable realism. Bayesian methods have been shown to be well adapted to achieve model inference, getting rid of identifiability problems thanks to prior information. An important problem which has virtually never been considered before is the validation of the predictions resulting from a statistical-dynamical approach (or from any other engineering method for computing extreme avalanches). In hydrology, independent "fossil" data such as flood deposits in caves are sometimes confronted with design discharges corresponding to high return periods. Hence, the aim of this work is to implement a similar comparison between high return period avalanches obtained with a statistical-dynamical approach and independent validation data resulting from careful dendrogeomorphological reconstructions. To do so, an up-to-date statistical model based on the depth-averaged equations and the classical Voellmy friction law is used on a well-documented case study. First, parameter values resulting from another path are applied, and the dendrological validation sample shows that this approach fails to provide realistic predictions for the case study. This may be due to the strongly bounded behaviour of runouts in this case (the extreme of their distribution is identified as belonging to the Weibull attraction domain). Second, local calibration on the available avalanche chronicle is performed with various prior distributions resulting from expert knowledge and/or other paths. For all calibrations, a very successful convergence is obtained, which confirms the robustness of the Metropolis-Hastings estimation algorithm used. This also demonstrates the interest of the Bayesian framework for aggregating information by sequential assimilation in the frequently encountered case of limited data quantity. Confrontation with the dendrological sample stresses the predominant role of the variance of the Coulombian friction coefficient distribution on predicted high-magnitude runouts. The optimal fit is obtained for a strong prior reflecting the local bounded behaviour, and results in a 10-40 m difference for return periods ranging between 10 and 300 years. Implementing predictive simulations shows that this is largely within the range of magnitude of uncertainties to be taken into account. On the other hand, the different priors tested for the turbulent friction coefficient influence predictive performances only slightly, but have a large influence on predicted velocity and flow depth distributions. This all may be of high interest for refining the calibration and predictive use of the statistical-dynamical model for any engineering application.
Understanding baseball team standings and streaks
NASA Astrophysics Data System (ADS)
Sire, C.; Redner, S.
2009-02-01
Can one understand the statistics of wins and losses of baseball teams? Are their consecutive-game winning and losing streaks self-reinforcing or can they be described statistically? We apply the Bradley-Terry model, which incorporates the heterogeneity of team strengths in a minimalist way, to answer these questions. Excellent agreement is found between the predictions of the Bradley-Terry model and the rank dependence of the average number of team wins and losses in major-league baseball over the past century when the distribution of team strengths is taken to be uniformly distributed over a finite range. Using this uniform strength distribution, we also find very good agreement between model predictions and the observed distribution of consecutive-game team winning and losing streaks over the last half-century; however, the agreement is less good for the previous half-century. The behavior of the last half-century supports the hypothesis that long streaks are primarily statistical in origin with little self-reinforcing component. The data further show that the past half-century of baseball has been more competitive than the preceding half-century.
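A rough sketch of the Bradley-Terry setup described: team strengths x are drawn from a uniform distribution over a finite range, the probability that team i beats team j is x_i/(x_i + x_j), and simulated seasons yield a distribution of longest winning streaks. The league size, schedule, and strength range are invented, and the game order within a season is simplified.

```python
import numpy as np

rng = np.random.default_rng(8)
n_teams, games_per_pair, n_seasons = 16, 10, 200

streak_counts = {}
for _ in range(n_seasons):
    # Team strengths drawn from a uniform distribution over a finite range.
    x = rng.uniform(0.3, 1.0, n_teams)
    results = {i: [] for i in range(n_teams)}
    for i in range(n_teams):
        for j in range(i + 1, n_teams):
            p_i = x[i] / (x[i] + x[j])            # Bradley-Terry win probability
            wins_i = rng.random(games_per_pair) < p_i
            results[i].extend(wins_i)             # simplified schedule: games grouped by opponent
            results[j].extend(~wins_i)
    # Longest winning streak per team in this season.
    for i in range(n_teams):
        streak = best = 0
        for won in results[i]:
            streak = streak + 1 if won else 0
            best = max(best, streak)
        streak_counts[best] = streak_counts.get(best, 0) + 1

for length in sorted(streak_counts):
    print(f"longest winning streak {length:2d}: {streak_counts[length]} team-seasons")
```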
1986-12-01
Force and other branches of the military are placing an increased emphasis on system reliability and maintainability. In studying current systems ...used in the research of proposed systems, by predicting MTTF and MTTR of the new parts and thus predicting the reliability of those parts. The statistics...effectiveness of new systems. Aitchison's book on the lognormal distribution, printed and used by Cambridge University, highlighted the distributions
Prediction of slant path rain attenuation statistics at various locations
NASA Technical Reports Server (NTRS)
Goldhirsh, J.
1977-01-01
The paper describes a method for predicting slant path attenuation statistics at arbitrary locations for variable frequencies and path elevation angles. The method involves the use of median reflectivity factor-height profiles measured with radar as well as the use of long-term point rain rate data and assumed or measured drop size distributions. The attenuation coefficient due to cloud liquid water in the presence of rain is also considered. Absolute probability fade distributions are compared for eight cases: Maryland (15 GHz), Texas (30 GHz), Slough, England (19 and 37 GHz), Fayetteville, North Carolina (13 and 18 GHz), and Cambridge, Massachusetts (13 and 18 GHz).
Performance prediction of a synchronization link for distributed aerospace wireless systems.
Wang, Wen-Qin; Shao, Huaizong
2013-01-01
For reasons of stealth and other operational advantages, distributed aerospace wireless systems have received much attention in recent years. In a distributed aerospace wireless system, since the transmitter and receiver are placed on separate platforms that use independent master oscillators, there is no cancellation of low-frequency phase noise as in the monostatic case. Thus, highly accurate time and frequency synchronization techniques are required for distributed wireless systems. The use of a dedicated synchronization link to quantify and compensate oscillator frequency instability is investigated in this paper. With mathematical statistical models of phase noise, closed-form analytic expressions for the synchronization link performance are derived. The possible error contributions, including oscillator, phase-locked loop, and receiver noise, are quantified. The link synchronization performance is predicted by utilizing knowledge of the statistical models, system error contributions, and sampling considerations. Simulation results show that effective synchronization error compensation can be achieved by using this dedicated synchronization link.
Non-extensive Statistics to the Cosmological Lithium Problem
NASA Astrophysics Data System (ADS)
Hou, S. Q.; He, J. J.; Parikh, A.; Kahl, D.; Bertulani, C. A.; Kajino, T.; Mathews, G. J.; Zhao, G.
2017-01-01
Big Bang nucleosynthesis (BBN) theory predicts the abundances of the light elements D, 3He, 4He, and 7Li produced in the early universe. The primordial abundances of D and 4He inferred from observational data are in good agreement with predictions, however, BBN theory overestimates the primordial 7Li abundance by about a factor of three. This is the so-called “cosmological lithium problem.” Solutions to this problem using conventional astrophysics and nuclear physics have not been successful over the past few decades, probably indicating the presence of new physics during the era of BBN. We have investigated the impact on BBN predictions of adopting a generalized distribution to describe the velocities of nucleons in the framework of Tsallis non-extensive statistics. This generalized velocity distribution is characterized by a parameter q, and reduces to the usually assumed Maxwell-Boltzmann distribution for q = 1. We find excellent agreement between predicted and observed primordial abundances of D, 4He, and 7Li for 1.069 ≤ q ≤ 1.082, suggesting a possible new solution to the cosmological lithium problem.
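A sketch of how a Tsallis-type velocity (speed) distribution departs from Maxwell-Boltzmann. The particular non-extensive form used below, a factor [1 - (q-1)E/kT]^(1/(q-1)) that reduces to exp(-E/kT) as q -> 1 and depletes the high-energy tail for q > 1, is a common convention assumed here rather than taken from the paper; the distribution is normalized numerically.

```python
import numpy as np

def speed_distribution(v, kT_over_m, q):
    """Un-normalized speed distribution ~ v^2 * w(E), with dimensionless E = m*v^2 / (2*kT).
    w is the Tsallis factor [1-(q-1)E]^(1/(q-1)), or exp(-E) in the q -> 1 limit."""
    E = 0.5 * v**2 / kT_over_m
    if abs(q - 1.0) < 1e-8:
        w = np.exp(-E)
    else:
        base = 1.0 - (q - 1.0) * E
        w = np.where(base > 0.0, np.abs(base) ** (1.0 / (q - 1.0)), 0.0)
    return v**2 * w

v = np.linspace(0.0, 6.0, 2000)
for q in (1.0, 1.075):                       # q = 1: Maxwell-Boltzmann; q > 1: non-extensive
    f = speed_distribution(v, kT_over_m=1.0, q=q)
    f /= np.trapz(f, v)                      # normalize numerically
    mean_speed = np.trapz(v * f, v)
    tail = np.trapz(f[v > 3.0], v[v > 3.0])  # weight of the high-speed tail
    print(f"q = {q:5.3f}: mean speed = {mean_speed:.3f}, P(v > 3) = {tail:.4f}")
```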
Prediction of the low-velocity distribution from the pore structure in simple porous media
NASA Astrophysics Data System (ADS)
de Anna, Pietro; Quaife, Bryan; Biros, George; Juanes, Ruben
2017-12-01
The macroscopic properties of fluid flow and transport through porous media are a direct consequence of the underlying pore structure. However, precise relations that characterize flow and transport from the statistics of pore-scale disorder have remained elusive. Here we investigate the relationship between pore structure and the resulting fluid flow and asymptotic transport behavior in two-dimensional geometries of nonoverlapping circular posts. We derive an analytical relationship between the pore throat size distribution f(λ) ~ λ^(-β) and the distribution of the low fluid velocities f(u) ~ u^(-β/2), based on a conceptual model of porelets (the flow established within each pore throat, here a Hagen-Poiseuille flow). Our model allows us to make predictions, within a continuous-time random-walk framework, for the asymptotic statistics of the spreading of fluid particles along their own trajectories. These predictions are confirmed by high-fidelity simulations of Stokes flow and advective transport. The proposed framework can be extended to other configurations which can be represented as a collection of known flow distributions.
Statistical Analysis of the Exchange Rate of Bitcoin
Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate. PMID:26222702
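The kind of distribution comparison described above can be sketched as follows: fit several candidate distributions to log-returns and rank them by likelihood. The data here are synthetic, the candidate set is small, and the generalized hyperbolic family (scipy.stats.genhyperbolic, available in SciPy 1.8 and later) is only noted in a comment; none of this reproduces the study's actual fitting.

```python
import numpy as np
from scipy import stats

# Synthetic heavy-tailed "log-returns" stand in for the exchange-rate series.
rng = np.random.default_rng(0)
log_returns = stats.t.rvs(df=3, scale=0.02, size=2000, random_state=rng)

candidates = {
    "normal": stats.norm,
    "student-t": stats.t,
    "laplace": stats.laplace,
    # stats.genhyperbolic (SciPy >= 1.8) could be added to the candidate set.
}

for name, dist in candidates.items():
    params = dist.fit(log_returns)                      # maximum likelihood fit
    loglik = np.sum(dist.logpdf(log_returns, *params))
    aic = 2 * len(params) - 2 * loglik
    print(f"{name:10s}  log-likelihood={loglik:10.2f}  AIC={aic:10.2f}")
```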
Prediction-based dynamic load-sharing heuristics
NASA Technical Reports Server (NTRS)
Goswami, Kumar K.; Devarakonda, Murthy; Iyer, Ravishankar K.
1993-01-01
The authors present dynamic load-sharing heuristics that use predicted resource requirements of processes to manage workloads in a distributed system. A previously developed statistical pattern-recognition method is employed for resource prediction. While nonprediction-based heuristics depend on a rapidly changing system status, the new heuristics depend on slowly changing program resource usage patterns. Furthermore, prediction-based heuristics can be more effective since they use future requirements rather than just the current system state. Four prediction-based heuristics, two centralized and two distributed, are presented. Using trace-driven simulations, they are compared against random scheduling and two effective nonprediction-based heuristics. Results show that the prediction-based centralized heuristics achieve up to 30 percent better response times than the nonprediction-based centralized heuristic, and that the prediction-based distributed heuristics achieve up to 50 percent improvements relative to their nonprediction counterpart.
A weighted generalized score statistic for comparison of predictive values of diagnostic tests.
Kosinski, Andrzej S
2013-03-15
Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.
A weighted generalized score statistic for comparison of predictive values of diagnostic tests
Kosinski, Andrzej S.
2013-01-01
Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations which are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we present, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic which incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, it always reduces to the score statistic in the independent samples situation, and it preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the weighted generalized score test statistic in a general GEE setting. PMID:22912343
The coalescent process in models with selection and recombination.
Hudson, R R; Kaplan, N L
1988-11-01
The statistical properties of the process describing the genealogical history of a random sample of genes at a selectively neutral locus which is linked to a locus at which natural selection operates are investigated. It is found that the equations describing this process are simple modifications of the equations describing the process assuming that the two loci are completely linked. Thus, the statistical properties of the genealogical process for a random sample at a neutral locus linked to a locus with selection follow from the results obtained for the selected locus. Sequence data from the alcohol dehydrogenase (Adh) region of Drosophila melanogaster are examined and compared to predictions based on the theory. It is found that the spatial distribution of nucleotide differences between Fast and Slow alleles of Adh is very similar to the spatial distribution predicted if balancing selection operates to maintain the allozyme variation at the Adh locus. The spatial distribution of nucleotide differences between different Slow alleles of Adh do not match the predictions of this simple model very well.
Statistical analysis and modeling of intermittent transport events in the tokamak scrape-off layer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, Johan, E-mail: anderson.johan@gmail.com; Halpern, Federico D.; Ricci, Paolo
The turbulence observed in the scrape-off-layer of a tokamak is often characterized by intermittent events of bursty nature, a feature which raises concerns about the prediction of heat loads on the physical boundaries of the device. It appears thus necessary to delve into the statistical properties of turbulent physical fields such as density, electrostatic potential, and temperature, focusing on the mathematical expression of tails of the probability distribution functions. The method followed here is to generate statistical information from time-traces of the plasma density stemming from Braginskii-type fluid simulations and check this against a first-principles theoretical model. The analysis of the numerical simulations indicates that the probability distribution function of the intermittent process contains strong exponential tails, as predicted by the analytical theory.
Bringing modeling to the masses: A web based system to predict potential species distributions
Graham, Jim; Newman, Greg; Kumar, Sunil; Jarnevich, Catherine S.; Young, Nick; Crall, Alycia W.; Stohlgren, Thomas J.; Evangelista, Paul
2010-01-01
Predicting current and potential species distributions and abundance is critical for managing invasive species, preserving threatened and endangered species, and conserving native species and habitats. Accurate predictive models are needed at local, regional, and national scales to guide field surveys, improve monitoring, and set priorities for conservation and restoration. Modeling capabilities, however, are often limited by access to software and environmental data required for predictions. To address these needs, we built a comprehensive web-based system that: (1) maintains a large database of field data; (2) provides access to field data and a wealth of environmental data; (3) accesses values in rasters representing environmental characteristics; (4) runs statistical spatial models; and (5) creates maps that predict the potential species distribution. The system is available online at www.niiss.org, and provides web-based tools for stakeholders to create potential species distribution models and maps under current and future climate scenarios.
Modelling the distribution of domestic ducks in Monsoon Asia
Van Bockel, Thomas P.; Prosser, Diann; Franceschini, Gianluca; Biradar, Chandra; Wint, William; Robinson, Tim; Gilbert, Marius
2011-01-01
Domestic ducks are considered to be an important reservoir of highly pathogenic avian influenza (HPAI), as shown by a number of geospatial studies in which they have been identified as a significant risk factor associated with disease presence. Despite their importance in HPAI epidemiology, their large-scale distribution in Monsoon Asia is poorly understood. In this study, we created a spatial database of domestic duck census data in Asia and used it to train statistical distribution models for domestic duck distributions at a spatial resolution of 1km. The method was based on a modelling framework used by the Food and Agriculture Organisation to produce the Gridded Livestock of the World (GLW) database, and relies on stratified regression models between domestic duck densities and a set of agro-ecological explanatory variables. We evaluated different ways of stratifying the analysis and of combining the prediction to optimize the goodness of fit of the predictions. We found that domestic duck density could be predicted with reasonable accuracy (mean RMSE and correlation coefficient between log-transformed observed and predicted densities being 0.58 and 0.80, respectively), using a stratification based on livestock production systems. We tested the use of artificially degraded data on duck distributions in Thailand and Vietnam as training data, and compared the modelled outputs with the original high-resolution data. This showed, for these two countries at least, that these approaches could be used to accurately disaggregate provincial level (administrative level 1) statistical data to provide high resolution model distributions.
NASA Technical Reports Server (NTRS)
Holms, A. G.
1982-01-01
A previous report described a backward deletion procedure of model selection that was optimized for minimum prediction error and which used a multiparameter combination of the F-distribution and an order statistics distribution of Cochran's. A computer program is described that applies the previously optimized procedure to real data. The use of the program is illustrated by examples.
ERIC Educational Resources Information Center
Watson, Jane
2007-01-01
Inference, or decision making, is seen in curriculum documents as the final step in a statistical investigation. For a formal statistical enquiry this may be associated with sophisticated tests involving probability distributions. For young students without the mathematical background to perform such tests, it is still possible to draw informal…
Levine, Rebecca S; Peterson, A Townsend; Benedict, Mark Q
2004-02-01
The distribution of the Anopheles gambiae complex of malaria vectors in Africa is uncertain due to under-sampling of vast regions. We use ecologic niche modeling to predict the potential distribution of three members of the complex (A. gambiae, A. arabiensis, and A. quadriannulatus) and demonstrate the statistical significance of the models. Predictions correspond well to previous estimates, but provide detail regarding spatial discontinuities in the distribution of A. gambiae s.s. that are consistent with population genetic studies. Our predictions also identify large areas of Africa where the presence of A. arabiensis is predicted, but few specimens have been obtained, suggesting under-sampling of the species. Finally, we project models developed from African distribution data for the late 1900s into the past and to South America to determine retrospectively whether the deadly 1929 introduction of A. gambiae sensu lato into Brazil was more likely that of A. gambiae sensu stricto or A. arabiensis.
A gentle introduction to quantile regression for ecologists
Cade, B.S.; Noon, B.R.
2003-01-01
Quantile regression is a way to estimate the conditional quantiles of a response variable distribution in the linear model that provides a more complete view of possible causal relationships between variables in ecological processes. Typically, all the factors that affect ecological processes are not measured and included in the statistical models used to investigate relationships between variables associated with those processes. As a consequence, there may be a weak or no predictive relationship between the mean of the response variable (y) distribution and the measured predictive factors (X). Yet there may be stronger, useful predictive relationships with other parts of the response variable distribution. This primer relates quantile regression estimates to prediction intervals in parametric error distribution regression models (e.g., least squares), and discusses the ordering characteristics, interval nature, sampling variation, weighting, and interpretation of the estimates for homogeneous and heterogeneous regression models.
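A minimal sketch of quantile regression, assuming the statsmodels QuantReg implementation and synthetic heteroscedastic data, illustrates how upper and lower conditional quantiles can respond differently to a predictor even when the mean relationship is modest.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic data with heteroscedastic noise: the spread of y grows with x,
# so the 0.90 quantile has a steeper slope than the 0.10 quantile.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 500)
y = 1.0 + 0.5 * x + rng.normal(0, 0.2 + 0.3 * x)

X = sm.add_constant(x)
for tau in (0.10, 0.50, 0.90):
    res = sm.QuantReg(y, X).fit(q=tau)
    print(f"tau={tau:.2f}  intercept={res.params[0]:6.3f}  slope={res.params[1]:6.3f}")
```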
Uncertainty propagation for statistical impact prediction of space debris
NASA Astrophysics Data System (ADS)
Hoogendoorn, R.; Mooij, E.; Geul, J.
2018-01-01
Predictions of the impact time and location of space debris in a decaying trajectory are highly influenced by uncertainties. The traditional Monte Carlo (MC) method can be used to perform accurate statistical impact predictions, but requires a large computational effort. A method is investigated that directly propagates a Probability Density Function (PDF) in time, which has the potential to obtain more accurate results with less computational effort. The decaying trajectory of Delta-K rocket stages was used to test the methods using a six degrees-of-freedom state model. The PDF of the state of the body was propagated in time to obtain impact-time distributions. This Direct PDF Propagation (DPP) method results in a multi-dimensional scattered dataset of the PDF of the state, which is highly challenging to process. No accurate results could be obtained, because of the structure of the DPP data and the high dimensionality. Therefore, the DPP method is less suitable for practical uncontrolled entry problems and the traditional MC method remains superior. Additionally, the MC method was used with two improved uncertainty models to obtain impact-time distributions, which were validated using observations of true impacts. For one of the two uncertainty models, statistically more valid impact-time distributions were obtained than in previous research.
NASA Astrophysics Data System (ADS)
Müller, M. F.; Thompson, S. E.
2015-09-01
The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by a strong wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are strongly favored over statistical models.
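For readers unfamiliar with FDCs, the following sketch builds an empirical flow duration curve from a daily series and scores a hypothetical prediction with the Nash-Sutcliffe coefficient; the flows and the "predicted" curve are synthetic stand-ins, not the Nepal data.

```python
import numpy as np

def flow_duration_curve(q):
    """Empirical FDC: flows sorted in descending order vs. exceedance probability."""
    q_sorted = np.sort(np.asarray(q))[::-1]
    n = q_sorted.size
    exceedance = np.arange(1, n + 1) / (n + 1.0)   # Weibull plotting position
    return exceedance, q_sorted

def nash_sutcliffe(obs, sim):
    obs, sim = np.asarray(obs), np.asarray(sim)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Synthetic daily streamflow standing in for a gauged record (10 years).
rng = np.random.default_rng(2)
flows = np.exp(rng.normal(1.0, 0.8, 3650))

p, q_obs = flow_duration_curve(flows)
q_pred = q_obs * rng.normal(1.0, 0.05, q_obs.size)   # stand-in "predicted" FDC
print("NSE of predicted FDC:", round(nash_sutcliffe(q_obs, q_pred), 3))
```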
NASA Astrophysics Data System (ADS)
Müller, M. F.; Thompson, S. E.
2016-02-01
The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by frequent wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are favored over statistical models.
Variability-aware compact modeling and statistical circuit validation on SRAM test array
NASA Astrophysics Data System (ADS)
Qiao, Ying; Spanos, Costas J.
2016-03-01
Variability modeling at the compact transistor model level can enable statistically optimized designs in view of limitations imposed by the fabrication technology. In this work we propose a variability-aware compact model characterization methodology based on stepwise parameter selection. Transistor I-V measurements are obtained from a bit-transistor-accessible SRAM test array fabricated using a collaborating foundry's 28 nm FDSOI technology. Our in-house customized Monte Carlo simulation bench can incorporate these statistical compact models, and simulation results on SRAM writability performance are very close to the measured distributions. Our proposed statistical compact model parameter extraction methodology also has the potential of predicting non-Gaussian behavior in statistical circuit performances through mixtures of Gaussian distributions.
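The mixture-of-Gaussians idea mentioned above can be sketched with scikit-learn, using synthetic, skewed "write margin" samples as a stand-in for Monte Carlo circuit results; the component counts and BIC-based selection are illustrative choices, not the authors' procedure.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic skewed (non-Gaussian) performance metric: a dominant mode plus a
# wider low-margin tail, standing in for Monte Carlo SPICE results.
rng = np.random.default_rng(3)
samples = np.concatenate([rng.normal(0.30, 0.02, 8000),
                          rng.normal(0.24, 0.04, 2000)]).reshape(-1, 1)

best = None
for k in (1, 2, 3):
    gm = GaussianMixture(n_components=k, random_state=0).fit(samples)
    bic = gm.bic(samples)
    print(f"components={k}  BIC={bic:.1f}")
    if best is None or bic < best[1]:
        best = (k, bic, gm)

print("selected components:", best[0], "means:", best[2].means_.ravel())
```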
Performance of statistical models to predict mental health and substance abuse cost.
Montez-Rath, Maria; Christiansen, Cindy L; Ettner, Susan L; Loveland, Susan; Rosen, Amy K
2006-10-26
Providers use risk-adjustment systems to help manage healthcare costs. Typically, ordinary least squares (OLS) models on either untransformed or log-transformed cost are used. We examine the predictive ability of several statistical models, demonstrate how model choice depends on the goal for the predictive model, and examine whether building models on samples of the data affects model choice. Our sample consisted of 525,620 Veterans Health Administration patients with mental health (MH) or substance abuse (SA) diagnoses who incurred costs during fiscal year 1999. We tested two models on a transformation of cost: a Log Normal model and a Square-root Normal model, and three generalized linear models on untransformed cost, defined by distributional assumption and link function: Normal with identity link (OLS); Gamma with log link; and Gamma with square-root link. Risk-adjusters included age, sex, and 12 MH/SA categories. To determine the best model among the entire dataset, predictive ability was evaluated using root mean square error (RMSE), mean absolute prediction error (MAPE), and predictive ratios of predicted to observed cost (PR) among deciles of predicted cost, by comparing point estimates and 95% bias-corrected bootstrap confidence intervals. To study the effect of analyzing a random sample of the population on model choice, we re-computed these statistics using random samples beginning with 5,000 patients and ending with the entire sample. The Square-root Normal model had the lowest estimates of the RMSE and MAPE, with bootstrap confidence intervals that were always lower than those for the other models. The Gamma with square-root link was best as measured by the PRs. The choice of best model could vary if smaller samples were used and the Gamma with square-root link model had convergence problems with small samples. Models with square-root transformation or link fit the data best. This function (whether used as transformation or as a link) seems to help deal with the high comorbidity of this population by introducing a form of interaction. The Gamma distribution helps with the long tail of the distribution. However, the Normal distribution is suitable if the correct transformation of the outcome is used.
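A compact sketch of the model families compared above, using statsmodels on synthetic right-skewed cost data: the covariates stand in for the risk adjusters, the retransformation of the log-OLS predictions is deliberately naive, and older statsmodels releases expose the link classes under lowercase names.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 5000
x = rng.binomial(1, 0.3, size=(n, 3)).astype(float)   # stand-in risk adjusters
X = sm.add_constant(x)
mu = np.exp(6.0 + X[:, 1] + 0.5 * X[:, 2] + 0.3 * X[:, 3])
cost = rng.gamma(shape=1.5, scale=mu / 1.5)            # skewed "costs"

# Older statsmodels versions use lowercase links (links.log, links.sqrt).
models = {
    "OLS on cost": sm.OLS(cost, X),
    "OLS on log(cost)": sm.OLS(np.log(cost), X),
    "Gamma, log link": sm.GLM(cost, X, family=sm.families.Gamma(sm.families.links.Log())),
    "Gamma, sqrt link": sm.GLM(cost, X, family=sm.families.Gamma(sm.families.links.Power(power=0.5))),
}
for name, mod in models.items():
    res = mod.fit()
    pred = res.predict(X)
    if name == "OLS on log(cost)":
        pred = np.exp(pred)      # naive retransformation, ignores smearing
    rmse = np.sqrt(np.mean((cost - pred) ** 2))
    print(f"{name:18s}  RMSE={rmse:12.1f}")
```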
The Statistical Nature of Fatigue Crack Propagation
1977-03-01
AFFDL technical report: The Statistical Nature of Fatigue Crack Propagation. D. A. Virkler, B. M. Hillberry, P. K. Goel, School of Mechanical Engineering. ... as a function of crack length was best represented by the three-parameter log-normal distribution. Six growth rate calculation methods were investigated, and from the da/dN data, which varied moderately as a function of crack length, replicate a vs. N data were predicted. This predicted data reproduced the mean behavior but ...
NASA Technical Reports Server (NTRS)
Zemba, Michael; Nessel, James; Houts, Jacquelynne; Luini, Lorenzo; Riva, Carlo
2016-01-01
The rain rate data and statistics of a location are often used in conjunction with models to predict rain attenuation. However, the true attenuation is a function not only of rain rate, but also of the drop size distribution (DSD). Generally, models utilize an average drop size distribution (Laws and Parsons or Marshall and Palmer). However, individual rain events may deviate from these models significantly if their DSD is not well approximated by the average. Therefore, characterizing the relationship between the DSD and attenuation is valuable in improving modeled predictions of rain attenuation statistics. The DSD may also be used to derive the instantaneous frequency scaling factor and thus validate frequency scaling models. Since June of 2014, NASA Glenn Research Center (GRC) and the Politecnico di Milano (POLIMI) have jointly conducted a propagation study in Milan, Italy utilizing the 20 and 40 GHz beacon signals of the Alphasat TDP#5 Aldo Paraboni payload. The Ka- and Q-band beacon receivers provide a direct measurement of the signal attenuation while concurrent weather instrumentation provides measurements of the atmospheric conditions at the receiver. Among these instruments is a Thies Clima Laser Precipitation Monitor (optical disdrometer) which yields droplet size distributions (DSD); this DSD information can be used to derive a scaling factor that scales the measured 20 GHz data to expected 40 GHz attenuation. Given the capability to both predict and directly observe 40 GHz attenuation, this site is uniquely situated to assess and characterize such predictions. Previous work using this data has examined the relationship between the measured drop-size distribution and the measured attenuation of the link. The focus of this paper now turns to a deeper analysis of the scaling factor, including the prediction error as a function of attenuation level, correlation between the scaling factor and the rain rate, and the temporal variability of the drop size distribution both within a given rain event and across different varieties of rain events. Index Terms: drop size distribution, frequency scaling, propagation losses, radiowave propagation.
NASA Technical Reports Server (NTRS)
Zemba, Michael; Nessel, James; Houts, Jacquelynne; Luini, Lorenzo; Riva, Carlo
2016-01-01
The rain rate data and statistics of a location are often used in conjunction with models to predict rain attenuation. However, the true attenuation is a function not only of rain rate, but also of the drop size distribution (DSD). Generally, models utilize an average drop size distribution (Laws and Parsons or Marshall and Palmer [1]). However, individual rain events may deviate from these models significantly if their DSD is not well approximated by the average. Therefore, characterizing the relationship between the DSD and attenuation is valuable in improving modeled predictions of rain attenuation statistics. The DSD may also be used to derive the instantaneous frequency scaling factor and thus validate frequency scaling models. Since June of 2014, NASA Glenn Research Center (GRC) and the Politecnico di Milano (POLIMI) have jointly conducted a propagation study in Milan, Italy utilizing the 20 and 40 GHz beacon signals of the Alphasat TDP#5 Aldo Paraboni payload. The Ka- and Q-band beacon receivers provide a direct measurement of the signal attenuation while concurrent weather instrumentation provides measurements of the atmospheric conditions at the receiver. Among these instruments is a Thies Clima Laser Precipitation Monitor (optical disdrometer) which yields droplet size distributions (DSD); this DSD information can be used to derive a scaling factor that scales the measured 20 GHz data to expected 40 GHz attenuation. Given the capability to both predict and directly observe 40 GHz attenuation, this site is uniquely situated to assess and characterize such predictions. Previous work using this data has examined the relationship between the measured drop-size distribution and the measured attenuation of the link [2]. The focus of this paper now turns to a deeper analysis of the scaling factor, including the prediction error as a function of attenuation level, correlation between the scaling factor and the rain rate, and the temporal variability of the drop size distribution both within a given rain event and across different varieties of rain events. Index Terms-drop size distribution, frequency scaling, propagation losses, radiowave propagation.
NON-EXTENSIVE STATISTICS TO THE COSMOLOGICAL LITHIUM PROBLEM
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hou, S. Q.; He, J. J.; Parikh, A.
Big Bang nucleosynthesis (BBN) theory predicts the abundances of the light elements D, 3He, 4He, and 7Li produced in the early universe. The primordial abundances of D and 4He inferred from observational data are in good agreement with predictions; however, BBN theory overestimates the primordial 7Li abundance by about a factor of three. This is the so-called “cosmological lithium problem.” Solutions to this problem using conventional astrophysics and nuclear physics have not been successful over the past few decades, probably indicating the presence of new physics during the era of BBN. We have investigated the impact on BBN predictions of adopting a generalized distribution to describe the velocities of nucleons in the framework of Tsallis non-extensive statistics. This generalized velocity distribution is characterized by a parameter q, and reduces to the usually assumed Maxwell–Boltzmann distribution for q = 1. We find excellent agreement between predicted and observed primordial abundances of D, 4He, and 7Li for 1.069 ≤ q ≤ 1.082, suggesting a possible new solution to the cosmological lithium problem.
Coxen, Christopher L.; Frey, Jennifer K.; Carleton, Scott A.; Collins, Daniel P.
2017-01-01
Species distribution models can provide critical baseline distribution information for the conservation of poorly understood species. Here, we compared the performance of band-tailed pigeon (Patagioenas fasciata) species distribution models created using Maxent and derived from two separate presence-only occurrence data sources in New Mexico: 1) satellite tracked birds and 2) observations reported in eBird basic data set. Both models had good accuracy (test AUC > 0.8 and True Skill Statistic > 0.4), and high overlap between suitability scores (I statistic 0.786) and suitable habitat patches (relative rank 0.639). Our results suggest that, at the state-wide level, eBird occurrence data can effectively model similar species distributions as satellite tracking data. Climate change models for the band-tailed pigeon predict a 35% loss in area of suitable climate by 2070 if CO2 emissions drop to 1990 levels by 2100, and a 45% loss by 2070 if we continue current CO2 emission levels through the end of the century. These numbers may be conservative given the predicted increase in drought, wildfire, and forest pest impacts to the coniferous forests the species inhabits in New Mexico. The northern portion of the species’ range in New Mexico is predicted to be the most viable through time.
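As an illustration of the accuracy metrics quoted above, the sketch below computes the True Skill Statistic and AUC for synthetic presence/absence data and suitability scores; the threshold and data are assumptions for the example only.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def true_skill_statistic(obs, scores, threshold=0.5):
    """TSS = sensitivity + specificity - 1 for binary presence/absence."""
    obs = np.asarray(obs, dtype=bool)
    pred = np.asarray(scores) >= threshold
    tp = np.sum(pred & obs)
    tn = np.sum(~pred & ~obs)
    fp = np.sum(pred & ~obs)
    fn = np.sum(~pred & obs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0

# Synthetic presence/absence labels and model suitability scores.
rng = np.random.default_rng(5)
obs = rng.binomial(1, 0.3, 1000)
scores = np.clip(0.3 * obs + rng.normal(0.4, 0.2, 1000), 0, 1)

print("TSS:", round(true_skill_statistic(obs, scores), 3))
print("AUC:", round(roc_auc_score(obs, scores), 3))
```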
Using the Lorenz Curve to Characterize Risk Predictiveness and Etiologic Heterogeneity
Mauguen, Audrey; Begg, Colin B.
2017-01-01
The Lorenz curve is a graphical tool that is used widely in econometrics. It represents the spread of a probability distribution, and its traditional use has been to characterize population distributions of wealth or income, or more specifically, inequalities in wealth or income. However, its utility in public health research has not been broadly established. The purpose of this article is to explain its special usefulness for characterizing the population distribution of disease risks, and in particular for identifying the precise disease burden that can be predicted to occur in segments of the population that are known to have especially high (or low) risks, a feature that is important for evaluating the yield of screening or other disease prevention initiatives. We demonstrate that, although the Lorenz curve represents the distribution of predicted risks in a population at risk for the disease, in fact it can be estimated from a case–control study conducted in the population without the need for information on absolute risks. We explore two different estimation strategies and compare their statistical properties using simulations. The Lorenz curve is a statistical tool that deserves wider use in public health research. PMID:27096256
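A minimal sketch of the construction described above: build the Lorenz curve of predicted risks and read off how much of the total predicted risk falls in the highest-risk decile. The beta-distributed risks are a synthetic stand-in for model output.

```python
import numpy as np

def lorenz_curve(risks):
    """Cumulative share of total predicted risk carried by the lowest-risk
    fraction of the population (points of the Lorenz curve)."""
    r = np.sort(np.asarray(risks, dtype=float))
    cum_risk = np.cumsum(r) / r.sum()
    pop_frac = np.arange(1, r.size + 1) / r.size
    return pop_frac, cum_risk

# Synthetic predicted risks (e.g., from a model fitted to case-control data).
rng = np.random.default_rng(6)
risks = rng.beta(0.5, 20.0, 10000)

pop, share = lorenz_curve(risks)
# Fraction of total predicted risk concentrated in the top 10% highest-risk people:
top10_share = 1.0 - share[np.searchsorted(pop, 0.9)]
print(f"Top 10% of the population carries {100 * top10_share:.1f}% of predicted risk")
```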
Syndromic surveillance models using Web data: the case of scarlet fever in the UK.
Samaras, Loukas; García-Barriocanal, Elena; Sicilia, Miguel-Angel
2012-03-01
Recent research has shown the potential of Web queries as a source for syndromic surveillance, and existing studies show that these queries can be used as a basis for estimation and prediction of the development of a syndromic disease, such as influenza, using log linear (logit) statistical models. Two alternative models are applied to the relationship between cases and Web queries in this paper. We examine the applicability of using statistical methods to relate search engine queries with scarlet fever cases in the UK, taking advantage of tools to acquire the appropriate data from Google, and using an alternative statistical method based on gamma distributions. The results show that using logit models, the Pearson correlation factor between Web queries and the data obtained from the official agencies must be over 0.90, otherwise the prediction of the peak and the spread of the distributions gives significant deviations. In this paper, we describe the gamma distribution model and show that we can obtain better results in all cases using gamma transformations, and especially in those with a smaller correlation factor.
Modeling and experiments of the adhesion force distribution between particles and a surface.
You, Siming; Wan, Man Pun
2014-06-17
Due to the existence of surface roughness in real surfaces, the adhesion force between particles and the surface where the particles are deposited exhibits certain statistical distributions. Despite the importance of adhesion force distribution in a variety of applications, the current understanding of modeling adhesion force distribution is still limited. In this work, an adhesion force distribution model based on integrating the root-mean-square (RMS) roughness distribution (i.e., the variation of RMS roughness on the surface in terms of location) into recently proposed mean adhesion force models was proposed. The integration was accomplished by statistical analysis and Monte Carlo simulation. A series of centrifuge experiments were conducted to measure the adhesion force distributions between polystyrene particles (146.1 ± 1.99 μm) and various substrates (stainless steel, aluminum and plastic, respectively). The proposed model was validated against the measured adhesion force distributions from this work and another previous study. Based on the proposed model, the effect of RMS roughness distribution on the adhesion force distribution of particles on a rough surface was explored, showing that both the median and standard deviation of adhesion force distribution could be affected by the RMS roughness distribution. The proposed model could predict both van der Waals force and capillary force distributions and consider the multiscale roughness feature, greatly extending the current capability of adhesion force distribution prediction.
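The Monte Carlo idea described above can be sketched as follows; the roughness distribution and, in particular, the mean_adhesion_force function are placeholders (not the models used in the paper), intended only to show how sampled roughness values propagate into an adhesion force distribution.

```python
import numpy as np

def mean_adhesion_force(rms_nm, hamaker=1e-19, radius_m=73e-6, z0=4e-10):
    """Placeholder mean-force model: a smooth-sphere van der Waals term damped
    by roughness. The functional form is hypothetical, for illustration only."""
    return hamaker * radius_m / (6.0 * z0**2) / (1.0 + rms_nm)

# Sample the local RMS roughness (nm) from an assumed lognormal distribution
# and push each sample through the mean-force model.
rng = np.random.default_rng(12)
rms_samples = rng.lognormal(mean=np.log(5.0), sigma=0.6, size=50_000)
forces = mean_adhesion_force(rms_samples)

print("median force  [N]:", np.median(forces))
print("5th-95th pct  [N]:", np.percentile(forces, [5, 95]))
```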
Performance Prediction of a Synchronization Link for Distributed Aerospace Wireless Systems
Shao, Huaizong
2013-01-01
For reasons of stealth and other operational advantages, distributed aerospace wireless systems have received much attention in recent years. In a distributed aerospace wireless system, since the transmitter and receiver are placed on separate platforms that use independent master oscillators, there is no cancellation of low-frequency phase noise as in the monostatic case. Thus, highly accurate time and frequency synchronization techniques are required for distributed wireless systems. The use of a dedicated synchronization link to quantify and compensate oscillator frequency instability is investigated in this paper. With the mathematical statistical models of phase noise, closed-form analytic expressions for the synchronization link performance are derived. The possible error contributions including oscillator, phase-locked loop, and receiver noise are quantified. The link synchronization performance is predicted by utilizing the knowledge of the statistical models, system error contributions, and sampling considerations. Simulation results show that effective synchronization error compensation can be achieved by using this dedicated synchronization link. PMID:23970828
Modeling Statistical Insensitivity: Sources of Suboptimal Behavior
ERIC Educational Resources Information Center
Gagliardi, Annie; Feldman, Naomi H.; Lidz, Jeffrey
2017-01-01
Children acquiring languages with noun classes (grammatical gender) have ample statistical information available that characterizes the distribution of nouns into these classes, but their use of this information to classify novel nouns differs from the predictions made by an optimal Bayesian classifier. We use rational analysis to investigate the…
NASA Technical Reports Server (NTRS)
Ricks, Trenton M.; Lacy, Thomas E., Jr.; Pineda, Evan J.; Bednarcyk, Brett A.; Arnold, Steven M.
2013-01-01
A multiscale modeling methodology, which incorporates a statistical distribution of fiber strengths into coupled micromechanics/finite element analyses, is applied to unidirectional polymer matrix composites (PMCs) to analyze the effect of mesh discretization both at the micro- and macroscales on the predicted ultimate tensile strength (UTS) and failure behavior. The NASA code FEAMAC and the ABAQUS finite element solver were used to analyze the progressive failure of a PMC tensile specimen that initiates at the repeating unit cell (RUC) level. Three different finite element mesh densities were employed, each coupled with an appropriate RUC. Multiple simulations were performed in order to assess the effect of a statistical distribution of fiber strengths on the bulk composite failure and predicted strength. The coupled effects of both the micro- and macroscale discretizations were found to have a noticeable effect on the predicted UTS and computational efficiency of the simulations.
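A small sketch of the statistical fiber-strength ingredient, assuming a two-parameter Weibull model with placeholder modulus and scale (not the values used with FEAMAC), illustrates how sampled strengths relate to the survival probability at a given applied stress.

```python
import numpy as np

# Placeholder Weibull parameters for fiber strength (illustrative only).
weibull_modulus = 10.0        # shape m: higher m means less scatter
scale_strength = 4.5          # characteristic strength, e.g. in GPa

rng = np.random.default_rng(7)
strengths = scale_strength * rng.weibull(weibull_modulus, 10000)

# Probability a fiber survives an applied stress sigma: exp[-(sigma/s0)^m]
sigma = 3.0
survival = np.exp(-(sigma / scale_strength) ** weibull_modulus)
print("mean sampled strength:", round(float(strengths.mean()), 3))
print("predicted survival at sigma=3.0:", round(survival, 3),
      " empirical:", round(float(np.mean(strengths > sigma)), 3))
```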
Statistical Approaches for Spatiotemporal Prediction of Low Flows
NASA Astrophysics Data System (ADS)
Fangmann, A.; Haberlandt, U.
2017-12-01
An adequate assessment of regional climate change impacts on streamflow requires the integration of various sources of information and modeling approaches. This study proposes simple statistical tools for inclusion into model ensembles, which are fast and straightforward in their application, yet able to yield accurate streamflow predictions in time and space. Target variables for all approaches are annual low flow indices derived from a data set of 51 records of average daily discharge for northwestern Germany. The models require input of climatic data in the form of meteorological drought indices, derived from observed daily climatic variables, averaged over the streamflow gauges' catchment areas. Four different modeling approaches are analyzed. The basis for all four is multiple linear regression models that estimate low flows as a function of a set of meteorological indices and/or physiographic and climatic catchment descriptors. For the first method, individual regression models are fitted at each station, predicting annual low flow values from a set of annual meteorological indices, which are subsequently regionalized using a set of catchment characteristics. The second method combines temporal and spatial prediction within a single panel data regression model, allowing estimation of annual low flow values from input of both annual meteorological indices and catchment descriptors. The third and fourth methods represent non-stationary low flow frequency analyses and require fitting of regional distribution functions. Method three relies on spatiotemporal prediction of an index value, and method four on estimation of L-moments that adapt the regional frequency distribution to the at-site conditions. The results show that method two outperforms successive prediction in time and space. Method three also shows a high performance in the near future period, but since it relies on a stationary distribution, its application for prediction of far future changes may be problematic. Spatiotemporal prediction of L-moments appeared highly uncertain for higher-order moments, resulting in unrealistic future low flow values. All in all, the results support the inclusion of simple statistical methods in climate change impact assessment.
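Since the fourth method hinges on L-moments, a short sketch of sample L-moment estimation via probability-weighted moments is given below; the synthetic gamma-distributed series stands in for an annual low flow index at one gauge.

```python
import numpy as np

def sample_l_moments(x):
    """First four sample L-moments via unbiased probability-weighted moments."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    i = np.arange(1, n + 1)
    b0 = x.mean()
    b1 = np.sum((i - 1) / (n - 1) * x) / n
    b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
    b3 = np.sum((i - 1) * (i - 2) * (i - 3) / ((n - 1) * (n - 2) * (n - 3)) * x) / n
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2, l3, l4

# Synthetic annual low-flow index series standing in for one gauge (60 years).
rng = np.random.default_rng(8)
low_flows = rng.gamma(shape=4.0, scale=2.0, size=60)
l1, l2, l3, l4 = sample_l_moments(low_flows)
print("L-mean", round(l1, 3), " L-CV", round(l2 / l1, 3),
      " L-skew", round(l3 / l2, 3), " L-kurtosis", round(l4 / l2, 3))
```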
Regional climate model downscaling may improve the prediction of alien plant species distributions
NASA Astrophysics Data System (ADS)
Liu, Shuyan; Liang, Xin-Zhong; Gao, Wei; Stohlgren, Thomas J.
2014-12-01
Distributions of invasive species are commonly predicted with species distribution models that build upon the statistical relationships between observed species presence data and climate data. We used field observations, climate station data, and Maximum Entropy species distribution models for 13 invasive plant species in the United States, and then compared the models with inputs from a General Circulation Model (hereafter GCM-based models) and a downscaled Regional Climate Model (hereafter RCM-based models). We also compared species distributions based on either GCM-based or RCM-based models for the present (1990-1999) to the future (2046-2055). RCM-based species distribution models replicated observed distributions remarkably better than GCM-based models for all invasive species under the current climate. This was shown for the presence locations of the species, and by using four common statistical metrics to compare modeled distributions. For two widespread invasive taxa (Bromus tectorum or cheatgrass, and Tamarix spp. or tamarisk), GCM-based models failed miserably to reproduce observed species distributions. In contrast, RCM-based species distribution models closely matched observations. Future species distributions may be significantly affected by using GCM-based inputs. Because invasive plant species often show high resilience and low rates of local extinction, RCM-based species distribution models may perform better than GCM-based species distribution models for planning containment programs for invasive species.
Direct Breakthrough Curve Prediction From Statistics of Heterogeneous Conductivity Fields
NASA Astrophysics Data System (ADS)
Hansen, Scott K.; Haslauer, Claus P.; Cirpka, Olaf A.; Vesselinov, Velimir V.
2018-01-01
This paper presents a methodology to predict the shape of solute breakthrough curves in heterogeneous aquifers at early times and/or under high degrees of heterogeneity, both cases in which the classical macrodispersion theory may not be applicable. The methodology relies on the observation that breakthrough curves in heterogeneous media are generally well described by lognormal distributions, and mean breakthrough times can be predicted analytically. The log-variance of solute arrival is thus sufficient to completely specify the breakthrough curves, and this is calibrated as a function of aquifer heterogeneity and dimensionless distance from a source plane by means of Monte Carlo analysis and statistical regression. Using the ensemble of simulated groundwater flow and solute transport realizations employed to calibrate the predictive regression, reliability estimates for the prediction are also developed. Additional theoretical contributions include heuristics for the time until an effective macrodispersion coefficient becomes applicable, and also an expression for its magnitude that applies in highly heterogeneous systems. It is seen that the results here represent a way to derive continuous time random walk transition distributions from physical considerations rather than from empirical field calibration.
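A minimal sketch of the parameterization described above: given an analytically predicted mean breakthrough time and a regression-calibrated log-variance, a lognormal arrival-time density specifies the full curve. The numerical values below are hypothetical.

```python
import numpy as np
from scipy import stats

def lognormal_btc(t, mean_arrival, log_variance):
    """Breakthrough curve modeled as a lognormal arrival-time density,
    parameterized by the mean arrival time and the log-variance of arrival."""
    sigma = np.sqrt(log_variance)
    mu = np.log(mean_arrival) - 0.5 * log_variance   # so that E[t] = mean_arrival
    return stats.lognorm.pdf(t, s=sigma, scale=np.exp(mu))

# Hypothetical values: mean arrival of 100 days, log-variance taken from a
# heterogeneity/distance regression such as the one calibrated in the paper.
t = np.linspace(1, 600, 600)
btc = lognormal_btc(t, mean_arrival=100.0, log_variance=0.8)
print("modal (peak) arrival time:", t[np.argmax(btc)], "days")
```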
O'Connell, Allan F.; Gardner, Beth; Oppel, Steffen; Meirinho, Ana; Ramírez, Iván; Miller, Peter I.; Louzao, Maite
2012-01-01
Knowledge about the spatial distribution of seabirds at sea is important for conservation. During marine conservation planning, logistical constraints preclude seabird surveys covering the complete area of interest and spatial distribution of seabirds is frequently inferred from predictive statistical models. Increasingly complex models are available to relate the distribution and abundance of pelagic seabirds to environmental variables, but a comparison of their usefulness for delineating protected areas for seabirds is lacking. Here we compare the performance of five modelling techniques (generalised linear models, generalised additive models, Random Forest, boosted regression trees, and maximum entropy) to predict the distribution of Balearic Shearwaters (Puffinus mauretanicus) along the coast of the western Iberian Peninsula. We used ship transect data from 2004 to 2009 and 13 environmental variables to predict occurrence and density, and evaluated predictive performance of all models using spatially segregated test data. Predicted distribution varied among the different models, although predictive performance varied little. An ensemble prediction that combined results from all five techniques was robust and confirmed the existence of marine important bird areas for Balearic Shearwaters in Portugal and Spain. Our predictions suggested additional areas that would be of high priority for conservation and could be proposed as protected areas. Abundance data were extremely difficult to predict, and none of five modelling techniques provided a reliable prediction of spatial patterns. We advocate the use of ensemble modelling that combines the output of several methods to predict the spatial distribution of seabirds, and use these predictions to target separate surveys assessing the abundance of seabirds in areas of regular use.
NASA Astrophysics Data System (ADS)
Selvam, A. M.
2017-01-01
Dynamical systems in nature exhibit self-similar fractal space-time fluctuations on all scales indicating long-range correlations and, therefore, the statistical normal distribution with implicit assumption of independence, fixed mean and standard deviation cannot be used for description and quantification of fractal data sets. The author has developed a general systems theory based on classical statistical physics for fractal fluctuations which predicts the following. (1) The fractal fluctuations signify an underlying eddy continuum, the larger eddies being the integrated mean of enclosed smaller-scale fluctuations. (2) The probability distribution of eddy amplitudes and the variance (square of eddy amplitude) spectrum of fractal fluctuations follow the universal Boltzmann inverse power law expressed as a function of the golden mean. (3) Fractal fluctuations are signatures of quantum-like chaos since the additive amplitudes of eddies when squared represent probability densities analogous to the sub-atomic dynamics of quantum systems such as the photon or electron. (4) The model-predicted distribution is very close to the statistical normal distribution for moderate events within two standard deviations from the mean but exhibits a fat long tail associated with hazardous extreme events. Continuous periodogram power spectral analyses of available GHCN annual total rainfall time series for the period 1900-2008 for Indian and USA stations show that the power spectra and the corresponding probability distributions follow the model-predicted universal inverse power-law form, signifying an eddy continuum structure underlying the observed inter-annual variability of rainfall. On a global scale, man-made greenhouse gas related atmospheric warming would result in intensification of natural climate variability, seen immediately in high frequency fluctuations such as QBO and ENSO and even shorter timescales. Model concepts and results of analyses are discussed with reference to possible prediction of climate change. Model concepts, if correct, unambiguously rule out linear trends in climate. Climate change will only be manifested as increase or decrease in the natural variability. However, more stringent tests of model concepts and predictions are required before applications to such an important issue as climate change. Observations and simulations with climate models show that precipitation extremes intensify in response to a warming climate (O'Gorman in Curr Clim Change Rep 1:49-59, 2015).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mestrovic, Ante; Clark, Brenda G.; Department of Medical Physics, British Columbia Cancer Agency, Vancouver, British Columbia
2005-11-01
Purpose: To develop a method of predicting the values of dose distribution parameters of different radiosurgery techniques for treatment of arteriovenous malformation (AVM) based on internal geometric parameters. Methods and Materials: For each of 18 previously treated AVM patients, four treatment plans were created: circular collimator arcs, dynamic conformal arcs, fixed conformal fields, and intensity-modulated radiosurgery. An algorithm was developed to characterize the target and critical structure shape complexity and the position of the critical structures with respect to the target. Multiple regression was employed to establish the correlation between the internal geometric parameters and the dose distribution for different treatment techniques. The results from the model were applied to predict the dosimetric outcomes of different radiosurgery techniques and select the optimal radiosurgery technique for a number of AVM patients. Results: Several internal geometric parameters showing statistically significant correlation (p < 0.05) with the treatment planning results for each technique were identified. The target volume and the average minimum distance between the target and the critical structures were the most effective predictors for normal tissue dose distribution. The structure overlap volume with the target and the mean distance between the target and the critical structure were the most effective predictors for critical structure dose distribution. The predicted values of dose distribution parameters of different radiosurgery techniques were in close agreement with the original data. Conclusions: A statistical model has been described that successfully predicts the values of dose distribution parameters of different radiosurgery techniques and may be used to predetermine the optimal technique on a patient-to-patient basis.
Characterization of Inclusion Populations in Mn-Si Deoxidized Steel
NASA Astrophysics Data System (ADS)
García-Carbajal, Alfonso; Herrera-Trejo, Martín; Castro-Cedeño, Edgar-Ivan; Castro-Román, Manuel; Martinez-Enriquez, Arturo-Isaias
2017-12-01
Four plant heats of Mn-Si deoxidized steel were conducted to follow the evolution of the inclusion population through ladle furnace (LF) treatment and subsequent vacuum treatment (VT). The liquid steel was sampled, and the chemical composition and size distribution of the inclusion populations were characterized. The Gumbel, generalized extreme-value (GEV), and generalized Pareto (GP) distributions were used for the statistical analysis of the inclusion size distributions. The inclusions found at the beginning of the LF treatment were mostly fully liquid SiO2-Al2O3-MnO inclusions, which then evolved into fully liquid SiO2-Al2O3-CaO-MgO and partly liquid SiO2-CaO-MgO-(Al2O3-MgO) inclusions detected at the end of the VT. The final fully liquid inclusions had a desirable chemical composition for plastic behavior in subsequent metallurgical operations. The GP distribution was found to be undesirable for statistical analysis. The GEV distribution approach led to shape parameter values different from the zero value hypothesized from the Gumbel distribution. According to the GEV approach, some of the final inclusion size distributions had statistically significant differences, whereas the Gumbel approach predicted no statistically significant differences. The heats were organized according to indicators of inclusion cleanliness and a statistical comparison of the size distributions.
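The GEV-versus-Gumbel comparison can be sketched with SciPy as below; note that scipy.stats.genextreme uses the shape convention c = -ξ, so c = 0 corresponds to the Gumbel case. The simulated inclusion sizes and parameter values are illustrative, not the measured data.

```python
import numpy as np
from scipy import stats

# Simulated block-maximum inclusion sizes (micrometres), illustrative only.
rng = np.random.default_rng(9)
max_inclusion_um = stats.genextreme.rvs(c=-0.15, loc=12.0, scale=3.0,
                                        size=200, random_state=rng)

c_hat, loc_hat, scale_hat = stats.genextreme.fit(max_inclusion_um)
print(f"fitted shape c = {c_hat:.3f} (Gumbel corresponds to c = 0)")
print(f"location = {loc_hat:.2f} um, scale = {scale_hat:.2f} um")

# A simple check of the Gumbel hypothesis: compare maximized log-likelihoods.
ll_gev = np.sum(stats.genextreme.logpdf(max_inclusion_um, c_hat, loc_hat, scale_hat))
loc_g, scale_g = stats.gumbel_r.fit(max_inclusion_um)
ll_gum = np.sum(stats.gumbel_r.logpdf(max_inclusion_um, loc_g, scale_g))
print("log-likelihood difference (GEV - Gumbel):", round(ll_gev - ll_gum, 2))
```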
Modeling and forecasting the distribution of Vibrio vulnificus in Chesapeake Bay
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jacobs, John M.; Rhodes, M.; Brown, C. W.
The aim is to construct statistical models to predict the presence, abundance and potential virulence of Vibrio vulnificus in surface waters. A variety of statistical techniques were used in concert to identify water quality parameters associated with V. vulnificus presence, abundance and virulence markers in the interest of developing strong predictive models for use in regional oceanographic modeling systems. A suite of models are provided to represent the best model fit and alternatives using environmental variables that allow them to be put to immediate use in current ecological forecasting efforts. Conclusions: Environmental parameters such as temperature, salinity and turbidity are capable of accurately predicting abundance and distribution of V. vulnificus in Chesapeake Bay. Forcing these empirical models with output from ocean modeling systems allows for spatially explicit forecasts for up to 48 h in the future. This study uses one of the largest data sets compiled to model Vibrio in an estuary, enhances our understanding of environmental correlates with abundance, distribution and presence of potentially virulent strains and offers a method to forecast these pathogens that may be replicated in other regions.
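A minimal sketch of the kind of empirical presence model described above, assuming a logistic regression on temperature, salinity, and turbidity with synthetic data and hypothetical coefficients; it is not the fitted Chesapeake Bay model.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic environmental data and presence/absence outcomes.
rng = np.random.default_rng(10)
n = 2000
temp = rng.uniform(5, 30, n)        # deg C
sal = rng.uniform(0, 25, n)         # practical salinity units
turb = rng.uniform(0, 40, n)        # NTU
logit_p = -8.0 + 0.35 * temp - 0.10 * sal + 0.02 * turb   # hypothetical effects
presence = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

X = sm.add_constant(np.column_stack([temp, sal, turb]))
fit = sm.Logit(presence, X).fit(disp=False)
print(fit.params)                   # recovered coefficients (const, temp, sal, turb)

# Forecast example: presence probability at 26 C, salinity 10, turbidity 15.
x_new = np.array([[1.0, 26.0, 10.0, 15.0]])
print("predicted presence probability:", round(float(fit.predict(x_new)[0]), 3))
```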
Variety and volatility in financial markets
NASA Astrophysics Data System (ADS)
Lillo, Fabrizio; Mantegna, Rosario N.
2000-11-01
We study the price dynamics of stocks traded in a financial market by considering the statistical properties of both a single time series and an ensemble of stocks traded simultaneously. We use the n stocks traded on the New York Stock Exchange to form a statistical ensemble of daily stock returns. For each trading day of our database, we study the ensemble return distribution. We find that a typical ensemble return distribution exists in most of the trading days with the exception of crash and rally days and of the days following these extreme events. We analyze each ensemble return distribution by extracting its first two central moments. We observe that these moments fluctuate in time and are stochastic processes, themselves. We characterize the statistical properties of ensemble return distribution central moments by investigating their probability density functions and temporal correlation properties. In general, time-averaged and portfolio-averaged price returns have different statistical properties. We infer from these differences information about the relative strength of correlation between stocks and between different trading days. Last, we compare our empirical results with those predicted by the single-index model and we conclude that this simple model cannot explain the statistical properties of the second moment of the ensemble return distribution.
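The ensemble (cross-sectional) moments described above can be computed with a few lines of NumPy; the synthetic one-factor returns below stand in for the NYSE daily return panel.

```python
import numpy as np

# Synthetic daily returns for n_stocks over n_days, with a common market factor.
rng = np.random.default_rng(11)
n_days, n_stocks = 1000, 500
market = rng.normal(0, 0.01, n_days)                       # common daily factor
returns = market[:, None] + rng.normal(0, 0.02, (n_days, n_stocks))

ensemble_mean = returns.mean(axis=1)      # per-day ensemble mean return
ensemble_std = returns.std(axis=1)        # per-day "variety" of returns

# Lag-1 autocorrelation of the variety, one simple measure of its memory.
v = ensemble_std - ensemble_std.mean()
print("lag-1 autocorrelation of variety:",
      round(float(np.dot(v[:-1], v[1:]) / np.dot(v, v)), 3))
```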
Transient statistics in stabilizing periodic orbits
NASA Astrophysics Data System (ADS)
Meucci, R.; Gadomski, W.; Ciofini, M.; Arecchi, F. T.
1995-11-01
The statistics of chaotic and periodic transient time intervals preceding the stabilization of a given periodic orbit have been experimentally studied in a CO2 laser with modulated losses, subjected to a small subharmonic perturbation. As predicted by the theory, an exponential tail has been found in the probability distribution of chaotic transients. Furthermore, a fine periodic structure in the distributions of the periodic transients, resulting from the interaction of the control signal and the local structure of the chaotic attractor, has been revealed.
Incorporating uncertainty in predictive species distribution modelling.
Beale, Colin M; Lennon, Jack J
2012-01-19
Motivated by the need to solve ecological problems (climate change, habitat fragmentation and biological invasions), there has been increasing interest in species distribution models (SDMs). Predictions from these models inform conservation policy, invasive species management and disease-control measures. However, predictions are subject to uncertainty, the degree and source of which are often unrecognized. Here, we review the SDM literature in the context of uncertainty, focusing on three main classes of SDM: niche-based models, demographic models and process-based models. We identify sources of uncertainty for each class and discuss how uncertainty can be minimized or included in the modelling process to give realistic measures of confidence around predictions. Because this has typically not been performed, we conclude that uncertainty in SDMs has often been underestimated and a false precision assigned to predictions of geographical distribution. We identify areas where development of new statistical tools will improve predictions from distribution models, notably the development of hierarchical models that link different types of distribution model and their attendant uncertainties across spatial scales. Finally, we discuss the need to develop more defensible methods for assessing predictive performance, quantifying model goodness-of-fit and for assessing the significance of model covariates.
Hierarchical spatial models for predicting pygmy rabbit distribution and relative abundance
Wilson, T.L.; Odei, J.B.; Hooten, M.B.; Edwards, T.C.
2010-01-01
Conservationists routinely use species distribution models to plan conservation, restoration and development actions, while ecologists use them to infer process from pattern. These models tend to work well for common or easily observable species, but are of limited utility for rare and cryptic species. This may be because honest accounting of known observation bias and spatial autocorrelation is rarely included, thereby limiting statistical inference of resulting distribution maps. We specified and implemented a spatially explicit Bayesian hierarchical model for a cryptic mammal species (pygmy rabbit Brachylagus idahoensis). Our approach used two levels of indirect sign that are naturally hierarchical (burrows and faecal pellets) to build a model that allows for inference on regression coefficients as well as spatially explicit model parameters. We also produced maps of rabbit distribution (occupied burrows) and relative abundance (number of burrows expected to be occupied by pygmy rabbits). The model demonstrated statistically rigorous spatial prediction by including spatial autocorrelation and measurement uncertainty. We demonstrated flexibility of our modelling framework by depicting probabilistic distribution predictions using different assumptions of pygmy rabbit habitat requirements. Spatial representations of the variance of posterior predictive distributions were obtained to evaluate heterogeneity in model fit across the spatial domain. Leave-one-out cross-validation was conducted to evaluate the overall model fit. Synthesis and applications. Our method draws on the strengths of previous work, thereby bridging and extending two active areas of ecological research: species distribution models and multi-state occupancy modelling. Our framework can be extended to encompass both larger extents and other species for which direct estimation of abundance is difficult. © 2010 The Authors. Journal compilation © 2010 British Ecological Society.
Mukhopadhyay, Nitai D; Sampson, Andrew J; Deniz, Daniel; Alm Carlsson, Gudrun; Williamson, Jeffrey; Malusek, Alexandr
2012-01-01
Correlated sampling Monte Carlo methods can shorten computing times in brachytherapy treatment planning. Monte Carlo efficiency is typically estimated via efficiency gain, defined as the reduction in computing time by correlated sampling relative to conventional Monte Carlo methods when equal statistical uncertainties have been achieved. The determination of the efficiency gain uncertainty arising from random effects, however, is not a straightforward task, especially when the error distribution is non-normal. The purpose of this study is to evaluate the applicability of the F distribution and standardized uncertainty propagation methods (widely used in metrology to estimate uncertainty of physical measurements) for predicting confidence intervals about efficiency gain estimates derived from single Monte Carlo runs using fixed-collision correlated sampling in a simplified brachytherapy geometry. A bootstrap based algorithm was used to simulate the probability distribution of the efficiency gain estimates and the shortest 95% confidence interval was estimated from this distribution. It was found that the corresponding relative uncertainty was as large as 37% for this particular problem. The uncertainty propagation framework predicted confidence intervals reasonably well; however its main disadvantage was that uncertainties of input quantities had to be calculated in a separate run via a Monte Carlo method. The F distribution noticeably underestimated the confidence interval. These discrepancies were influenced by several photons with large statistical weights which made extremely large contributions to the scored absorbed dose difference. The mechanism of acquiring high statistical weights in the fixed-collision correlated sampling method was explained and a mitigation strategy was proposed. Copyright © 2011 Elsevier Ltd. All rights reserved.
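A hedged sketch of the bootstrap part of this procedure, using a toy variance-ratio "efficiency gain" on synthetic per-history scores rather than the study's Monte Carlo output; the variable names, the toy statistic and the 5000-replicate shortest-interval search are assumptions for illustration:

```python
# Bootstrap the distribution of an efficiency-gain estimate and report the
# shortest 95% confidence interval, as described in the abstract above.
import numpy as np

rng = np.random.default_rng(1)
conventional = rng.normal(1.0, 0.30, size=2000)   # hypothetical per-history scores
correlated = rng.normal(1.0, 0.12, size=2000)

def efficiency_gain(a, b):
    # Toy definition: variance reduction of the correlated estimator
    return np.var(a, ddof=1) / np.var(b, ddof=1)

boot = np.empty(5000)
for i in range(boot.size):
    ia = rng.integers(0, conventional.size, conventional.size)
    ib = rng.integers(0, correlated.size, correlated.size)
    boot[i] = efficiency_gain(conventional[ia], correlated[ib])

boot.sort()
k = int(np.ceil(0.95 * boot.size))
widths = boot[k:] - boot[:boot.size - k]
lo = int(np.argmin(widths))                       # shortest 95% interval
print("gain estimate:", efficiency_gain(conventional, correlated))
print("shortest 95% CI:", (boot[lo], boot[lo + k]))
```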
NASA Astrophysics Data System (ADS)
Sergeenko, N. P.
2017-11-01
An adequate statistical method should be developed in order to probabilistically predict the range of ionospheric parameters. This problem is solved in this paper. Time series of the critical frequency of the F2 layer, foF2(t), were subjected to statistical processing. For the obtained samples {δfoF2}, statistical distributions and invariants up to the fourth order are calculated. The analysis shows that the distributions differ from the Gaussian law during disturbances: at sufficiently small probability levels there are arbitrarily large deviations from the normal-process model. Therefore, an attempt is made to describe the statistical samples {δfoF2} with a Poisson model. For the studied samples, an exponential characteristic function is selected under the assumption that the time series are a superposition of deterministic and random processes. Using the Fourier transform, the characteristic function is transformed into a nonholomorphic, excessive-asymmetric probability density function. The statistical distributions of the samples {δfoF2} calculated for the disturbed periods are compared with the obtained model distribution function. According to the Kolmogorov criterion, the probabilities of coincidence of the a posteriori distributions with the theoretical ones are P ≈ 0.7-0.9. This analysis supports the applicability of a model based on the Poisson random process for the statistical description and probabilistic estimation of the variations {δfoF2} during heliogeophysical disturbances.
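The moment calculation and the Kolmogorov-type comparison can be sketched as follows on synthetic heavy-tailed deviations (assumed data; the paper's excessive-asymmetric model distribution is replaced here by a fitted normal purely to show the mechanics):

```python
# Compute moments of a sample of foF2 deviations up to fourth order and test
# agreement with a fitted normal law via the Kolmogorov-Smirnov criterion.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
delta_fof2 = rng.standard_t(df=3, size=500) * 0.4   # hypothetical disturbed-period deviations

moments = dict(mean=np.mean(delta_fof2),
               var=np.var(delta_fof2, ddof=1),
               skew=stats.skew(delta_fof2),
               kurtosis=stats.kurtosis(delta_fof2))  # excess kurtosis
print(moments)

mu, sigma = delta_fof2.mean(), delta_fof2.std(ddof=1)
ks = stats.kstest(delta_fof2, 'norm', args=(mu, sigma))
print(f"KS statistic = {ks.statistic:.3f}, p-value = {ks.pvalue:.3f}")
```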
The statistical properties and possible causes of polar motion prediction errors
NASA Astrophysics Data System (ADS)
Kosek, Wieslaw; Kalarus, Maciej; Wnek, Agnieszka; Zbylut-Gorska, Maria
2015-08-01
The pole coordinate data predictions from different prediction contributors of the Earth Orientation Parameters Combination of Prediction Pilot Project (EOPCPPP) were studied to determine the statistical properties of polar motion forecasts by examining the time series of differences between them and the future IERS pole coordinate data. The mean absolute errors, standard deviations, skewness and kurtosis of these differences were computed together with their error bars as a function of prediction length. The ensemble predictions show slightly smaller mean absolute errors and standard deviations; however, their skewness and kurtosis values are similar to those of the predictions from the individual contributors. The skewness and kurtosis make it possible to check whether these prediction differences satisfy a normal distribution. The kurtosis values diminish with prediction length, which means that the probability distribution of the prediction differences becomes more platykurtic than leptokurtic. Nonzero skewness values result from the oscillating character of these differences at particular prediction lengths, which can be due to the irregular change of the annual oscillation phase in the joint fluid (atmospheric + ocean + land hydrology) excitation functions. The variations of the annual oscillation phase computed by a combination of the Fourier transform band-pass filter and the Hilbert transform from pole coordinate data, as well as from pole coordinate model data obtained from fluid excitations, are in good agreement.
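A minimal sketch of the error statistics described above, assuming the prediction-minus-observation differences are arranged as an array of shape (forecast, lead time); the synthetic values are not EOPCPPP data:

```python
# MAE, standard deviation, skewness and kurtosis as functions of prediction length.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_forecasts, n_leads = 200, 90
# Toy x-pole differences (mas), with error growing with lead time
diffs = rng.normal(0, np.linspace(0.5, 15, n_leads), size=(n_forecasts, n_leads))

mae = np.abs(diffs).mean(axis=0)
std = diffs.std(axis=0, ddof=1)
skew = stats.skew(diffs, axis=0)
kurt = stats.kurtosis(diffs, axis=0)        # excess kurtosis; ~0 for a normal law

for day in (1, 30, 90):
    i = day - 1
    print(f"day {day:2d}: MAE={mae[i]:6.2f}  SD={std[i]:6.2f}  "
          f"skew={skew[i]:+.2f}  kurtosis={kurt[i]:+.2f}")
```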
Vibration Response Models of a Stiffened Aluminum Plate Excited by a Shaker
NASA Technical Reports Server (NTRS)
Cabell, Randolph H.
2008-01-01
Numerical models of structural-acoustic interactions are of interest to aircraft designers and the space program. This paper describes a comparison between two energy finite element codes, a statistical energy analysis code, a structural finite element code, and the experimentally measured response of a stiffened aluminum plate excited by a shaker. Different methods for modeling the stiffeners and the power input from the shaker are discussed. The results show that the energy codes (energy finite element and statistical energy analysis) accurately predicted the measured mean square velocity of the plate. In addition, predictions from an energy finite element code had the best spatial correlation with measured velocities. However, predictions from a considerably simpler, single subsystem, statistical energy analysis model also correlated well with the spatial velocity distribution. The results highlight a need for further work to understand the relationship between modeling assumptions and the prediction results.
NASA Astrophysics Data System (ADS)
Zaichik, Leonid I.; Alipchenkov, Vladimir M.
2009-10-01
The purpose of this paper is twofold: (i) to advance and extend the statistical two-point models of pair dispersion and particle clustering in isotropic turbulence that were previously proposed by Zaichik and Alipchenkov (2003 Phys. Fluids 15 1776-87; 2007 Phys. Fluids 19 113308) and (ii) to present some applications of these models. The models developed are based on a kinetic equation for the two-point probability density function of the relative velocity distribution of two particles. These models predict the pair relative velocity statistics and the preferential accumulation of heavy particles in stationary and decaying homogeneous isotropic turbulent flows. Moreover, the models are applied to predict the effect of particle clustering on turbulent collisions, sedimentation and intensity of microwave radiation as well as to calculate the mean filtered subgrid stress of the particulate phase. Model predictions are compared with direct numerical simulations and experimental measurements.
Managing distribution changes in time series prediction
NASA Astrophysics Data System (ADS)
Matias, J. M.; Gonzalez-Manteiga, W.; Taboada, J.; Ordonez, C.
2006-07-01
When a problem is modeled statistically, a single distribution model is usually postulated that is assumed to be valid for the entire space. Nonetheless, this practice may be somewhat unrealistic in certain application areas, in which the conditions of the process that generates the data may change; as far as we are aware, however, no techniques have been developed to tackle this problem. This article proposes a technique for modeling and predicting this change in time series with a view to improving estimates and predictions. The technique is applied, among other models, to the recently proposed hypernormal distribution. When tested on real data from a range of stock market indices, the technique produces better results than when a single distribution model is assumed to be valid for the entire period of time studied. Moreover, when a global model is postulated, it is highly recommended to select the hypernormal distribution parameter in the same likelihood maximization process.
Modelling plant species distribution in alpine grasslands using airborne imaging spectroscopy
Pottier, Julien; Malenovský, Zbyněk; Psomas, Achilleas; Homolová, Lucie; Schaepman, Michael E.; Choler, Philippe; Thuiller, Wilfried; Guisan, Antoine; Zimmermann, Niklaus E.
2014-01-01
Remote sensing using airborne imaging spectroscopy (AIS) is known to retrieve fundamental optical properties of ecosystems. However, the value of these properties for predicting plant species distribution remains unclear. Here, we assess whether such data can add value to topographic variables for predicting plant distributions in French and Swiss alpine grasslands. We fitted statistical models with high spectral and spatial resolution reflectance data and tested four optical indices sensitive to leaf chlorophyll content, leaf water content and leaf area index. We found moderate added-value of AIS data for predicting alpine plant species distribution. Contrary to expectations, differences between species distribution models (SDMs) were not linked to their local abundance or phylogenetic/functional similarity. Moreover, spectral signatures of species were found to be partly site-specific. We discuss current limits of AIS-based SDMs, highlighting issues of scale and informational content of AIS data. PMID:25079495
A probabilistic approach to photovoltaic generator performance prediction
NASA Astrophysics Data System (ADS)
Khallat, M. A.; Rahman, S.
1986-09-01
A method for predicting the performance of a photovoltaic (PV) generator based on long term climatological data and expected cell performance is described. The equations for cell model formulation are provided. Use of the statistical model for characterizing the insolation level is discussed. The insolation data is fitted to appropriate probability distribution functions (Weibull, beta, normal). The probability distribution functions are utilized to evaluate the capacity factors of PV panels or arrays. An example is presented revealing the applicability of the procedure.
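The distribution-fitting step can be sketched as below (assumed insolation sample and a simple capped linear cell model, not the paper's cell equations):

```python
# Fit candidate probability distributions (Weibull, beta, normal) to an insolation
# sample, pick the best by KS statistic, and estimate a PV capacity factor.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
insolation = rng.beta(2.5, 1.8, size=2000) * 1000.0       # hypothetical W/m^2 sample

candidates = {"weibull": stats.weibull_min, "beta": stats.beta, "normal": stats.norm}
fits = {}
for name, dist in candidates.items():
    params = dist.fit(insolation)
    ks = stats.kstest(insolation, dist.cdf, args=params)
    fits[name] = (params, ks.statistic)
best = min(fits, key=lambda n: fits[n][1])
print("best-fitting distribution:", best)

# Capacity factor with an assumed linear cell model P = P_rated * (G / G_ref), capped at rating
params, _ = fits[best]
samples = candidates[best].rvs(*params, size=100_000, random_state=rng)
capacity_factor = np.clip(samples / 1000.0, 0, 1).mean()
print(f"estimated capacity factor: {capacity_factor:.2f}")
```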
Thematic and spatial resolutions affect model-based predictions of tree species distribution.
Liang, Yu; He, Hong S; Fraser, Jacob S; Wu, ZhiWei
2013-01-01
Subjective decisions of thematic and spatial resolutions in characterizing environmental heterogeneity may affect the characterizations of spatial pattern and the simulation of occurrence and rate of ecological processes, and in turn, model-based tree species distribution. Thus, this study quantified the importance of thematic and spatial resolutions, and their interaction in predictions of tree species distribution (quantified by species abundance). We investigated how model-predicted species abundances changed and whether tree species with different ecological traits (e.g., seed dispersal distance, competitive capacity) had different responses to varying thematic and spatial resolutions. We used the LANDIS forest landscape model to predict tree species distribution at the landscape scale and designed a series of scenarios with different thematic (different numbers of land types) and spatial resolutions combinations, and then statistically examined the differences of species abundance among these scenarios. Results showed that both thematic and spatial resolutions affected model-based predictions of species distribution, but thematic resolution had a greater effect. Species ecological traits affected the predictions. For species with moderate dispersal distance and relatively abundant seed sources, predicted abundance increased as thematic resolution increased. However, for species with long seeding distance or high shade tolerance, thematic resolution had an inverse effect on predicted abundance. When seed sources and dispersal distance were not limiting, the predicted species abundance increased with spatial resolution and vice versa. Results from this study may provide insights into the choice of thematic and spatial resolutions for model-based predictions of tree species distribution.
The application of the statistical theory of extreme values to gust-load problems
NASA Technical Reports Server (NTRS)
Press, Harry
1950-01-01
An analysis is presented which indicates that the statistical theory of extreme values is applicable to the problems of predicting the frequency of encountering the larger gust loads and gust velocities for both specific test conditions and commercial transport operations. The extreme-value theory provides an analytic form for the distributions of maximum values of gust load and velocity. Methods of fitting the distribution are given along with a method of estimating the reliability of the predictions. The theory of extreme values is applied to available load data from commercial transport operations. The results indicate that the estimates of the frequency of encountering the larger loads are more consistent with the data and more reliable than those obtained in previous analyses.
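A short sketch of the extreme-value idea on synthetic per-flight maximum loads (the Gumbel/Type I form and the parameter values are assumptions, not the NACA data):

```python
# Fit a Gumbel (Type I extreme value) distribution to per-flight maximum gust loads
# and extrapolate the frequency of exceeding larger loads.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
max_load_per_flight = stats.gumbel_r.rvs(loc=1.2, scale=0.25, size=300, random_state=rng)  # toy, g units

loc, scale = stats.gumbel_r.fit(max_load_per_flight)
for load in (1.5, 2.0, 2.5):
    p_exceed = stats.gumbel_r.sf(load, loc, scale)   # chance a flight's maximum exceeds this load
    print(f"P(max load > {load} g) per flight ~ {p_exceed:.4f}  (~1 in {1/p_exceed:.0f} flights)")
```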
Dynamic Modeling and Very Short-term Prediction of Wind Power Output Using Box-Cox Transformation
NASA Astrophysics Data System (ADS)
Urata, Kengo; Inoue, Masaki; Murayama, Dai; Adachi, Shuichi
2016-09-01
We propose a statistical modeling method for wind power output for very short-term prediction. The nonlinear model has a cascade structure composed of two parts. One is a linear dynamic part that is driven by Gaussian white noise and described by an autoregressive model. The other is a nonlinear static part that is driven by the output of the linear part. This nonlinear part is designed for output distribution matching: we shape the distribution of the model output to match that of the wind power output. The constructed model is utilized for one-step-ahead prediction of the wind power output. Furthermore, we study the relation between the prediction accuracy and the prediction horizon.
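A minimal sketch of the cascade structure described above, with assumed parameters: an AR(1) linear part driven by Gaussian white noise, followed by a static quantile-mapping stage that matches the output distribution to an empirical wind-power sample:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
wind_power = np.clip(rng.weibull(2.0, size=5000) * 40.0, 0, 100)  # hypothetical MW sample

# Linear dynamic part: AR(1) latent Gaussian process driven by white noise
phi = 0.9
z = np.zeros(wind_power.size)
noise = rng.normal(0, np.sqrt(1 - phi**2), size=wind_power.size)
for t in range(1, z.size):
    z[t] = phi * z[t - 1] + noise[t]

# Nonlinear static part: quantile mapping from the Gaussian state to the empirical power distribution
u = stats.norm.cdf(z)
power_model = np.quantile(wind_power, u)

# One-step-ahead prediction: propagate the AR part, then apply the same static map
z_pred = phi * z[-1]
p_pred = np.quantile(wind_power, stats.norm.cdf(z_pred))
print(f"one-step-ahead predicted power: {p_pred:.1f} MW")
```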
New advances in the statistical parton distributions approach
NASA Astrophysics Data System (ADS)
Soffer, Jacques; Bourrely, Claude
2016-03-01
The quantum statistical parton distributions approach proposed more than one decade ago is revisited by considering a larger set of recent and accurate Deep Inelastic Scattering experimental results. It enables us to improve the description of the data by means of a new determination of the parton distributions. This global next-to-leading order QCD analysis leads to a good description of several structure functions, involving unpolarized parton distributions and helicity distributions, in terms of a rather small number of free parameters. Many serious challenges remain. The predictions of this theoretical approach will be tested for single-jet production and charge asymmetry in W± production in p̄p and pp collisions up to LHC energies, using recent data and also forthcoming experimental results. Presented by J. Soffer at POETIC 2015.
Predicting future protection of respirator users: Statistical approaches and practical implications.
Hu, Chengcheng; Harber, Philip; Su, Jing
2016-01-01
The purpose of this article is to describe a statistical approach for predicting a respirator user's fit factor in the future based upon results from initial tests. A statistical prediction model was developed based upon the joint distribution of multiple fit factor measurements over time obtained from linear mixed effect models. The model accounts for within-subject correlation as well as short-term (within one day) and longer-term variability. As an example of applying this approach, model parameters were estimated from a research study in which volunteers were trained by three different modalities to use one of two types of respirators. They underwent two quantitative fit tests at the initial session and two on the same day approximately six months later. The fitted models demonstrated correlation and gave the estimated distribution of future fit test results conditional on past results for an individual worker. This approach can be applied to establishing a criterion value for passing an initial fit test to provide a reasonable likelihood that a worker will be adequately protected in the future, and to optimizing the repeat fit factor test intervals individually for each user for cost-effective testing.
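One way to sketch the prediction step: under a linear mixed-effects model on log fit factors, initial and future measurements are jointly normal, so the future value conditional on the initial tests is again normal. The variance components and fit-factor values below are illustrative assumptions, not the study's estimates:

```python
import numpy as np
from scipy import stats

mu = np.log(150.0)      # assumed mean log fit factor
var_subject = 0.30      # between-worker variance (assumed)
var_day = 0.10          # day-to-day variance within a worker (assumed)
var_test = 0.05         # within-day, test-to-test variance (assumed)

# Covariance of [initial test 1, initial test 2, future-day test] on the log scale
same_day = var_subject + var_day
cov = np.array([
    [same_day + var_test, same_day,            var_subject],
    [same_day,            same_day + var_test, var_subject],
    [var_subject,         var_subject,         same_day + var_test],
])
y_init = np.log([120.0, 140.0])          # hypothetical initial fit-factor results

# Conditional normal distribution of the future test given the initial pair
S11, S12, S22 = cov[:2, :2], cov[:2, 2], cov[2, 2]
w = np.linalg.solve(S11, S12)
cond_mean = mu + w @ (y_init - mu)
cond_sd = np.sqrt(S22 - S12 @ w)
print(f"predicted future fit factor (median): {np.exp(cond_mean):.0f}")
print(f"P(future fit factor > 100): {stats.norm.sf((np.log(100) - cond_mean) / cond_sd):.2f}")
```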
Rain attenuation measurements: Variability and data quality assessment
NASA Technical Reports Server (NTRS)
Crane, Robert K.
1989-01-01
Year to year variations in the cumulative distributions of rain rate or rain attenuation are evident in any of the published measurements for a single propagation path that span a period of several years of observation. These variations must be described by models for the prediction of rain attenuation statistics. Now that a large measurement data base has been assembled by the International Radio Consultative Committee, the information needed to assess variability is available. On the basis of 252 sample cumulative distribution functions for the occurrence of attenuation by rain, the expected year to year variation in attenuation at a fixed probability level in the 0.1 to 0.001 percent of a year range is estimated to be 27 percent. The expected deviation from an attenuation model prediction for a single year of observations is estimated to exceed 33 percent when any of the available global rain climate models are employed to estimate the rain rate statistics. The probability distribution for the variation in attenuation or rain rate at a fixed fraction of a year is lognormal. The lognormal behavior of the variate was used to compile the statistics for variability.
Ipsen, Andreas
2017-02-03
Here, the mass peak centroid is a quantity that is at the core of mass spectrometry (MS). However, despite its central status in the field, models of its statistical distribution are often chosen quite arbitrarily and without attempts at establishing a proper theoretical justification for their use. Recent work has demonstrated that for mass spectrometers employing analog-to-digital converters (ADCs) and electron multipliers, the statistical distribution of the mass peak intensity can be described via a relatively simple model derived essentially from first principles. Building on this result, the following article derives the corresponding statistical distribution for the mass peak centroids of such instruments. It is found that for increasing signal strength, the centroid distribution converges to a Gaussian distribution whose mean and variance are determined by physically meaningful parameters and which in turn determine bias and variability of the m/z measurements of the instrument. Through the introduction of the concept of “pulse-peak correlation”, the model also elucidates the complicated relationship between the shape of the voltage pulses produced by the preamplifier and the mean and variance of the centroid distribution. The predictions of the model are validated with empirical data and with Monte Carlo simulations.
Earthquake number forecasts testing
NASA Astrophysics Data System (ADS)
Kagan, Yan Y.
2017-10-01
We study the distributions of earthquake numbers in two global earthquake catalogues: Global Centroid-Moment Tensor and Preliminary Determinations of Epicenters. The properties of these distributions are needed in particular to develop the number test for our forecasts of future seismic activity rate, tested by the Collaboratory for Study of Earthquake Predictability (CSEP). A common assumption, as used in the CSEP tests, is that the numbers are described by the Poisson distribution. It is clear, however, that the Poisson assumption for the earthquake number distribution is incorrect, especially for the catalogues with a lower magnitude threshold. In contrast to the one-parameter Poisson distribution so widely used to describe earthquake occurrences, the negative-binomial distribution (NBD) has two parameters. The second parameter can be used to characterize the clustering or overdispersion of a process. We also introduce and study a more complex three-parameter beta negative-binomial distribution. We investigate the dependence of parameters for both Poisson and NBD distributions on the catalogue magnitude threshold and on temporal subdivision of catalogue duration. First, we study whether the Poisson law can be statistically rejected for various catalogue subdivisions. We find that for most cases of interest, the Poisson distribution can be rejected statistically at a high significance level in favour of the NBD. Thereafter, we investigate whether these distributions fit the observed distributions of seismicity. For this purpose, we study upper statistical moments of earthquake numbers (skewness and kurtosis) and compare them to the theoretical values for both distributions. Empirical values for the skewness and the kurtosis increase for the smaller magnitude threshold and increase with even greater intensity for small temporal subdivision of catalogues. The Poisson distribution for large rate values approaches the Gaussian law, therefore its skewness and kurtosis both tend to zero for large earthquake rates: for the Gaussian law, these values are identically zero. A calculation of the NBD skewness and kurtosis levels based on the values of the first two statistical moments of the distribution shows a rapid increase in these upper moments. However, the observed catalogue values of skewness and kurtosis rise even faster. This means that for small time intervals, the earthquake number distribution is even more heavy-tailed than the NBD predicts. Therefore for small time intervals, we propose using empirical number distributions appropriately smoothed for testing forecasted earthquake numbers.
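A rough sketch, on synthetic overdispersed counts, of the Poisson-versus-NBD comparison described above (a method-of-moments NBD fit and sample skewness/kurtosis only; not the catalogue analysis itself):

```python
# Compare Poisson and negative binomial fits to counts per time interval.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
counts = rng.negative_binomial(n=3, p=0.3, size=400)    # toy overdispersed earthquake counts

# Poisson fit (MLE is the sample mean)
lam = counts.mean()
ll_pois = stats.poisson.logpmf(counts, lam).sum()

# Negative binomial fit by a crude method-of-moments estimate
mean, var = counts.mean(), counts.var(ddof=1)
p_hat = mean / var
n_hat = mean * p_hat / (1 - p_hat)
ll_nbd = stats.nbinom.logpmf(counts, n_hat, p_hat).sum()

print(f"log-likelihood: Poisson {ll_pois:.1f}, NBD {ll_nbd:.1f}")
print(f"sample skewness {stats.skew(counts):.2f}, excess kurtosis {stats.kurtosis(counts):.2f}")
```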
Predictive onboard flow control for packet switching satellites
NASA Technical Reports Server (NTRS)
Bobinsky, Eric A.
1992-01-01
We outline two alternate approaches to predicting the onset of congestion in a packet switching satellite, and argue that predictive, rather than reactive, flow control is necessary for the efficient operation of such a system. The first method discussed is based on standard statistical techniques which are used to periodically calculate a probability of near-term congestion based on arrival rate statistics. If this probability exceeds a preset threshold, the satellite would transmit a rate-reduction signal to all active ground stations. The second method discussed would utilize a neural network to periodically predict the occurrence of buffer overflow based on input data which would include, in addition to arrival rates, the distributions of packet lengths, source addresses, and destination addresses.
NASA Astrophysics Data System (ADS)
Hapca, Simona
2015-04-01
Many soil properties and functions emerge from interactions of physical, chemical and biological processes at microscopic scales, which can be understood only by integrating techniques that traditionally are developed within separate disciplines. While recent advances in imaging techniques, such as X-ray computed tomography (X-ray CT), offer the possibility to reconstruct the 3D physical structure at fine resolutions, for the distribution of chemicals in soil, existing methods, based on scanning electron microscope (SEM) and energy dispersive X-ray detection (EDX), allow for characterization of the chemical composition only on 2D surfaces. At present, direct 3D measurement techniques are still lacking; sequential sectioning of soils, followed by 2D mapping of chemical elements and interpolation to 3D, is an alternative which is explored in this study. Specifically, we develop an integrated experimental and theoretical framework which combines the 3D X-ray CT imaging technique with 2D SEM-EDX and uses spatial statistics methods to map the chemical composition of soil in 3D. The procedure involves three stages: (1) scanning a resin-impregnated soil cube by X-ray CT, followed by precision cutting to produce parallel thin slices, the surfaces of which are scanned by SEM-EDX, (2) alignment of the 2D chemical maps within the internal 3D structure of the soil cube, and (3) development of spatial statistics methods to predict the chemical composition of 3D soil based on the observed 2D chemical and 3D physical data. Specifically, three statistical models consisting of a regression tree, a regression tree kriging and a cokriging model were used to predict the 3D spatial distribution of carbon, silicon, iron and oxygen in soil, these chemical elements showing a good spatial agreement between the X-ray grayscale intensities and the corresponding 2D SEM-EDX data. Due to the spatial correlation between the physical and chemical data, the regression-tree model showed great potential in predicting chemical composition, in particular for iron, which is generally sparsely distributed in soil. For carbon, silicon and oxygen, which are more densely distributed, the additional kriging of the regression tree residuals improved the prediction significantly, whereas prediction based on cokriging was less consistent across replicates, underperforming regression-tree kriging. The present study shows great potential for integrating geostatistical methods with imaging techniques to unveil the 3D chemical structure of soil at very fine scales, the framework being suitable for further application to other types of imaging data, such as images of biological thin sections for characterization of microbial distribution. Key words: X-ray CT, SEM-EDX, segmentation techniques, spatial correlation, 3D soil images, 2D chemical maps.
Generalized statistical mechanics of cosmic rays: Application to positron-electron spectral indices.
Yalcin, G Cigdem; Beck, Christian
2018-01-29
Cosmic ray energy spectra exhibit power law distributions over many orders of magnitude that are very well described by the predictions of q-generalized statistical mechanics, based on a q-generalized Hagedorn theory for transverse momentum spectra and hard QCD scattering processes. QCD at the largest center-of-mass energies predicts the entropic index to be [Formula: see text]. Here we show that the escort duality of the nonextensive thermodynamic formalism predicts an energy split of effective temperature given by Δ [Formula: see text] MeV, where T_H is the Hagedorn temperature. We carefully analyse the measured data of the AMS-02 collaboration and provide evidence that the predicted temperature split is indeed observed, leading to a different energy dependence of the e+ and e- spectral indices. We also observe a distinguished energy scale E* ≈ 50 GeV where the e+ and e- spectral indices differ the most. Linear combinations of the escort and non-escort q-generalized canonical distributions yield excellent agreement with the measured AMS-02 data in the entire energy range.
Predicting species richness and distribution ranges of centipedes at the northern edge of Europe
NASA Astrophysics Data System (ADS)
Georgopoulou, Elisavet; Djursvoll, Per; Simaiakis, Stylianos M.
2016-07-01
In recent decades, interest in understanding species distributions and exploring processes that shape species diversity has increased, leading to the development of advanced methods for the exploitation of occurrence data for analytical and ecological purposes. Here, with the use of georeferenced centipede data, we explore the importance and contribution of bioclimatic variables and land cover, and predict distribution ranges and potential hotspots in Norway. We used a maximum entropy analysis (Maxent) to model species' distributions, aiming at exploring centres of distribution, latitudinal spans and northern range boundaries of centipedes in Norway. The performance of all Maxent models was better than random with average test area under the curve (AUC) values above 0.893 and True Skill Statistic (TSS) values above 0.593. Our results showed a highly significant latitudinal gradient of increased species richness in southern grid-cells. Mean temperatures of warmest and coldest quarters explained much of the potential distribution of species. Predictive modelling analyses revealed that south-eastern Norway and the Atlantic coast in the west (inclusive of the major fjord system of Sognefjord), are local biodiversity hotspots with regard to high predictive species co-occurrence. We conclude that our predicted northward shifts of centipedes' distributions in Norway are likely a result of post-glacial recolonization patterns, species' ecological requirements and dispersal abilities.
Monitoring Method of Cow Anthrax Based on Gis and Spatial Statistical Analysis
NASA Astrophysics Data System (ADS)
Li, Lin; Yang, Yong; Wang, Hongbin; Dong, Jing; Zhao, Yujun; He, Jianbin; Fan, Honggang
Geographic information system (GIS) is a computer application system which possesses the ability to manipulate spatial information and has been used in many fields related to spatial information management. Many methods and models have been established for analyzing animal disease distribution and temporal-spatial transmission. Great benefits have been gained from the application of GIS in animal disease epidemiology, and GIS is now a very important tool in animal disease epidemiological research. The spatial analysis function of GIS can be widened and strengthened by using spatial statistical analysis, allowing for deeper exploration, analysis, manipulation and interpretation of the spatial pattern and spatial correlation of animal disease. In this paper, we analyzed the spatial distribution characteristics of cow anthrax in the target district A (called district A because the epidemic data are confidential), based on the established GIS of cow anthrax in this district and a combination of spatial statistical analysis and GIS. Cow anthrax is a biogeochemical disease whose geographical distribution is related closely to the environmental factors of habitats and has spatial characteristics; correct analysis of the spatial distribution of cow anthrax therefore plays a very important role in monitoring, prevention and control. However, the application of classic statistical methods in some areas is very difficult because of the pastoral nomadic context. The high mobility of livestock and the lack of suitable sampling currently make it nearly impossible to apply rigorous random sampling methods. It is thus necessary to develop an alternative sampling method which could overcome the lack of sampling and meet the requirements for randomness. The GIS software ArcGIS 9.1 was used to overcome the lack of data on sampling sites. Using ArcGIS 9.1 and GeoDa to analyze the spatial distribution of cow anthrax in district A, we reached two conclusions about cow anthrax density: (1) it follows a spatial clustering pattern, and (2) it exhibits intense spatial autocorrelation. We established a prediction model to estimate the anthrax distribution based on the spatial characteristics of cow anthrax density. Compared with the true distribution, the prediction model shows good agreement and is feasible in application. The GIS-based method can facilitate cow anthrax monitoring and investigation considerably, and the spatial-statistics-based prediction model provides a foundation for studies of other spatially related animal diseases.
Probability distributions of the electroencephalogram envelope of preterm infants.
Saji, Ryoya; Hirasawa, Kyoko; Ito, Masako; Kusuda, Satoshi; Konishi, Yukuo; Taga, Gentaro
2015-06-01
To determine the stationary characteristics of electroencephalogram (EEG) envelopes for prematurely born (preterm) infants and investigate the intrinsic characteristics of early brain development in preterm infants. Twenty neurologically normal sets of EEGs recorded in infants with a post-conceptional age (PCA) range of 26-44 weeks (mean 37.5 ± 5.0 weeks) were analyzed. Hilbert transform was applied to extract the envelope. We determined the suitable probability distribution of the envelope and performed a statistical analysis. It was found that (i) the probability distributions for preterm EEG envelopes were best fitted by lognormal distributions at 38 weeks PCA or less, and by gamma distributions at 44 weeks PCA; (ii) the scale parameter of the lognormal distribution had positive correlations with PCA as well as a strong negative correlation with the percentage of low-voltage activity; (iii) the shape parameter of the lognormal distribution had significant positive correlations with PCA; (iv) the statistics of mode showed significant linear relationships with PCA, and, therefore, it was considered a useful index in PCA prediction. These statistics, including the scale parameter of the lognormal distribution and the skewness and mode derived from a suitable probability distribution, may be good indexes for estimating stationary nature in developing brain activity in preterm infants. The stationary characteristics, such as discontinuity, asymmetry, and unimodality, of preterm EEGs are well indicated by the statistics estimated from the probability distribution of the preterm EEG envelopes. Copyright © 2014 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
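The envelope-extraction and distribution-fitting pipeline can be sketched as follows on synthetic band-limited noise (not preterm EEG data); the filter settings are assumptions:

```python
# Extract the signal envelope with the Hilbert transform, then compare lognormal
# and gamma fits to the envelope distribution via KS statistics.
import numpy as np
from scipy import stats, signal

rng = np.random.default_rng(8)
fs = 250.0
t = np.arange(0, 60, 1 / fs)
raw = rng.normal(size=t.size)
b, a = signal.butter(4, [1, 20], btype="bandpass", fs=fs)
eeg = signal.filtfilt(b, a, raw)                     # toy band-limited "EEG"

envelope = np.abs(signal.hilbert(eeg))

# Fit candidate distributions (location fixed at zero) and compare goodness of fit
ln_shape, _, ln_scale = stats.lognorm.fit(envelope, floc=0)
g_shape, _, g_scale = stats.gamma.fit(envelope, floc=0)
ks_ln = stats.kstest(envelope, "lognorm", args=(ln_shape, 0, ln_scale)).statistic
ks_g = stats.kstest(envelope, "gamma", args=(g_shape, 0, g_scale)).statistic
print(f"lognormal KS = {ks_ln:.3f}, gamma KS = {ks_g:.3f}")
```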
Prediction of Down-Gradient Impacts of DNAPL Source Depletion Using Tracer Techniques
NASA Astrophysics Data System (ADS)
Basu, N. B.; Fure, A. D.; Jawitz, J. W.
2006-12-01
Four simplified DNAPL source depletion models that have been discussed in the literature recently are evaluated for the prediction of long-term effects of source depletion under natural gradient flow. These models are simple in form (a power function equation is an example) but are shown here to serve as mathematical analogs to complex multiphase flow and transport simulators. One of the source depletion models, the equilibrium streamtube model, is shown to be relatively easily parameterized using non-reactive and reactive tracers. Non-reactive tracers are used to characterize the aquifer heterogeneity while reactive tracers are used to describe the mean DNAPL mass and its distribution. This information is then used in a Lagrangian framework to predict source remediation performance. In a Lagrangian approach the source zone is conceptualized as a collection of non-interacting streamtubes with hydrodynamic and DNAPL heterogeneity represented by the variation of the travel time and DNAPL saturation among the streamtubes. The travel time statistics are estimated from the non-reactive tracer data while the DNAPL distribution statistics are estimated from the reactive tracer data. The combined statistics are used to define an analytical solution for contaminant dissolution under natural gradient flow. The tracer prediction technique compared favorably with results from a multiphase flow and transport simulator UTCHEM in domains with different hydrodynamic heterogeneity (variance of the log conductivity field = 0.2, 1 and 3).
Pinning time statistics for vortex lines in disordered environments.
Dobramysl, Ulrich; Pleimling, Michel; Täuber, Uwe C
2014-12-01
We study the pinning dynamics of magnetic flux (vortex) lines in a disordered type-II superconductor. Using numerical simulations of a directed elastic line model, we extract the pinning time distributions of vortex line segments. We compare different model implementations for the disorder in the surrounding medium: discrete, localized pinning potential wells that are either attractive and repulsive or purely attractive, and whose strengths are drawn from a Gaussian distribution; as well as continuous Gaussian random potential landscapes. We find that both schemes yield power-law distributions in the pinned phase as predicted by extreme-event statistics, yet they differ significantly in their effective scaling exponents and their short-time behavior.
Chaos and Forecasting - Proceedings of the Royal Society Discussion Meeting
NASA Astrophysics Data System (ADS)
Tong, Howell
1995-04-01
The Table of Contents for the full book PDF is as follows: * Preface * Orthogonal Projection, Embedding Dimension and Sample Size in Chaotic Time Series from a Statistical Perspective * A Theory of Correlation Dimension for Stationary Time Series * On Prediction and Chaos in Stochastic Systems * Locally Optimized Prediction of Nonlinear Systems: Stochastic and Deterministic * A Poisson Distribution for the BDS Test Statistic for Independence in a Time Series * Chaos and Nonlinear Forecastability in Economics and Finance * Paradigm Change in Prediction * Predicting Nonuniform Chaotic Attractors in an Enzyme Reaction * Chaos in Geophysical Fluids * Chaotic Modulation of the Solar Cycle * Fractal Nature in Earthquake Phenomena and its Simple Models * Singular Vectors and the Predictability of Weather and Climate * Prediction as a Criterion for Classifying Natural Time Series * Measuring and Characterising Spatial Patterns, Dynamics and Chaos in Spatially-Extended Dynamical Systems and Ecologies * Non-Linear Forecasting and Chaos in Ecology and Epidemiology: Measles as a Case Study
Statistical mechanics of economics I
NASA Astrophysics Data System (ADS)
Kusmartsev, F. V.
2011-02-01
We show that statistical mechanics is useful in the description of financial crises and economics. Taking a large number of instant snapshots of a market over an interval of time, we construct their ensembles and study their statistical interference. This results in a probability description of the market and gives capital, money, income, wealth and debt distributions, which in most cases take the form of the Bose-Einstein distribution. In addition, statistical mechanics provides the main market equations and laws which govern the correlations between the amount of money, debt, product, prices and number of retailers. We applied the relations found to a study of the evolution of the US economy between 1996 and 2008 and observe that over that time the income of the majority of the population is well described by the Bose-Einstein distribution, whose parameters are different for each year. Each financial crisis corresponds to a peak in the absolute activity coefficient. The analysis correctly indicates the past crises and predicts the future one.
Evaluation of probabilistic forecasts with the scoringRules package
NASA Astrophysics Data System (ADS)
Jordan, Alexander; Krüger, Fabian; Lerch, Sebastian
2017-04-01
Over the last decades, probabilistic forecasts in the form of predictive distributions have become popular in many scientific disciplines. With the proliferation of probabilistic models arises the need for decision-theoretically principled tools to evaluate the appropriateness of models and forecasts in a generalized way in order to better understand sources of prediction errors and to improve the models. Proper scoring rules are functions S(F,y) which evaluate the accuracy of a forecast distribution F, given that an outcome y was observed. In coherence with decision-theoretical principles, they allow comparison of alternative models, a crucial ability given the variety of theories, data sources and statistical specifications available in many situations. This contribution presents the software package scoringRules for the statistical programming language R, which provides functions to compute popular scoring rules such as the continuous ranked probability score for a variety of distributions F that come up in applied work. For univariate variables, two main classes are parametric distributions like normal, t, or gamma distributions, and distributions that are not known analytically, but are indirectly described through a sample of simulation draws. For example, ensemble weather forecasts take this form. The scoringRules package aims to be a convenient dictionary-like reference for computing scoring rules. We offer state of the art implementations of several known (but not routinely applied) formulas, and implement closed-form expressions that were previously unavailable. Whenever more than one implementation variant exists, we offer statistically principled default choices. Recent developments include the addition of scoring rules to evaluate multivariate forecast distributions. The use of the scoringRules package is illustrated in an example on post-processing ensemble forecasts of temperature.
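For a sample (ensemble) forecast, the continuous ranked probability score mentioned above can be computed from its kernel representation CRPS(F, y) = E|X - y| - 0.5 E|X - X'| with X, X' ~ F; the sketch below is a plain implementation of that formula, not the scoringRules code:

```python
import numpy as np

def crps_sample(ensemble, y):
    # Kernel form of the CRPS for a forecast given by a finite sample
    ensemble = np.asarray(ensemble, dtype=float)
    term1 = np.abs(ensemble - y).mean()
    term2 = 0.5 * np.abs(ensemble[:, None] - ensemble[None, :]).mean()
    return term1 - term2

rng = np.random.default_rng(9)
forecast = rng.normal(20.0, 2.0, size=50)       # hypothetical temperature ensemble, deg C
print(f"CRPS = {crps_sample(forecast, 22.5):.3f}")
```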
Reliability formulation for the strength and fire endurance of glued-laminated beams
D. A. Bender
A model was developed for predicting the statistical distribution of glued-laminated beam strength and stiffness under normal temperature conditions using available long span modulus of elasticity data, end joint tension test data, and tensile strength data for laminating-grade lumber. The beam strength model predictions compared favorably with test data for glued-...
Brooke L. Bateman; Anna M. Pidgeon; Volker C. Radeloff; Curtis H. Flather; Jeremy VanDerWal; H. Resit Akcakaya; Wayne E. Thogmartin; Thomas P. Albright; Stephen J. Vavrus; Patricia J. Heglund
2016-01-01
Climate conditions, such as temperature or precipitation, averaged over several decades strongly affect species distributions, as evidenced by experimental results and a plethora of models demonstrating statistical relations between species occurrences and long-term climate averages. However, long-term averages can conceal climate changes that have occurred in...
A superstatistical model of metastasis and cancer survival
NASA Astrophysics Data System (ADS)
Leon Chen, L.; Beck, Christian
2008-05-01
We introduce a superstatistical model for the progression statistics of malignant cancer cells. The metastatic cascade is modeled as a complex nonequilibrium system with several macroscopic pathways and inverse-chi-square distributed parameters of the underlying Poisson processes. The predictions of the model are in excellent agreement with observed survival-time probability distributions of breast cancer patients.
Statistics based sampling for controller and estimator design
NASA Astrophysics Data System (ADS)
Tenne, Dirk
The purpose of this research is the development of statistical design tools for robust feed-forward/feedback controllers and nonlinear estimators. This dissertation is threefold and addresses the aforementioned topics: nonlinear estimation, target tracking and robust control. To develop statistically robust controllers and nonlinear estimation algorithms, research has been performed to extend existing techniques, which propagate the statistics of the state, to achieve higher order accuracy. The so-called unscented transformation has been extended to capture higher order moments. Furthermore, higher order moment update algorithms based on a truncated power series have been developed. The proposed techniques are tested on various benchmark examples. Furthermore, the unscented transformation has been utilized to develop a three dimensional geometrically constrained target tracker. The proposed planar circular prediction algorithm has been developed in a local coordinate framework, which is amenable to extension of the tracking algorithm to three dimensional space. This tracker combines the predictions of a circular prediction algorithm and a constant velocity filter by utilizing the Covariance Intersection. This combined prediction can be updated with the subsequent measurement using a linear estimator. The proposed technique is illustrated on a 3D benchmark trajectory, which includes coordinated turns and straight line maneuvers. The third part of this dissertation addresses the design of controllers which include knowledge of parametric uncertainties and their distributions. The parameter distributions are approximated by a finite set of points which are calculated by the unscented transformation. This set of points is used to design robust controllers which minimize a statistical performance measure of the plant over the domain of uncertainty, consisting of a combination of the mean and variance. The proposed technique is illustrated on three benchmark problems. The first relates to the design of prefilters for a linear and nonlinear spring-mass-dashpot system and the second applies a feedback controller to a hovering helicopter. Lastly, the statistically robust controller design is applied to a concurrent feed-forward/feedback controller structure for a high-speed, low-tension tape drive.
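For reference, a minimal sketch of the standard unscented transformation that the higher-order extensions described above build on (2n+1 sigma points propagating a Gaussian mean and covariance through a nonlinearity); the polar-to-Cartesian example and the kappa value are assumptions:

```python
import numpy as np

def unscented_transform(mean, cov, f, kappa=1.0):
    # Standard 2n+1 sigma-point unscented transformation
    n = mean.size
    sqrt_cov = np.linalg.cholesky((n + kappa) * cov)
    sigma = np.vstack([mean, mean + sqrt_cov.T, mean - sqrt_cov.T])
    w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    y = np.array([f(s) for s in sigma])          # propagate each sigma point
    y_mean = w @ y
    y_cov = (w[:, None] * (y - y_mean)).T @ (y - y_mean)
    return y_mean, y_cov

# Example: polar-to-Cartesian conversion of an uncertain range/bearing measurement
mean = np.array([10.0, np.pi / 4])
cov = np.diag([0.25, 0.01])
to_xy = lambda s: np.array([s[0] * np.cos(s[1]), s[0] * np.sin(s[1])])
m, P = unscented_transform(mean, cov, to_xy, kappa=1.0)
print("transformed mean:", m, "\ncovariance:\n", P)
```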
ERIC Educational Resources Information Center
Sharma, Kshitij; Chavez-Demoulin, Valérie; Dillenbourg, Pierre
2017-01-01
The statistics used in education research are based on central trends such as the mean or standard deviation, discarding outliers. This paper adopts another viewpoint that has emerged in statistics, called extreme value theory (EVT). EVT claims that the bulk of normal distribution is comprised mainly of uninteresting variations while the most…
Lee, L.; Helsel, D.
2005-01-01
Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. © 2005 Elsevier Ltd. All rights reserved.
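A simplified sketch of the ROS idea for a single detection limit (the tools described above handle multiple detection limits and use more careful plotting positions); the data and detection limit are synthetic:

```python
# Regression on order statistics: regress log concentrations of uncensored values
# on normal quantiles, then impute the censored portion from the fitted line.
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
true = rng.lognormal(mean=0.0, sigma=1.0, size=60)
dl = 0.5
observed = np.where(true < dl, np.nan, true)            # NaN marks "< DL" values
n, n_cens = observed.size, int(np.isnan(observed).sum())

# Plotting positions: with one detection limit, uncensored values occupy the upper ranks
uncensored = np.sort(observed[~np.isnan(observed)])
pp = np.arange(n_cens + 1, n + 1) / (n + 1)
q = stats.norm.ppf(pp)

# Fit the line and impute the censored part at the lower plotting positions
slope, intercept, *_ = stats.linregress(q, np.log(uncensored))
q_cens = stats.norm.ppf(np.arange(1, n_cens + 1) / (n + 1))
imputed = np.exp(intercept + slope * q_cens)

full = np.concatenate([imputed, uncensored])
print(f"ROS mean = {full.mean():.3f}, true sample mean = {true.mean():.3f}")
```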
A perceptual space of local image statistics.
Victor, Jonathan D; Thengone, Daniel J; Rizvi, Syed M; Conte, Mary M
2015-12-01
Local image statistics are important for visual analysis of textures, surfaces, and form. There are many kinds of local statistics, including those that capture luminance distributions, spatial contrast, oriented segments, and corners. While sensitivity to each of these kinds of statistics has been well studied, much less is known about visual processing when multiple kinds of statistics are relevant, in large part because the dimensionality of the problem is high and different kinds of statistics interact. To approach this problem, we focused on binary images on a square lattice - a reduced set of stimuli which nevertheless taps many kinds of local statistics. In this 10-parameter space, we determined psychophysical thresholds to each kind of statistic (16 observers) and all of their pairwise combinations (4 observers). Sensitivities and isodiscrimination contours were consistent across observers. Isodiscrimination contours were elliptical, implying a quadratic interaction rule, which in turn determined ellipsoidal isodiscrimination surfaces in the full 10-dimensional space, and made predictions for sensitivities to complex combinations of statistics. These predictions, including the prediction of a combination of statistics that was metameric to random, were verified experimentally. Finally, check size had only a mild effect on sensitivities over the range from 2.8 to 14min, but sensitivities to second- and higher-order statistics was substantially lower at 1.4min. In sum, local image statistics form a perceptual space that is highly stereotyped across observers, in which different kinds of statistics interact according to simple rules. Copyright © 2015 Elsevier Ltd. All rights reserved.
Use of model calibration to achieve high accuracy in analysis of computer networks
Frogner, Bjorn; Guarro, Sergio; Scharf, Guy
2004-05-11
A system and method are provided for creating a network performance prediction model, and calibrating the prediction model, through application of network load statistical analyses. The method includes characterizing the measured load on the network, which may include background load data obtained over time, and may further include directed load data representative of a transaction-level event. Probabilistic representations of load data are derived to characterize the statistical persistence of the network performance variability and to determine delays throughout the network. The probabilistic representations are applied to the network performance prediction model to adapt the model for accurate prediction of network performance. Certain embodiments of the method and system may be used for analysis of the performance of a distributed application characterized as data packet streams.
Random walk to a nonergodic equilibrium concept
NASA Astrophysics Data System (ADS)
Bel, G.; Barkai, E.
2006-01-01
Random walk models, such as the trap model, continuous time random walks, and comb models, exhibit weak ergodicity breaking, when the average waiting time is infinite. The open question is, what statistical mechanical theory replaces the canonical Boltzmann-Gibbs theory for such systems? In this paper a nonergodic equilibrium concept is investigated, for a continuous time random walk model in a potential field. In particular we show that in the nonergodic phase the distribution of the occupation time of the particle in a finite region of space approaches U- or W-shaped distributions related to the arcsine law. We show that when conditions of detailed balance are applied, these distributions depend on the partition function of the problem, thus establishing a relation between the nonergodic dynamics and canonical statistical mechanics. In the ergodic phase the distribution function of the occupation times approaches a δ function centered on the value predicted based on standard Boltzmann-Gibbs statistics. The relation of our work to single-molecule experiments is briefly discussed.
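The occupation-time statistics described above can be illustrated with a toy simulation. The following Python sketch (all parameters assumed) draws power-law waiting times with infinite mean for a symmetric two-state process and histograms the fraction of time spent in one state; for an exponent of 1/2 the histogram should approach the U-shaped arcsine form.

import numpy as np

rng = np.random.default_rng(0)
alpha, T, n_runs = 0.5, 1.0e5, 1000     # waiting-time exponent, time horizon, trials (assumed)

def occupation_fraction():
    # Symmetric two-state process with power-law waiting times (infinite mean for alpha < 1).
    t, time_plus = 0.0, 0.0
    state = rng.choice([-1, 1])
    while t < T:
        wait = rng.pareto(alpha) + 1.0   # heavy-tailed waiting time
        if state == 1:
            time_plus += min(wait, T - t)
        t += wait
        state = -state
    return time_plus / T

fractions = np.array([occupation_fraction() for _ in range(n_runs)])
# For alpha = 1/2 the histogram is U-shaped, approaching the arcsine (Beta(1/2,1/2)) law.
hist, _ = np.histogram(fractions, bins=10, range=(0.0, 1.0), density=True)
print(np.round(hist, 2))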
Applications of Bayesian Statistics to Problems in Gamma-Ray Bursts
NASA Technical Reports Server (NTRS)
Meegan, Charles A.
1997-01-01
This presentation will describe two applications of Bayesian statistics to Gamma Ray Bursts (GRBs). The first attempts to quantify the evidence for a cosmological versus galactic origin of GRBs using only the observations of the dipole and quadrupole moments of the angular distribution of bursts. The cosmological hypothesis predicts isotropy, while the galactic hypothesis is assumed to produce a uniform probability distribution over positive values for these moments. The observed isotropic distribution indicates that the Bayes factor for the cosmological hypothesis over the galactic hypothesis is about 300. Another application of Bayesian statistics is in the estimation of chance associations of optical counterparts with galaxies. The Bayesian approach is preferred to frequentist techniques here because the Bayesian approach easily accounts for galaxy mass distributions and because one can incorporate three disjoint hypotheses: (1) bursts come from galactic centers, (2) bursts come from galaxies in proportion to luminosity, and (3) bursts do not come from external galaxies. This technique was used in the analysis of the optical counterpart to GRB970228.
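A minimal numerical sketch of the kind of Bayes-factor comparison described above, with entirely hypothetical numbers: the cosmological hypothesis fixes the true moment at zero, while the galactic hypothesis spreads it uniformly over positive values.

import numpy as np
from scipy import integrate, stats

# Hypothetical observed dipole-moment statistic and its Gaussian measurement uncertainty.
d_obs, sigma = 0.005, 0.01
m_max = 0.3                              # assumed upper bound for the galactic (uniform) prior

# Cosmological hypothesis: true moment is zero (isotropy).
like_cosmo = stats.norm.pdf(d_obs, loc=0.0, scale=sigma)

# Galactic hypothesis: true moment uniform over (0, m_max); marginalize the likelihood.
like_gal, _ = integrate.quad(
    lambda m: stats.norm.pdf(d_obs, loc=m, scale=sigma) / m_max, 0.0, m_max)

print("Bayes factor (cosmological / galactic):", round(like_cosmo / like_gal, 2))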
Velocity bias in the distribution of dark matter halos
NASA Astrophysics Data System (ADS)
Baldauf, Tobias; Desjacques, Vincent; Seljak, Uroš
2015-12-01
The standard formalism for the coevolution of halos and dark matter predicts that any initial halo velocity bias rapidly decays to zero. We argue that, when the purpose is to compute statistics like power spectra etc., the coupling in the momentum conservation equation for the biased tracers must be modified. Our new formulation predicts the constancy in time of any statistical halo velocity bias present in the initial conditions, in agreement with peak theory. We test this prediction by studying the evolution of a conserved halo population in N-body simulations. We establish that the initial simulated halo density and velocity statistics show distinct features of the peak model and, thus, deviate from the simple local Lagrangian bias. We demonstrate, for the first time, that the time evolution of their velocity is in tension with the rapid decay expected in the standard approach.
NASA Astrophysics Data System (ADS)
Gros, J.-B.; Kuhl, U.; Legrand, O.; Mortessagne, F.
2016-03-01
The effective Hamiltonian formalism is extended to vectorial electromagnetic waves in order to describe statistical properties of the field in reverberation chambers. The latter are commonly used in electromagnetic compatibility tests. As a first step, the distribution of wave intensities in chaotic systems with varying opening in the weak coupling limit for scalar quantum waves is derived by means of random matrix theory. In this limit the only parameters are the modal overlap and the number of open channels. Using the extended effective Hamiltonian, we describe the intensity statistics of the vectorial electromagnetic eigenmodes of lossy reverberation chambers. Finally, the typical quantity of interest in such chambers, namely, the distribution of the electromagnetic response, is discussed. By determining the distribution of the phase rigidity, describing the coupling to the environment, using random matrix numerical data, we find good agreement between the theoretical prediction and numerical calculations of the response.
NASA Astrophysics Data System (ADS)
Bobrowski, Maria; Schickhoff, Udo
2017-04-01
Betula utilis is a major constituent of alpine treeline ecotones in the western and central Himalayan region. The objective of this study is to provide a first analysis of the potential distribution of Betula utilis in the subalpine and alpine belts of the Himalayan region using species distribution modelling. Using Generalized Linear Models (GLMs), we aim at examining climatic factors controlling the species distribution under current climate conditions. Furthermore, we evaluate the prediction ability of climate data derived from different statistical methods. GLMs were created using least correlated bioclimatic variables derived from two different climate models: 1) interpolated climate data (i.e. Worldclim, Hijmans et al., 2005) and 2) quasi-mechanistic statistical downscaling (i.e. Chelsa; Karger et al., 2016). Model accuracy was evaluated by the ability to predict the potential species distribution range. We found that models based on variables of Chelsa climate data had higher predictive power, whereas models using Worldclim climate data consistently overpredicted the potential suitable habitat for Betula utilis. Although climatic variables of Worldclim are widely used in modelling species distribution, our results suggest treating them with caution when remote regions like the Himalayan mountains are in focus. Unmindful usage of climatic variables for species distribution models can cause misleading projections and may lead to wrong implications and recommendations for nature conservation. References: Hijmans, R.J., Cameron, S.E., Parra, J.L., Jones, P.G. & Jarvis, A. (2005) Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology, 25, 1965-1978. Karger, D.N., Conrad, O., Böhner, J., Kawohl, T., Kreft, H., Soria-Auza, R.W., Zimmermann, N., Linder, H.P. & Kessler, M. (2016) Climatologies at high resolution for the earth land surface areas. arXiv:1607.00217 [physics].
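A minimal sketch of the presence/absence GLM step described above, assuming synthetic data and placeholder predictor names rather than actual Worldclim or Chelsa variables.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
bio_temp = rng.normal(5.0, 3.0, n)        # stand-in for a temperature-related bioclimatic variable
bio_prec = rng.normal(800.0, 200.0, n)    # stand-in for a precipitation-related bioclimatic variable
logit_p = -1.0 + 0.4 * bio_temp + 0.002 * bio_prec
presence = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit_p)))   # synthetic presence/absence records

X = sm.add_constant(np.column_stack([bio_temp, bio_prec]))
glm = sm.GLM(presence, X, family=sm.families.Binomial()).fit()
print(glm.summary())

# Predicted habitat suitability on a new climate surface would then be glm.predict(X_new).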
NASA Technical Reports Server (NTRS)
Holms, A. G.
1980-01-01
Population model coefficients were chosen to simulate a saturated 2 to the fourth power fixed effects experiment having an unfavorable distribution of relative values. Using random number studies, deletion strategies were compared that were based on the F distribution, on an order statistics distribution of Cochran's, and on a combination of the two. Results of the comparisons and a recommended strategy are given.
Durand, Marc; Käfer, Jos; Quilliet, Catherine; Cox, Simon; Talebi, Shirin Ataei; Graner, François
2011-10-14
We propose an analytical model for the statistical mechanics of shuffled two-dimensional foams with moderate bubble size polydispersity. It predicts without any adjustable parameters the correlations between the number of sides n of the bubbles (topology) and their areas A (geometry) observed in experiments and numerical simulations of shuffled foams. Detailed statistics show that in shuffled cellular patterns n correlates better with √A (as claimed by Desch and Feltham) than with A (as claimed by Lewis and widely assumed in the literature). At the level of the whole foam, standard deviations Δn and ΔA are in proportion. Possible applications include correlations of the detailed distributions of n and A, three-dimensional foams, and biological tissues.
System Analysis for the Huntsville Operation Support Center, Distributed Computer System
NASA Technical Reports Server (NTRS)
Ingels, F. M.; Massey, D.
1985-01-01
HOSC, as a distributed computing system, is responsible for data acquisition and analysis during Space Shuttle operations. HOSC also provides computing services for Marshall Space Flight Center's nonmission activities. As mission and nonmission activities change, so do the support functions of HOSC, demonstrating the need for some method of simulating activity at HOSC in various configurations. The simulation developed in this work primarily models the HYPERchannel network. The model simulates the activity of a steady state network, reporting statistics such as transmitted bits, collision statistics, frame sequences transmitted, and average message delay. These statistics are used to evaluate such performance indicators as throughput, utilization, and delay. Thus the overall performance of the network is evaluated, and possible overload conditions can be predicted.
Domain generality vs. modality specificity: The paradox of statistical learning
Frost, Ram; Armstrong, Blair C.; Siegelman, Noam; Christiansen, Morten H.
2015-01-01
Statistical learning is typically considered to be a domain-general mechanism by which cognitive systems discover the underlying distributional properties of the input. Recent studies examining whether there are commonalities in the learning of distributional information across different domains or modalities consistently reveal, however, modality and stimulus specificity. An important question is, therefore, how and why a hypothesized domain-general learning mechanism systematically produces such effects. We offer a theoretical framework according to which statistical learning is not a unitary mechanism, but a set of domain-general computational principles, that operate in different modalities and therefore are subject to the specific constraints characteristic of their respective brain regions. This framework offers testable predictions and we discuss its computational and neurobiological plausibility. PMID:25631249
Forecasts of non-Gaussian parameter spaces using Box-Cox transformations
NASA Astrophysics Data System (ADS)
Joachimi, B.; Taylor, A. N.
2011-09-01
Forecasts of statistical constraints on model parameters using the Fisher matrix abound in many fields of astrophysics. The Fisher matrix formalism involves the assumption of Gaussianity in parameter space and hence fails to predict complex features of posterior probability distributions. Combining the standard Fisher matrix with Box-Cox transformations, we propose a novel method that accurately predicts arbitrary posterior shapes. The Box-Cox transformations are applied to parameter space to render it approximately multivariate Gaussian, performing the Fisher matrix calculation on the transformed parameters. We demonstrate that, after the Box-Cox parameters have been determined from an initial likelihood evaluation, the method correctly predicts changes in the posterior when varying various parameters of the experimental setup and the data analysis, with marginally higher computational cost than a standard Fisher matrix calculation. We apply the Box-Cox-Fisher formalism to forecast cosmological parameter constraints by future weak gravitational lensing surveys. The characteristic non-linear degeneracy between matter density parameter and normalization of matter density fluctuations is reproduced for several cases, and the capability of breaking this degeneracy with weak-lensing three-point statistics is investigated. Possible applications of Box-Cox transformations of posterior distributions are discussed, including the prospects for performing statistical data analysis steps in the transformed Gaussianized parameter space.
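A minimal sketch of the Gaussianization step, using a hypothetical skewed posterior sample rather than a lensing likelihood; the Box-Cox parameter is found by maximum likelihood as in scipy.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
samples = rng.lognormal(mean=0.0, sigma=0.6, size=5000)   # hypothetical skewed posterior sample

transformed, lam = stats.boxcox(samples)                  # lambda chosen by maximum likelihood
print("Box-Cox lambda:", round(lam, 3))
print("skewness before / after:", round(stats.skew(samples), 3), round(stats.skew(transformed), 3))

# A Fisher-matrix (Gaussian) forecast can then be carried out in the transformed,
# approximately Gaussian coordinates and mapped back through the inverse transform.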
Pace, M.N.; Rosentreter, J.J.; Bartholomay, R.C.
2001-01-01
Idaho State University and the US Geological Survey, in cooperation with the US Department of Energy, conducted a study to determine and evaluate strontium distribution coefficients (Kds) of subsurface materials at the Idaho National Engineering and Environmental Laboratory (INEEL). The Kds were determined to aid in assessing the variability of strontium Kds and their effects on chemical transport of strontium-90 in the Snake River Plain aquifer system. Data from batch experiments done to determine strontium Kds of five sediment-infill samples and six standard reference material samples were analyzed by using multiple linear regression analysis and the stepwise variable-selection method in the statistical program, Statistical Product and Service Solutions, to derive an equation of variables that can be used to predict strontium Kds of sediment-infill samples. The sediment-infill samples were from basalt vesicles and fractures from a selected core at the INEEL; strontium Kds ranged from approximately 201 to 356 ml g-1. The standard material samples consisted of clay minerals and calcite. The statistical analyses of the batch-experiment results showed that the amount of strontium in the initial solution, the amount of manganese oxide in the sample material, and the amount of potassium in the initial solution are the most important variables in predicting strontium Kds of sediment-infill samples.
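A minimal sketch of forward stepwise multiple linear regression of the kind described above; the predictor names and values below are placeholders, not the INEEL measurements.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 40
df = pd.DataFrame({
    "sr_initial": rng.uniform(0.1, 2.0, n),    # placeholder: strontium in initial solution
    "k_initial": rng.uniform(1.0, 10.0, n),    # placeholder: potassium in initial solution
    "mn_oxide": rng.uniform(0.0, 0.5, n),      # placeholder: manganese oxide content
    "calcite": rng.uniform(0.0, 1.0, n),       # placeholder: an uninformative candidate
})
df["kd"] = 300 - 60*df["sr_initial"] + 100*df["mn_oxide"] - 5*df["k_initial"] \
           + rng.normal(0, 10, n)              # synthetic response

def forward_select(data, response, alpha=0.05):
    """Add the candidate with the smallest p-value until none passes the threshold."""
    selected = []
    remaining = [c for c in data.columns if c != response]
    while remaining:
        pvals = {}
        for cand in remaining:
            X = sm.add_constant(data[selected + [cand]])
            pvals[cand] = sm.OLS(data[response], X).fit().pvalues[cand]
        best = min(pvals, key=pvals.get)
        if pvals[best] > alpha:
            break
        selected.append(best)
        remaining.remove(best)
    return selected

print(forward_select(df, "kd"))   # expected: the three variables used to generate kd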
Outcome of temporal lobe epilepsy surgery predicted by statistical parametric PET imaging.
Wong, C Y; Geller, E B; Chen, E Q; MacIntyre, W J; Morris, H H; Raja, S; Saha, G B; Lüders, H O; Cook, S A; Go, R T
1996-07-01
PET is useful in the presurgical evaluation of temporal lobe epilepsy. The purpose of this retrospective study is to assess the clinical use of statistical parametric imaging in predicting surgical outcome. Interictal 18FDG-PET scans in 17 patients with surgically-treated temporal lobe epilepsy (group A = 13 seizure-free, group B = 4 not seizure-free at 6 mo) were transformed into statistical parametric imaging, with each pixel representing a z-score value by using the mean and s.d. of count distribution in each individual patient, for both visual and quantitative analysis. Mean z-scores were significantly more negative in anterolateral (AL) and mesial (M) regions on the operated side than the nonoperated side in group A (AL: p < 0.00005, M: p = 0.0097), but not in group B (AL: p = 0.46, M: p = 0.08). Statistical parametric imaging correctly lateralized 16 out of 17 patients. Only the AL region, however, was significant in predicting surgical outcome (F = 29.03, p < 0.00005). Using a cut-off z-score value of -1.5, statistical parametric imaging correctly classified 92% of temporal lobes from group A and 88% of those from group B. The preliminary results indicate that statistical parametric imaging provides both clinically useful information for lateralization in temporal lobe epilepsy and a reliable predictive indicator of clinical outcome following surgical treatment.
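A minimal sketch of the per-pixel z-score transform and the -1.5 cutoff described above, applied to a synthetic image rather than an FDG-PET scan; the region placement is hypothetical.

import numpy as np

rng = np.random.default_rng(4)
pet_counts = rng.normal(100.0, 15.0, size=(128, 128))   # placeholder count image
pet_counts[40:60, 30:50] -= 35.0                         # simulated hypometabolic focus

# z-transform each pixel against the patient's own mean and s.d. of counts.
mu, sd = pet_counts.mean(), pet_counts.std()
z_image = (pet_counts - mu) / sd

anterolateral_roi = z_image[40:60, 30:50]                # hypothetical ROI placement
print("mean ROI z-score:", round(anterolateral_roi.mean(), 2))
print("below the -1.5 cutoff:", anterolateral_roi.mean() < -1.5)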
A two-component rain model for the prediction of attenuation and diversity improvement
NASA Technical Reports Server (NTRS)
Crane, R. K.
1982-01-01
A new model was developed to predict attenuation statistics for a single Earth-satellite or terrestrial propagation path. The model was extended to provide predictions of the joint occurrences of specified or higher attenuation values on two closely spaced Earth-satellite paths. The joint statistics provide the information required to obtain diversity gain or diversity advantage estimates. The new model is meteorologically based. It was tested against available Earth-satellite beacon observations and terrestrial path measurements. The model employs the rain climate region descriptions of the Global rain model. The rms deviation between the predicted and observed attenuation values for the terrestrial path data was 35 percent, a result consistent with the expectations of the Global model when the rain rate distribution for the path is not used in the calculation. Within the United States the rms deviation between measurement and prediction was 36 percent but worldwide it was 79 percent.
NASA Astrophysics Data System (ADS)
Niino, Yuu
2018-05-01
We investigate how the statistical properties of dispersion measure (DM) and apparent flux density/fluence of (nonrepeating) fast radio bursts (FRBs) are determined by unknown cosmic rate density history [ρFRB(z)] and luminosity function (LF) of the transient events. We predict the distributions of DMs, flux densities, and fluences of FRBs taking account of the variation of the receiver efficiency within its beam, using analytical models of ρFRB(z) and LF. Comparing the predictions with the observations, we show that the cumulative distribution of apparent fluences suggests that FRBs originate at cosmological distances and ρFRB increases with redshift resembling the cosmic star formation history (CSFH). We also show that an LF model with a bright-end cutoff at log10 Lν (erg s-1 Hz-1) ∼ 34 is favored to reproduce the observed DM distribution if ρFRB(z) ∝ CSFH, although the statistical significance of the constraints obtained with the current size of the observed sample is not high. Finally, we find that the correlation between DM and flux density of FRBs is potentially a powerful tool to distinguish whether FRBs are at cosmological distances or in the local universe more robustly with future observations.
Noise and the statistical mechanics of distributed transport in a colony of interacting agents
NASA Astrophysics Data System (ADS)
Katifori, Eleni; Graewer, Johannes; Ronellenfitsch, Henrik; Mazza, Marco G.
Inspired by the process of liquid food distribution between individuals in an ant colony, in this work we consider the statistical mechanics of resource dissemination between interacting agents with finite carrying capacity. The agents move inside a confined space (nest), pick up the food at the entrance of the nest and share it with other agents that they encounter. We calculate analytically and via a series of simulations the global food intake rate for the whole colony as well as observables describing how uniformly the food is distributed within the nest. Our model and predictions provide a useful benchmark to assess which strategies can lead to efficient food distribution within the nest and also to what level the observed food uptake rates and efficiency in food distribution are due to stochastic fluctuations or specific food exchange strategies by an actual ant colony.
Non-equilibrium thermionic electron emission for metals at high temperatures
NASA Astrophysics Data System (ADS)
Domenech-Garret, J. L.; Tierno, S. P.; Conde, L.
2015-08-01
Stationary thermionic electron emission currents from heated metals are compared against an analytical expression derived using a non-equilibrium quantum kappa energy distribution for the electrons. The latter depends on the parameter κ(T), which decreases with increasing temperature, can be estimated from raw experimental data, and characterizes the departure of the electron energy spectrum from equilibrium Fermi-Dirac statistics. The calculations accurately predict the measured thermionic emission currents for both high and moderate temperature ranges. The Richardson-Dushman law governs electron emission for large values of kappa or, equivalently, moderate metal temperatures. The high energy tail in the electron energy distribution function that develops at higher temperatures or lower kappa values increases the emission currents well over the predictions of the classical expression. This also permits the quantitative estimation of the departure of the metal electrons from the equilibrium Fermi-Dirac statistics.
Methods of Information Geometry to model complex shapes
NASA Astrophysics Data System (ADS)
De Sanctis, A.; Gattone, S. A.
2016-09-01
In this paper, a new statistical method to model patterns emerging in complex systems is proposed. A framework for shape analysis of 2-dimensional landmark data is introduced, in which each landmark is represented by a bivariate Gaussian distribution. From Information Geometry we know that the Fisher-Rao metric endows the statistical manifold of parameters of a family of probability distributions with a Riemannian metric. Thus this approach allows one to reconstruct the intermediate steps in the evolution between observed shapes by computing the geodesic, with respect to the Fisher-Rao metric, between the corresponding distributions. Furthermore, the geodesic path can be used for shape predictions. As an application, we study the evolution of the rat skull shape. A future application in ophthalmology is introduced.
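For the univariate case, the Fisher-Rao geodesic distance between normal distributions has a closed form via the hyperbolic (Poincaré half-plane) distance; the following sketch, with illustrative values only, computes it under the metric ds^2 = (dμ^2 + 2 dσ^2)/σ^2.

import numpy as np

def fisher_rao_distance(mu1, s1, mu2, s2):
    """Geodesic distance between N(mu1, s1) and N(mu2, s2) under the Fisher-Rao metric."""
    arg = 1.0 + ((mu1 - mu2)**2 / 2.0 + (s1 - s2)**2) / (2.0 * s1 * s2)
    return np.sqrt(2.0) * np.arccosh(arg)

# Two landmark distributions (values are illustrative only).
print(round(fisher_rao_distance(0.0, 1.0, 1.0, 1.5), 4))

# Points along the geodesic between the two distributions would give the intermediate
# shapes discussed above; here only the distance is computed.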
Negative impacts of climate change on cereal yields: statistical evidence from France
NASA Astrophysics Data System (ADS)
Gammans, Matthew; Mérel, Pierre; Ortiz-Bobea, Ariel
2017-05-01
In several world regions, climate change is predicted to negatively affect crop productivity. The recent statistical yield literature emphasizes the importance of flexibly accounting for the distribution of growing-season temperature to better represent the effects of warming on crop yields. We estimate a flexible statistical yield model using a long panel from France to investigate the impacts of temperature and precipitation changes on wheat and barley yields. Winter varieties appear sensitive to extreme cold after planting. All yields respond negatively to an increase in spring-summer temperatures and are a decreasing function of precipitation about historical precipitation levels. Crop yields are predicted to be negatively affected by climate change under a wide range of climate models and emissions scenarios. Under warming scenario RCP8.5 and holding growing areas and technology constant, our model ensemble predicts a 21.0% decline in winter wheat yield, a 17.3% decline in winter barley yield, and a 33.6% decline in spring barley yield by the end of the century. Uncertainty from climate projections dominates uncertainty from the statistical model. Finally, our model predicts that continuing technology trends would counterbalance most of the effects of climate change.
Alternative evaluation metrics for risk adjustment methods.
Park, Sungchul; Basu, Anirban
2018-06-01
Risk adjustment is instituted to counter risk selection by accurately equating payments with expected expenditures. Traditional risk-adjustment methods are designed to estimate accurate payments at the group level. However, this generates residual risks at the individual level, especially for high-expenditure individuals, thereby inducing health plans to avoid those with high residual risks. To identify an optimal risk-adjustment method, we perform a comprehensive comparison of prediction accuracies at the group level, at the tail distributions, and at the individual level across 19 estimators: 9 parametric regression, 7 machine learning, and 3 distributional estimators. Using the 2013-2014 MarketScan database, we find that no one estimator performs best in all prediction accuracies. Generally, machine learning and distribution-based estimators achieve higher group-level prediction accuracy than parametric regression estimators. However, parametric regression estimators show higher tail distribution prediction accuracy and individual-level prediction accuracy, especially at the tails of the distribution. This suggests that there is a trade-off in selecting an appropriate risk-adjustment method between estimating accurate payments at the group level and lower residual risks at the individual level. Our results indicate that an optimal method cannot be determined solely on the basis of statistical metrics but rather needs to account for simulating plans' risk selective behaviors. Copyright © 2018 John Wiley & Sons, Ltd.
Cooper, Emily A.; Norcia, Anthony M.
2015-01-01
The nervous system has evolved in an environment with structure and predictability. One of the ubiquitous principles of sensory systems is the creation of circuits that capitalize on this predictability. Previous work has identified predictable non-uniformities in the distributions of basic visual features in natural images that are relevant to the encoding tasks of the visual system. Here, we report that the well-established statistical distributions of visual features -- such as visual contrast, spatial scale, and depth -- differ between bright and dark image components. Following this analysis, we go on to trace how these differences in natural images translate into different patterns of cortical input that arise from the separate bright (ON) and dark (OFF) pathways originating in the retina. We use models of these early visual pathways to transform natural images into statistical patterns of cortical input. The models include the receptive fields and non-linear response properties of the magnocellular (M) and parvocellular (P) pathways, with their ON and OFF pathway divisions. The results indicate that there are regularities in visual cortical input beyond those that have previously been appreciated from the direct analysis of natural images. In particular, several dark/bright asymmetries provide a potential account for recently discovered asymmetries in how the brain processes visual features, such as violations of classic energy-type models. On the basis of our analysis, we expect that the dark/bright dichotomy in natural images plays a key role in the generation of both cortical and perceptual asymmetries. PMID:26020624
NASA Technical Reports Server (NTRS)
Goldhirsh, J.
1977-01-01
Disdrometer measurements and radar reflectivity measurements were injected into a computer program to estimate the path attenuation of the signal. Predicted attenuations when compared with the directly measured ones showed generally good correlation on a case by case basis and very good agreement statistically. The utility of using radar in conjunction with disdrometer measurements for predicting fade events and long term fade distributions associated with earth-satellite telecommunications is demonstrated.
Jarnevich, Catherine S.; Young, Nicholas E; Sheffels, Trevor R.; Carter, Jacoby; Systma, Mark D.; Talbert, Colin
2017-01-01
Invasive species provide a unique opportunity to evaluate factors controlling biogeographic distributions; we can consider introduction success as an experiment testing suitability of environmental conditions. Predicting potential distributions of spreading species is not easy, and forecasting potential distributions with changing climate is even more difficult. Using the globally invasive coypu (Myocastor coypus [Molina, 1782]), we evaluate and compare the utility of a simplistic ecophysiological based model and a correlative model to predict current and future distribution. The ecophysiological model was based on winter temperature relationships with nutria survival. We developed correlative statistical models using the Software for Assisted Habitat Modeling and biologically relevant climate data with a global extent. We applied the ecophysiological based model to several global circulation model (GCM) predictions for mid-century. We used global coypu introduction data to evaluate these models and to explore a hypothesized physiological limitation, finding general agreement with known coypu distribution locally and globally and support for an upper thermal tolerance threshold. Global circulation model based model results showed variability in coypu predicted distribution among GCMs, but had general agreement of increasing suitable area in the USA. Our methods highlighted the dynamic nature of the edges of the coypu distribution due to climate non-equilibrium, and uncertainty associated with forecasting future distributions. Areas deemed suitable habitat, especially those on the edge of the current known range, could be used for early detection of the spread of coypu populations for management purposes. Combining approaches can be beneficial to predicting potential distributions of invasive species now and in the future and in exploring hypotheses of factors controlling distributions.
Distribution drivers and physiological responses in geothermal bryophyte communities.
García, Estefanía Llaneza; Rosenstiel, Todd N; Graves, Camille; Shortlidge, Erin E; Eppley, Sarah M
2016-04-01
Our ability to explain community structure rests on our ability to define the importance of ecological niches, including realized ecological niches, in shaping communities, but few studies of plant distributions have combined predictive models with physiological measures. Using field surveys and statistical modeling, we predicted distribution drivers in geothermal bryophyte (moss) communities of Lassen Volcanic National Park (California, USA). In the laboratory, we used drying and rewetting experiments to test whether the strong species-specific effects of relative humidity on distributions predicted by the models were correlated with physiological characters. We found that the three most common bryophytes in geothermal communities were significantly affected by three distinct distribution drivers: temperature, light, and relative humidity. Aulacomnium palustre, whose distribution is significantly affected by relative humidity according to our model, and which occurs in high-humidity sites, showed extreme signs of stress after drying and never recovered optimal values of PSII efficiency after rewetting. Campylopus introflexus, whose distribution is not affected by humidity according to our model, was able to maintain optimal values of PSII efficiency for 48 hr at 50% water loss and recovered optimal values of PSII efficiency after rewetting. Our results suggest that species-specific environmental stressors tightly constrain the ecological niches of geothermal bryophytes. Tests of tolerance to drying in two bryophyte species corresponded with model predictions of the comparative importance of relative humidity as distribution drivers for these species. © 2016 Botanical Society of America.
Analysis of Machine Learning Techniques for Heart Failure Readmissions.
Mortazavi, Bobak J; Downing, Nicholas S; Bucholz, Emily M; Dharmarajan, Kumar; Manhapra, Ajay; Li, Shu-Xia; Negahban, Sahand N; Krumholz, Harlan M
2016-11-01
The current ability to predict readmissions in patients with heart failure is modest at best. It is unclear whether machine learning techniques that address higher dimensional, nonlinear relationships among variables would enhance prediction. We sought to compare the effectiveness of several machine learning algorithms for predicting readmissions. Using data from the Telemonitoring to Improve Heart Failure Outcomes trial, we compared the effectiveness of random forests, boosting, random forests combined hierarchically with support vector machines or logistic regression (LR), and Poisson regression against traditional LR to predict 30- and 180-day all-cause readmissions and readmissions because of heart failure. We randomly selected 50% of patients for a derivation set, and a validation set comprised the remaining patients, validated using 100 bootstrapped iterations. We compared C statistics for discrimination and distributions of observed outcomes in risk deciles for predictive range. In 30-day all-cause readmission prediction, the best performing machine learning model, random forests, provided a 17.8% improvement over LR (mean C statistics, 0.628 and 0.533, respectively). For readmissions because of heart failure, boosting improved the C statistic by 24.9% over LR (mean C statistic 0.678 and 0.543, respectively). For 30-day all-cause readmission, the observed readmission rates in the lowest and highest deciles of predicted risk with random forests (7.8% and 26.2%, respectively) showed a much wider separation than LR (14.2% and 16.4%, respectively). Machine learning methods improved the prediction of readmission after hospitalization for heart failure compared with LR and provided the greatest predictive range in observed readmission rates. © 2016 American Heart Association, Inc.
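A minimal sketch of the kind of model comparison described above, using synthetic data rather than the trial data; C statistics are computed as ROC AUCs for logistic regression and a random forest.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic readmission-style binary outcome with an imbalanced class distribution.
X, y = make_classification(n_samples=4000, n_features=30, n_informative=10,
                           weights=[0.75, 0.25], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=300, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    p = model.predict_proba(X_te)[:, 1]
    print(name, "C statistic:", round(roc_auc_score(y_te, p), 3))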
NASA Astrophysics Data System (ADS)
Coclite, A.; Pascazio, G.; De Palma, P.; Cutrone, L.
2016-07-01
Flamelet-Progress-Variable (FPV) combustion models allow the evaluation of all thermochemical quantities in a reacting flow by computing only the mixture fraction Z and a progress variable C. When using such a method to predict turbulent combustion in conjunction with a turbulence model, a probability density function (PDF) is required to evaluate statistical averages (e.g., Favre averages) of chemical quantities. The choice of the PDF is a compromise between computational costs and accuracy level. The aim of this paper is to investigate the influence of the PDF choice and its modeling aspects on the prediction of turbulent combustion. Three different models are considered: the standard one, based on the choice of a β-distribution for Z and a Dirac-distribution for C; a model employing a β-distribution for both Z and C; and the third model obtained using a β-distribution for Z and the statistically most likely distribution (SMLD) for C. The standard model, although widely used, does not take into account the interaction between turbulence and chemical kinetics as well as the dependence of the progress variable not only on its mean but also on its variance. The SMLD approach establishes a systematic framework to incorporate information from an arbitrary number of moments, thus providing an improvement over conventionally employed presumed PDF closure models. The rationale behind the choice of the three PDFs is described in some detail, and the prediction capability of the corresponding models is tested against well-known test cases, namely, the Sandia flames, and H2-air supersonic combustion.
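A minimal sketch of the presumed β-PDF closure for the mixture fraction, with a placeholder integrand standing in for a tabulated flamelet quantity; the β parameters follow from the resolved mean and variance by the method of moments.

import numpy as np
from scipy import stats

def beta_pdf_average(phi, z_mean, z_var):
    """Average of phi(Z) over a beta PDF with given mean and variance of Z."""
    # Method-of-moments beta parameters (requires 0 < z_var < z_mean*(1 - z_mean)).
    gamma = z_mean * (1.0 - z_mean) / z_var - 1.0
    a, b = z_mean * gamma, (1.0 - z_mean) * gamma
    z = np.linspace(1e-6, 1.0 - 1e-6, 2001)
    pdf = stats.beta.pdf(z, a, b)
    return np.trapz(phi(z) * pdf, z)

# Placeholder source-term shape peaking near an assumed stoichiometric mixture fraction.
phi = lambda z: np.exp(-((z - 0.3) / 0.05) ** 2)
print(round(beta_pdf_average(phi, z_mean=0.3, z_var=0.01), 4))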
Safaie, Ammar; Wendzel, Aaron; Ge, Zhongfu; Nevers, Meredith; Whitman, Richard L.; Corsi, Steven R.; Phanikumar, Mantha S.
2016-01-01
Statistical and mechanistic models are popular tools for predicting the levels of indicator bacteria at recreational beaches. Researchers tend to use one class of model or the other, and it is difficult to generalize statements about their relative performance due to differences in how the models are developed, tested, and used. We describe a cooperative modeling approach for freshwater beaches impacted by point sources in which insights derived from mechanistic modeling were used to further improve the statistical models and vice versa. The statistical models provided a basis for assessing the mechanistic models which were further improved using probability distributions to generate high-resolution time series data at the source, long-term “tracer” transport modeling based on observed electrical conductivity, better assimilation of meteorological data, and the use of unstructured-grids to better resolve nearshore features. This approach resulted in improved models of comparable performance for both classes including a parsimonious statistical model suitable for real-time predictions based on an easily measurable environmental variable (turbidity). The modeling approach outlined here can be used at other sites impacted by point sources and has the potential to improve water quality predictions resulting in more accurate estimates of beach closures.
Solar Radio Burst Statistics and Implications for Space Weather Effects
NASA Astrophysics Data System (ADS)
Giersch, O. D.; Kennewell, J.; Lynch, M.
2017-11-01
Solar radio bursts have the potential to affect space and terrestrial navigation, communication, and other technical systems that are sometimes overlooked. However, over the last decade a series of extreme L band solar radio bursts in December 2006 have renewed interest in these effects. In this paper we point out significant deficiencies in the solar radio data archives of the National Centers for Environmental Information (NCEI) that are used by most researchers in analyzing and producing statistics on solar radio burst phenomena. In particular, we examine the records submitted by the United States Air Force (USAF) Radio Solar Telescope Network (RSTN) and its predecessors from the period 1966 to 2010. Besides identifying substantial missing burst records we show that different observatories can have statistically different burst distributions, particularly at 245 MHz. We also point out that different solar cycles may show statistically different distributions and that it is a mistake to assume that the Sun shows similar behavior in different sunspot cycles. Large solar radio bursts are not confined to the period around sunspot maximum, and prediction of such events that utilize historical data will invariably be an underestimate due to archive data deficiencies. It is important that researchers and forecasters use historical occurrence frequency with caution in attempting to predict future cycles.
Magnetic Reconnection during Turbulence: Statistics of X-Points and Heating
NASA Astrophysics Data System (ADS)
Shay, M. A.; Haggerty, C. C.; Parashar, T.; Matthaeus, W. H.; Phan, T.; Drake, J. F.; Servidio, S.; Wan, M.
2017-12-01
Magnetic reconnection is a ubiquitous plasma phenomenon that has been observed in turbulent plasma systems. It is an important part of the turbulent dynamics and heating of space, laboratory and astrophysical plasmas. Recent simulation and observational studies have detailed how magnetic reconnection heats plasma and this work has developed to the point where it can be applied to larger and more complex plasma systems. In this context, we examine the statistics of magnetic reconnection in fully kinetic PIC simulations to quantify the role of magnetic reconnection on energy dissipation and plasma heating. Most notably, we study the time evolution of these x-line statistics in decaying turbulence. First, we examine the distribution of reconnection rates at the x-points found in the simulation and find that their distribution is broader than the MHD counterpart, and the average value is approximately 0.1. Second, we study the time evolution of the x-points to determine when reconnection is most active in the turbulence. Finally, using our findings on these statistics, reconnection heating predictions are applied to the regions surrounding the identified x-points and this is used to study the role of magnetic reconnection in turbulent heating of plasma. The ratio of ion to electron heating rates is found to be consistent with magnetic reconnection predictions.
UXO Burial Prediction Fidelity: A Summary
2017-07-01
…equilibrium. Any complete picture of munition evolution in sediment would need to account for these effects. More relevant to the present topic: these … adds uncertainty to predictions of munition fate, and assessments of risk probabilities would need to account for the statistical distribution of …
NASA Technical Reports Server (NTRS)
Goldhirsh, J.
1984-01-01
Single and joint terminal slant path attenuation statistics at frequencies of 28.56 and 19.04 GHz have been derived, employing a radar data base obtained over a three-year period at Wallops Island, VA. Statistics were independently obtained for path elevation angles of 20, 45, and 90 deg for purposes of examining how elevation angle influences both single-terminal and joint probability distributions. Both diversity gains and autocorrelation function dependence on site spacing and elevation angles were determined employing the radar modeling results. Comparisons with other investigators are presented. An independent path elevation angle prediction technique was developed and demonstrated to fit well with the radar-derived single- and joint-terminal cumulative fade distributions at various elevation angles.
Applications of species distribution modeling to paleobiology
NASA Astrophysics Data System (ADS)
Svenning, Jens-Christian; Fløjgaard, Camilla; Marske, Katharine A.; Nógues-Bravo, David; Normand, Signe
2011-10-01
Species distribution modeling (SDM: statistical and/or mechanistic approaches to the assessment of range determinants and prediction of species occurrence) offers new possibilities for estimating and studying past organism distributions. SDM complements fossil and genetic evidence by providing (i) quantitative and potentially high-resolution predictions of the past organism distributions, (ii) statistically formulated, testable ecological hypotheses regarding past distributions and communities, and (iii) statistical assessment of range determinants. In this article, we provide an overview of applications of SDM to paleobiology, outlining the methodology, reviewing SDM-based studies to paleobiology or at the interface of paleo- and neobiology, discussing assumptions and uncertainties as well as how to handle them, and providing a synthesis and outlook. Key methodological issues for SDM applications to paleobiology include predictor variables (types and properties; special emphasis is given to paleoclimate), model validation (particularly important given the emphasis on cross-temporal predictions in paleobiological applications), and the integration of SDM and genetics approaches. Over the last few years the number of studies using SDM to address paleobiology-related questions has increased considerably. While some of these studies only use SDM (23%), most combine them with genetically inferred patterns (49%), paleoecological records (22%), or both (6%). A large number of SDM-based studies have addressed the role of Pleistocene glacial refugia in biogeography and evolution, especially in Europe, but also in many other regions. SDM-based approaches are also beginning to contribute to a suite of other research questions, such as historical constraints on current distributions and diversity patterns, the end-Pleistocene megafaunal extinctions, past community assembly, human paleobiogeography, Holocene paleoecology, and even deep-time biogeography (notably, providing insights into biogeographic dynamics >400 million years ago). We discuss important assumptions and uncertainties that affect the SDM approach to paleobiology - the equilibrium postulate, niche stability, changing atmospheric CO 2 concentrations - as well as ways to address these (ensemble, functional SDM, and non-SDM ecoinformatics approaches). We conclude that the SDM approach offers important opportunities for advances in paleobiology by providing a quantitative ecological perspective, and hereby also offers the potential for an enhanced contribution of paleobiology to ecology and conservation biology, e.g., for estimating climate change impacts and for informing ecological restoration.
Rolling Bearing Life Prediction-Past, Present, and Future
NASA Technical Reports Server (NTRS)
Zaretsky, E V; Poplawski, J. V.; Miller, C. R.
2000-01-01
Comparisons were made between the life prediction formulas of Lundberg and Palmgren, Ioannides and Harris, and Zaretsky and full-scale ball and roller bearing life data. The effect of Weibull slope on bearing life prediction was determined. Life factors are proposed to adjust the respective life formulas to the normalized statistical life distribution of each bearing type. The Lundberg-Palmgren method resulted in the most conservative life predictions compared to the Ioannides-Harris and Zaretsky methods, which produced statistically similar results. Roller profile can have significant effects on bearing life prediction results. Roller edge loading can reduce life by as much as 98 percent. The resultant predicted life depends not only on the life equation used but also on the Weibull slope assumed, the least variation occurring with the Zaretsky equation. The load-life exponent p of 10/3 used in the American National Standards Institute (ANSI)/American Bearing Manufacturers Association (ABMA)/International Organization for Standardization (ISO) standards is inconsistent with the majority of roller bearings designed and used today.
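A minimal sketch of the load-life and reliability-life relations referred to above; the capacity, load, and Weibull slope values are illustrative, not data from the study.

import math

def l10_life(C, P, p):
    """Basic rating life in millions of revolutions, L10 = (C/P)**p."""
    return (C / P) ** p

def life_at_reliability(l10, reliability, weibull_slope):
    """Adjust L10 to another reliability level using a two-parameter Weibull model."""
    return l10 * (math.log(1.0 / reliability) / math.log(1.0 / 0.9)) ** (1.0 / weibull_slope)

l10_ball = l10_life(C=50.0, P=10.0, p=3.0)          # ball bearing load-life exponent
l10_roller = l10_life(C=50.0, P=10.0, p=10.0 / 3)   # ANSI/ABMA/ISO roller exponent
print(round(l10_ball, 1), round(l10_roller, 1))
print(round(life_at_reliability(l10_ball, reliability=0.99, weibull_slope=1.5), 1))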
Relative mass distributions of neutron-rich thermally fissile nuclei within a statistical model
NASA Astrophysics Data System (ADS)
Kumar, Bharat; Kannan, M. T. Senthil; Balasubramaniam, M.; Agrawal, B. K.; Patra, S. K.
2017-09-01
We study the binary mass distribution for the recently predicted thermally fissile neutron-rich uranium and thorium nuclei using a statistical model. The level density parameters needed for the study are evaluated from the excitation energies of the temperature-dependent relativistic mean field formalism. The excitation energy and the level density parameter for a given temperature are employed in the convolution integral method to obtain the probability of the particular fragmentation. As representative cases, we present the results for the binary yields of 250U and 254Th. The relative yields are presented for three different temperatures: T =1 , 2, and 3 MeV.
NASA Astrophysics Data System (ADS)
Sheshukov, Aleksey Y.; Sekaluvu, Lawrence; Hutchinson, Stacy L.
2018-04-01
Topographic index (TI) models have been widely used to predict trajectories and initiation points of ephemeral gullies (EGs) in agricultural landscapes. Prediction of EGs strongly relies on the selected value of critical TI threshold, and the accuracy depends on topographic features, agricultural management, and datasets of observed EGs. This study statistically evaluated the predictions by TI models in two paired watersheds in Central Kansas that had different levels of structural disturbances due to implemented conservation practices. Four TI models with sole dependency on topographic factors of slope, contributing area, and planform curvature were used in this study. The observed EGs were obtained by field reconnaissance and through the process of hydrological reconditioning of digital elevation models (DEMs). The Kernel Density Estimation analysis was used to evaluate TI distribution within a 10-m buffer of the observed EG trajectories. The EG occurrence within catchments was analyzed using kappa statistics of the error matrix approach, while the lengths of predicted EGs were compared with the observed dataset using the Nash-Sutcliffe Efficiency (NSE) statistics. The TI frequency analysis produced bi-modal distribution of topographic indexes with the pixels within the EG trajectory having a higher peak. The graphs of kappa and NSE versus critical TI threshold showed similar profile for all four TI models and both watersheds with the maximum value representing the best comparison with the observed data. The Compound Topographic Index (CTI) model presented the overall best accuracy with NSE of 0.55 and kappa of 0.32. The statistics for the disturbed watershed showed higher best critical TI threshold values than for the undisturbed watershed. Structural conservation practices implemented in the disturbed watershed reduced ephemeral channels in headwater catchments, thus producing less variability in catchments with EGs. The variation in critical thresholds for all TI models suggested that TI models tend to predict EG occurrence and length over a range of thresholds rather than find a single best value.
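A minimal sketch of the two evaluation statistics used above, swept over candidate TI thresholds; all arrays here are synthetic placeholders, not the Kansas watershed data.

import numpy as np

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - sum of squared errors / variance of observations."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def cohens_kappa(obs, pred):
    """Kappa for binary presence/absence of an EG within a catchment."""
    obs, pred = np.asarray(obs, bool), np.asarray(pred, bool)
    po = np.mean(obs == pred)
    pe = obs.mean() * pred.mean() + (1 - obs.mean()) * (1 - pred.mean())
    return (po - pe) / (1.0 - pe)

rng = np.random.default_rng(5)
ti = rng.gamma(2.0, 2.0, 200)                        # per-catchment topographic index (synthetic)
observed_eg = ti + rng.normal(0.0, 1.5, 200) > 5.0   # synthetic observed EG occurrence

# Sweep critical TI thresholds and keep the one maximizing kappa, as in the analysis above.
thresholds = np.linspace(2.0, 8.0, 25)
kappas = [cohens_kappa(observed_eg, ti > t) for t in thresholds]
best = int(np.argmax(kappas))
print("best critical TI threshold:", round(thresholds[best], 2), "kappa:", round(kappas[best], 2))

# Predicted vs. observed EG lengths would be compared with NSE in the same spirit.
lengths_obs = rng.uniform(20.0, 200.0, 50)
lengths_sim = lengths_obs * rng.normal(1.0, 0.15, 50)
print("NSE for lengths:", round(nash_sutcliffe(lengths_obs, lengths_sim), 2))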
Wu, Johnny C; Gardner, David P; Ozer, Stuart; Gutell, Robin R; Ren, Pengyu
2009-08-28
The accurate prediction of the secondary and tertiary structure of an RNA with different folding algorithms is dependent on several factors, including the energy functions. However, an RNA higher-order structure cannot be predicted accurately from its sequence based on a limited set of energy parameters. The inter- and intramolecular forces between this RNA and other small molecules and macromolecules, in addition to other factors in the cell such as pH, ionic strength, and temperature, influence the complex dynamics associated with transition of a single-stranded RNA to its secondary and tertiary structure. Since all of the factors that affect the formation of an RNA's 3D structure cannot be determined experimentally, statistically derived potential energy has been used in the prediction of protein structure. In the current work, we evaluate the statistical free energy of various secondary structure motifs, including base-pair stacks, hairpin loops, and internal loops, using their statistical frequency obtained from the comparative analysis of more than 50,000 RNA sequences stored in the RNA Comparative Analysis Database (rCAD) at the Comparative RNA Web (CRW) Site. Statistical energy was computed from the structural statistics for several datasets. While the statistical energy for a base-pair stack correlates with experimentally derived free energy values, suggesting a Boltzmann-like distribution, variation is observed between different molecules and their location on the phylogenetic tree of life. Our statistical energy values calculated for several structural elements were utilized in the Mfold RNA-folding algorithm. The combined statistical energy values for base-pair stacks, hairpins and internal loop flanks result in a significant improvement in the accuracy of secondary structure prediction; the hairpin flanks contribute the most.
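A minimal sketch of a frequency-derived, Boltzmann-like statistical energy of the kind described above; the motif counts are invented, and the reference frequency is an assumed baseline rather than the rCAD statistics.

import numpy as np

RT = 0.616  # kcal/mol at roughly 310 K (R = 1.987e-3 kcal/(mol K))

def statistical_energy(observed_count, reference_count):
    """Boltzmann-like log-ratio energy: more frequent motifs get lower (more favorable) energy."""
    return -RT * np.log(observed_count / reference_count)

# Hypothetical counts of two base-pair stacks from a comparative alignment database.
print(round(statistical_energy(observed_count=52000, reference_count=10000), 3))  # favorable
print(round(statistical_energy(observed_count=1200, reference_count=10000), 3))   # unfavorable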
Brandt, Laura A.; Benscoter, Allison; Harvey, Rebecca G.; Speroterra, Carolina; Bucklin, David N.; Romañach, Stephanie; Watling, James I.; Mazzotti, Frank J.
2017-01-01
Climate envelope models are widely used to describe potential future distribution of species under different climate change scenarios. It is broadly recognized that there are both strengths and limitations to using climate envelope models and that outcomes are sensitive to initial assumptions, inputs, and modeling methods. Selection of predictor variables, a central step in modeling, is one of the areas where different techniques can yield varying results. Selection of climate variables to use as predictors is often done using statistical approaches that develop correlations between occurrences and climate data. These approaches have received criticism in that they rely on the statistical properties of the data rather than directly incorporating biological information about species responses to temperature and precipitation. We evaluated and compared models and prediction maps for 15 threatened or endangered species in Florida based on two variable selection techniques: expert opinion and a statistical method. We compared model performance between these two approaches for contemporary predictions, and the spatial correlation, spatial overlap and area predicted for contemporary and future climate predictions. In general, experts identified more variables as being important than the statistical method and there was low overlap in the variable sets (<40%) between the two methods. Despite these differences in variable sets (expert versus statistical), models had high performance metrics (>0.9 for area under the curve (AUC) and >0.7 for the true skill statistic (TSS)). Spatial overlap, which compares the spatial configuration between maps constructed using the different variable selection techniques, was only moderate overall (about 60%), with a great deal of variability across species. Difference in spatial overlap was even greater under future climate projections, indicating additional divergence of model outputs from different variable selection techniques. Our work is in agreement with other studies which have found that for broad-scale species distribution modeling, using statistical methods of variable selection is a useful first step, especially when there is a need to model a large number of species or expert knowledge of the species is limited. Expert input can then be used to refine models that seem unrealistic or for species that experts believe are particularly sensitive to change. It also emphasizes the importance of using multiple models to reduce uncertainty and improve map outputs for conservation planning. Where outputs overlap or show the same direction of change there is greater certainty in the predictions. Areas of disagreement can be used for learning by asking why the models do not agree, and may highlight areas where additional on-the-ground data collection could improve the models.
Statistical Issues for Uncontrolled Reentry Hazards
NASA Technical Reports Server (NTRS)
Matney, Mark
2008-01-01
A number of statistical tools have been developed over the years for assessing the risk of reentering objects to human populations. These tools make use of the characteristics (e.g., mass, shape, size) of debris that are predicted by aerothermal models to survive reentry. The statistical tools use this information to compute the probability that one or more of the surviving debris might hit a person on the ground and cause one or more casualties. The statistical portion of the analysis relies on a number of assumptions about how the debris footprint and the human population are distributed in latitude and longitude, and how to use that information to arrive at realistic risk numbers. This inevitably involves assumptions that simplify the problem and make it tractable, but it is often difficult to test the accuracy and applicability of these assumptions. This paper looks at a number of these theoretical assumptions, examining the mathematical basis for the hazard calculations, and outlining the conditions under which the simplifying assumptions hold. In addition, this paper will also outline some new tools for assessing ground hazard risk in useful ways. Also, this study is able to make use of a database of known uncontrolled reentry locations measured by the United States Department of Defense. By using data from objects that were in orbit more than 30 days before reentry, sufficient time is allowed for the orbital parameters to be randomized in the way the models are designed to compute. The predicted ground footprint distributions of these objects are based on the theory that their orbits behave basically like simple Kepler orbits. However, there are a number of factors - including the effects of gravitational harmonics, the effects of the Earth's equatorial bulge on the atmosphere, and the rotation of the Earth and atmosphere - that could cause them to diverge from simple Kepler orbit behavior and change the ground footprints. The measured latitude and longitude distributions of these objects provide data that can be directly compared with the predicted distributions, providing a fundamental empirical test of the model assumptions.
Size distribution spectrum of noninertial particles in turbulence
NASA Astrophysics Data System (ADS)
Saito, Izumi; Gotoh, Toshiyuki; Watanabe, Takeshi
2018-05-01
Collision-coalescence growth of noninertial particles in three-dimensional homogeneous isotropic turbulence is studied. Smoluchowski's coagulation equation describes the evolution of the size distribution of particles in this system. By applying a methodology based on turbulence theory, the equation is shown to have a steady-state solution, which corresponds to the Kolmogorov-type power-law spectrum. Direct numerical simulations of turbulence and Lagrangian particles are conducted. The result shows that the size distribution in a statistically steady state agrees accurately with the theoretical prediction.
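A minimal sketch of the discrete Smoluchowski coagulation equation, dn_k/dt = (1/2) Σ_{i+j=k} K n_i n_j − n_k Σ_j K n_j, integrated with forward Euler; the constant kernel here is a placeholder for the turbulence-dependent collision kernel discussed above.

import numpy as np

K = 1.0                       # constant collision kernel (placeholder value)
n_bins, dt, n_steps = 64, 1e-3, 5000
n = np.zeros(n_bins + 1)      # n[k] = number density of size-k clusters, k = 1..n_bins
n[1] = 1.0                    # monodisperse initial condition

for _ in range(n_steps):
    gain = np.zeros_like(n)
    for k in range(2, n_bins + 1):
        i = np.arange(1, k)
        gain[k] = 0.5 * K * np.sum(n[i] * n[k - i])   # production of size k from i + (k-i)
    loss = K * n * n[1:].sum()                         # removal of size k by any collision
    n = n + dt * (gain - loss)

sizes = np.arange(1, n_bins + 1)
print("total mass (conserved up to truncation at the largest bin):", round(float(np.sum(sizes * n[1:])), 4))
print("size distribution, first 8 bins:", np.round(n[1:9], 4))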
Hollings, Tracey; Robinson, Andrew; van Andel, Mary; Jewell, Chris; Burgman, Mark
2017-01-01
In livestock industries, reliable up-to-date spatial distribution and abundance records for animals and farms are critical for governments to manage and respond to risks. Yet few, if any, countries can afford to maintain comprehensive, up-to-date agricultural census data. Statistical modelling can be used as a proxy for such data but comparative modelling studies have rarely been undertaken for livestock populations. Widespread species, including livestock, can be difficult to model effectively due to complex spatial distributions that do not respond predictably to environmental gradients. We assessed three machine learning species distribution models (SDM) for their capacity to estimate national-level farm animal population numbers within property boundaries: boosted regression trees (BRT), random forests (RF) and K-nearest neighbour (K-NN). The models were built from a commercial livestock database and environmental and socio-economic predictor data for New Zealand. We used two spatial data stratifications to test (i) support for decision making in an emergency response situation, and (ii) the ability for the models to predict to new geographic regions. The performance of the three model types varied substantially, but the best performing models showed very high accuracy. BRTs had the best performance overall, but RF performed equally well or better in many simulations; RFs were superior at predicting livestock numbers for all but very large commercial farms. K-NN performed poorly relative to both RF and BRT in all simulations. The predictions of both multi species and single species models for farms and within hypothetical quarantine zones were very close to observed data. These models are generally applicable for livestock estimation with broad applications in disease risk modelling, biosecurity, policy and planning.
Gis-Based Spatial Statistical Analysis of College Graduates Employment
NASA Astrophysics Data System (ADS)
Tang, R.
2012-07-01
Understanding the distribution and employment status of college graduates is urgently needed for the proper allocation of human resources and the overall planning of strategic industries. This study provides empirical evidence regarding the use of geocoding and spatial analysis of the distribution and employment status of college graduates, based on 2004-2008 data from the Wuhan Municipal Human Resources and Social Security Bureau, China. The spatio-temporal distribution of employment units was analyzed with geocoding using ArcGIS software, and stepwise multiple linear regression in SPSS was used to predict employment and to identify spatially associated future demand for enterprises and professionals. The results show that the number of enterprises in the Wuhan East Lake High and New Technology Development Zone increased dramatically from 2004 to 2008 and tended to spread southeastward. Furthermore, the regression models suggest that graduates' major has an important impact on the number employed and on the number of graduates entering pillar industries. In conclusion, the combination of GIS and statistical analysis, which helps to simulate the spatial distribution of employment status, is a potential tool for human resource development research.
Statistical Models of Fracture Relevant to Nuclear-Grade Graphite: Review and Recommendations
NASA Technical Reports Server (NTRS)
Nemeth, Noel N.; Bratton, Robert L.
2011-01-01
The nuclear-grade (low-impurity) graphite needed for the fuel element and moderator material for next-generation (Gen IV) reactors displays large scatter in strength and a nonlinear stress-strain response from damage accumulation. This response can be characterized as quasi-brittle. In this expanded review, relevant statistical failure models for various brittle and quasi-brittle material systems are discussed with regard to strength distribution, size effect, multiaxial strength, and damage accumulation. This includes descriptions of the Weibull, Batdorf, and Burchell models as well as models that describe the strength response of composite materials, which involves distributed damage. Results from lattice simulations are included for a physics-based description of material breakdown. Consideration is given to the predicted transition between brittle and quasi-brittle damage behavior versus the density of damage (level of disorder) within the material system. The literature indicates that weakest-link-based failure modeling approaches appear to be reasonably robust in that they can be applied to materials that display distributed damage, provided that the level of disorder in the material is not too large. The Weibull distribution is argued to be the most appropriate statistical distribution to model the stochastic-strength response of graphite.
Transmission overhaul and replacement predictions using Weibull and renewal theory
NASA Technical Reports Server (NTRS)
Savage, M.; Lewicki, D. G.
1989-01-01
A method to estimate the frequency of transmission overhauls is presented. This method is based on the two-parameter Weibull statistical distribution for component life. A second method is presented to estimate the number of replacement components needed to support the transmission overhaul pattern. The second method is based on renewal theory. Confidence statistics are applied with both methods to improve the statistical estimate of sample behavior. A transmission example is also presented to illustrate the use of the methods. Transmission overhaul frequency and component replacement calculations are included in the example.
Moments of inclination error distribution computer program
NASA Technical Reports Server (NTRS)
Myler, T. R.
1981-01-01
A FORTRAN coded computer program is described which calculates orbital inclination error statistics using a closed-form solution. This solution uses a data base of trajectory errors from actual flights to predict the orbital inclination error statistics. The Scott flight history data base consists of orbit insertion errors in the trajectory parameters - altitude, velocity, flight path angle, flight azimuth, latitude and longitude. The methods used to generate the error statistics are of general interest since they have other applications. Program theory, user instructions, output definitions, subroutine descriptions and detailed FORTRAN coding information are included.
2016-05-31
Final Report: Technical Topic 3.2.2.d, Bayesian and Non-parametric Statistics: Integration of Neural... (reporting period 15-Apr-2014 to 14-Jan-2015; distribution unlimited). The explosives studied included TATP, HMTD, RDX, ammonium nitrate, potassium perchlorate, potassium nitrate, sugar, and TNT.
Cylinders out of a top hat: counts-in-cells for projected densities
NASA Astrophysics Data System (ADS)
Uhlemann, Cora; Pichon, Christophe; Codis, Sandrine; L'Huillier, Benjamin; Kim, Juhan; Bernardeau, Francis; Park, Changbom; Prunet, Simon
2018-06-01
Large deviation statistics is implemented to predict the statistics of cosmic densities in cylinders applicable to photometric surveys. It yields few per cent accurate analytical predictions for the one-point probability distribution function (PDF) of densities in concentric or compensated cylinders; and also captures the density dependence of their angular clustering (cylinder bias). All predictions are found to be in excellent agreement with the cosmological simulation Horizon Run 4 in the quasi-linear regime where standard perturbation theory normally breaks down. These results are combined with a simple local bias model that relates dark matter and tracer densities in cylinders and validated on simulated halo catalogues. This formalism can be used to probe cosmology with existing and upcoming photometric surveys like DES, Euclid or WFIRST containing billions of galaxies.
Statistical wind analysis for near-space applications
NASA Astrophysics Data System (ADS)
Roney, Jason A.
2007-09-01
Statistical wind models were developed based on the existing observational wind data for near-space altitudes between 60 000 and 100 000 ft (18–30 km) above ground level (AGL) at two locations, Akron, OH, USA, and White Sands, NM, USA. These two sites are envisioned as playing a crucial role in the first flights of high-altitude airships. The analysis shown in this paper has not been previously applied to this region of the stratosphere for such an application. Standard statistics were compiled for these data, such as mean, median, maximum wind speed, and standard deviation, and the data were modeled with Weibull distributions. These statistics indicated that, on a yearly average, there is a lull or a “knee” in the wind between 65 000 and 72 000 ft AGL (20–22 km). From the standard statistics, trends at both locations indicated substantial seasonal variation in the mean wind speed at these heights. The yearly and monthly statistical modeling indicated that Weibull distributions were a reasonable model for the data. Forecasts and hindcasts were done by using a Weibull model based on 2004 data and comparing the model with the 2003 and 2005 data. The 2004 distribution was also a reasonable model for these years. Lastly, the Weibull distribution and cumulative function were used to predict the 50%, 95%, and 99% winds, which are directly related to the expected power requirements of a near-space station-keeping airship. These values indicated that using only the standard deviation of the mean may underestimate the operational conditions.
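A short sketch of the percentile calculation described above, using a two-parameter Weibull fit in SciPy on synthetic wind speeds; the shape and scale values are illustrative, not the fitted values from the paper.

import numpy as np
from scipy import stats

# Illustrative sample of wind speeds (m/s) near the stratospheric "knee"
rng = np.random.default_rng(3)
wind = stats.weibull_min.rvs(c=1.8, scale=9.0, size=500, random_state=rng)

# Two-parameter Weibull fit (location fixed at zero) and design percentiles
c, loc, scale = stats.weibull_min.fit(wind, floc=0)
for q in (0.50, 0.95, 0.99):
    speed = stats.weibull_min.ppf(q, c, loc=loc, scale=scale)
    print(f"{q:.0%} wind speed: {speed:.1f} m/s")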
The Statistics of Urban Scaling and Their Connection to Zipf’s Law
Gomez-Lievano, Andres; Youn, HyeJin; Bettencourt, Luís M. A.
2012-01-01
Urban scaling relations characterizing how diverse properties of cities vary on average with their population size have recently been shown to be a general quantitative property of many urban systems around the world. However, in previous studies the statistics of urban indicators were not analyzed in detail, raising important questions about the full characterization of urban properties and how scaling relations may emerge in these larger contexts. Here, we build a self-consistent statistical framework that characterizes the joint probability distributions of urban indicators and city population sizes across an urban system. To develop this framework empirically we use one of the most granular and stochastic urban indicators available, specifically measuring homicides in cities of Brazil, Colombia and Mexico, three nations with high and fast changing rates of violent crime. We use these data to derive the conditional probability of the number of homicides per year given the population size of a city. To do this we use Bayes’ rule together with the estimated conditional probability of city size given their number of homicides and the distribution of total homicides. We then show that scaling laws emerge as expectation values of these conditional statistics. Knowledge of these distributions implies, in turn, a relationship between scaling and population size distribution exponents that can be used to predict Zipf’s exponent from urban indicator statistics. Our results also suggest how a general statistical theory of urban indicators may be constructed from the stochastic dynamics of social interaction processes in cities. PMID:22815745
Nonparametric predictive inference for combining diagnostic tests with parametric copula
NASA Astrophysics Data System (ADS)
Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.
2017-09-01
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests. The area under the ROC curve (AUC) is often used as a measure of the overall performance of the diagnostic test. In this paper, we are interested in developing strategies for combining test results in order to increase diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results while accounting for their dependence structure using a parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations. It uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. A copula, in turn, is a well-known statistical concept for modelling dependence between random variables: a joint distribution function whose marginals are all uniformly distributed, which can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density using a parametric method, namely maximum likelihood estimation (MLE). We investigate the performance of the proposed method on data sets from the literature and discuss results showing how the method performs for different families of copulas. Finally, we briefly outline related challenges and opportunities for future research.
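For reference, a bivariate Gaussian copula density can be written as c(u, v) = φ_ρ(Φ⁻¹(u), Φ⁻¹(v)) / [φ(Φ⁻¹(u)) φ(Φ⁻¹(v))]. The sketch below evaluates this density with SciPy; the correlation parameter ρ is an assumed value, and the Gaussian family is only one of the copula families considered in work of this kind.

import numpy as np
from scipy import stats

def gaussian_copula_density(u, v, rho):
    # c(u, v) = phi_rho(x, y) / (phi(x) * phi(y)), with x = Phi^-1(u), y = Phi^-1(v)
    x, y = stats.norm.ppf(u), stats.norm.ppf(v)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    joint = stats.multivariate_normal(mean=[0.0, 0.0], cov=cov).pdf(np.column_stack([x, y]))
    return joint / (stats.norm.pdf(x) * stats.norm.pdf(y))

# The copula couples any two marginals (e.g. the two test results) through (u, v),
# keeping the dependence model separate from the marginal distributions.
u = np.array([0.2, 0.5, 0.9])
v = np.array([0.3, 0.5, 0.8])
print(gaussian_copula_density(u, v, rho=0.6))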
2017-01-01
Producing predictions of the probabilistic risks of operating materials for given lengths of time at stated operating conditions requires the assimilation of existing deterministic creep life prediction models (that only predict the average failure time) with statistical models that capture the random component of creep. To date, these approaches have rarely been combined to achieve this objective. The first half of this paper therefore provides a summary review of some statistical models to help bridge the gap between these two approaches. The second half of the paper illustrates one possible assimilation using 1Cr1Mo-0.25V steel. The Wilshire equation for creep life prediction is integrated into a discrete hazard-based statistical model: the former is chosen because of its novelty and proven capability in accurately predicting average failure times, and the latter because of its flexibility in modelling the failure time distribution. Using this model it was found that, for example, if this material had been in operation for around 15 years at 823 K and 130 MPa, the chance of failure in the next year is around 35%. However, if this material had been in operation for around 25 years, the chance of failure in the next year rises dramatically to around 80%. PMID:29039773
Mixture EMOS model for calibrating ensemble forecasts of wind speed.
Baran, S; Lerch, S
2016-03-01
Ensemble model output statistics (EMOS) is a statistical tool for post-processing forecast ensembles of weather variables obtained from multiple runs of numerical weather prediction models in order to produce calibrated predictive probability density functions. The EMOS predictive probability density function is given by a parametric distribution with parameters depending on the ensemble forecasts. We propose an EMOS model for calibrating wind speed forecasts based on weighted mixtures of truncated normal (TN) and log-normal (LN) distributions where model parameters and component weights are estimated by optimizing the values of proper scoring rules over a rolling training period. The new model is tested on wind speed forecasts of the 50-member European Centre for Medium-Range Weather Forecasts ensemble, the 11-member Aire Limitée Adaptation dynamique Développement International-Hungary Ensemble Prediction System ensemble of the Hungarian Meteorological Service, and the eight-member University of Washington mesoscale ensemble, and its predictive performance is compared with that of various benchmark EMOS models based on single parametric families and combinations thereof. The results indicate improved calibration of probabilistic forecasts and accuracy of point forecasts in comparison with the raw ensemble and climatological forecasts. The mixture EMOS model significantly outperforms the TN and LN EMOS methods; moreover, it provides better calibrated forecasts than the TN-LN combination model and offers an increased flexibility while avoiding covariate selection problems. © 2016 The Authors. Environmetrics published by John Wiley & Sons Ltd.
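A minimal sketch of evaluating such a TN-LN mixture predictive density with SciPy is shown below; the weight and component parameters are assumed values for illustration, whereas in the EMOS approach they are linked to the ensemble forecasts and estimated by minimizing a proper scoring rule such as the CRPS.

import numpy as np
from scipy import stats

def mixture_emos_pdf(x, w, mu_tn, sig_tn, mu_ln, sig_ln):
    # Weighted mixture of a zero-truncated normal and a log-normal density.
    # In EMOS the component parameters are functions of the ensemble forecasts,
    # estimated over a rolling training period by optimizing a proper scoring rule.
    a = (0.0 - mu_tn) / sig_tn                       # standardized lower bound at 0
    tn = stats.truncnorm.pdf(x, a, np.inf, loc=mu_tn, scale=sig_tn)
    ln = stats.lognorm.pdf(x, s=sig_ln, scale=np.exp(mu_ln))
    return w * tn + (1.0 - w) * ln

x = np.linspace(0.1, 25.0, 6)                        # wind speeds in m/s
print(mixture_emos_pdf(x, w=0.6, mu_tn=6.0, sig_tn=2.5, mu_ln=1.7, sig_ln=0.4))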
NASA Astrophysics Data System (ADS)
Jesenska, Sona; Liess, Mathias; Schäfer, Ralf; Beketov, Mikhail; Blaha, Ludek
2013-04-01
Species sensitivity distribution (SSD) is a statistical method broadly used in the ecotoxicological risk assessment of chemicals. Originally it was used for prospective risk assessment of single substances, but nowadays it is becoming more important also in the retrospective risk assessment of mixtures, including at the catchment scale. In the present work, SSD predictions (impacts of mixtures consisting of 25 pesticides; data from several catchments in Germany, France and Finland) were compared with SPEAR-pesticides, which is a bioindicator index based on biological traits responsive to the effects of pesticides and post-contamination recovery. The results showed statistically significant correlations (Pearson's R, p<0.01) between SSD (predicted msPAF values) and values of SPEAR-pesticides (based on field biomonitoring observations). Comparisons of the thresholds established for the SSD and SPEAR approaches (SPEAR-pesticides=45%, i.e. LOEC level, and msPAF = 0.05 for SSD, i.e. HC5) showed that use of chronic toxicity data significantly improved the agreement between the two methods, but the SPEAR-pesticides index was still more sensitive. Taken together, the validation study shows good potential of SSD models in predicting the real impacts of micropollutant mixtures on natural communities of aquatic biota.
Testing the Predictive Power of Coulomb Stress on Aftershock Sequences
NASA Astrophysics Data System (ADS)
Woessner, J.; Lombardi, A.; Werner, M. J.; Marzocchi, W.
2009-12-01
Empirical and statistical models of clustered seismicity are usually strongly stochastic and perceived to be uninformative in their forecasts, since only marginal distributions are used, such as the Omori-Utsu and Gutenberg-Richter laws. In contrast, so-called physics-based aftershock models, based on seismic rate changes calculated from Coulomb stress changes and rate-and-state friction, make more specific predictions: anisotropic stress shadows and multiplicative rate changes. We test the predictive power of models based on Coulomb stress changes against statistical models, including the popular Short Term Earthquake Probabilities and Epidemic-Type Aftershock Sequences models: We score and compare retrospective forecasts on the aftershock sequences of the 1992 Landers, USA, the 1997 Colfiorito, Italy, and the 2008 Selfoss, Iceland, earthquakes. To quantify predictability, we use likelihood-based metrics that test the consistency of the forecasts with the data, including modified and existing tests used in prospective forecast experiments within the Collaboratory for the Study of Earthquake Predictability (CSEP). Our results indicate that a statistical model performs best. Moreover, two Coulomb model classes seem unable to compete: Models based on deterministic Coulomb stress changes calculated from a given fault-slip model, and those based on fixed receiver faults. One model of Coulomb stress changes does perform well and sometimes outperforms the statistical models, but its predictive information is diluted, because of uncertainties included in the fault-slip model. Our results suggest that models based on Coulomb stress changes need to incorporate stochastic features that represent model and data uncertainty.
THE STATISTICAL MECHANICS OF PLANET ORBITS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tremaine, Scott, E-mail: tremaine@ias.edu
2015-07-10
The final “giant-impact” phase of terrestrial planet formation is believed to begin with a large number of planetary “embryos” on nearly circular, coplanar orbits. Mutual gravitational interactions gradually excite their eccentricities until their orbits cross and they collide and merge; through this process the number of surviving bodies declines until the system contains a small number of planets on well-separated, stable orbits. In this paper we explore a simple statistical model for the orbit distribution of planets formed by this process, based on the sheared-sheet approximation and the ansatz that the planets explore uniformly all of the stable region of phase space. The model provides analytic predictions for the distribution of eccentricities and semimajor axis differences, correlations between orbital elements of nearby planets, and the complete N-planet distribution function, in terms of a single parameter, the “dynamical temperature,” that is determined by the planetary masses. The predicted properties are generally consistent with N-body simulations of the giant-impact phase and with the distribution of semimajor axis differences in the Kepler catalog of extrasolar planets. A similar model may apply to the orbits of giant planets if these orbits are determined mainly by dynamical evolution after the planets have formed and the gas disk has disappeared.
Kraan, Casper; Aarts, Geert; Van der Meer, Jaap; Piersma, Theunis
2010-06-01
Ongoing statistical sophistication allows a shift from describing species' spatial distributions toward statistically disentangling the possible roles of environmental variables in shaping species distributions. Based on a landscape-scale benthic survey in the Dutch Wadden Sea, we show the merits of spatially explicit generalized estimating equations (GEE). The intertidal macrozoobenthic species, Macoma balthica, Cerastoderma edule, Marenzelleria viridis, Scoloplos armiger, Corophium volutator, and Urothoe poseidonis served as test cases, with median grain-size and inundation time as typical environmental explanatory variables. GEEs outperformed spatially naive generalized linear models (GLMs), and removed much residual spatial structure, indicating the importance of median grain-size and inundation time in shaping landscape-scale species distributions in the intertidal. GEE regression coefficients were smaller than those attained with GLM, and GEE standard errors were larger. The best fitting GEE for each species was used to predict species' density in relation to median grain-size and inundation time. Although no drastic changes were noted compared to previous work that described habitat suitability for benthic fauna in the Wadden Sea, our predictions provided more detailed and unbiased estimates of the determinants of species-environment relationships. We conclude that spatial GEEs offer the necessary methodological advances to further steps toward linking pattern to process.
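A minimal sketch of a spatially clustered GEE fit with statsmodels on synthetic data; the variable names, cluster definition, and coefficient values are illustrative stand-ins, not the Wadden Sea survey data.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Synthetic stand-in for the survey: per-core counts of one species, with median
# grain size and inundation time as predictors and sampling blocks as clusters.
rng = np.random.default_rng(7)
n = 400
df = pd.DataFrame({
    "grain": rng.normal(180.0, 40.0, n),   # median grain size (micrometres)
    "inund": rng.uniform(0.2, 0.8, n),     # fraction of time inundated
    "block": rng.integers(0, 40, n),       # spatial cluster identifier
})
mu = np.exp(0.5 + 0.004 * (df["grain"] - 180.0) + 1.2 * (df["inund"] - 0.5))
df["count"] = rng.poisson(mu)

X = sm.add_constant(df[["grain", "inund"]])
gee = sm.GEE(df["count"], X, groups=df["block"],
             family=sm.families.Poisson(),
             cov_struct=sm.cov_struct.Exchangeable())
print(gee.fit().summary())                 # robust SEs account for spatial clustering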
Groen, Iris I. A.; Ghebreab, Sennay; Lamme, Victor A. F.; Scholte, H. Steven
2012-01-01
The visual world is complex and continuously changing. Yet, our brain transforms patterns of light falling on our retina into a coherent percept within a few hundred milliseconds. Possibly, low-level neural responses already carry substantial information to facilitate rapid characterization of the visual input. Here, we computationally estimated low-level contrast responses to computer-generated naturalistic images, and tested whether spatial pooling of these responses could predict image similarity at the neural and behavioral level. Using EEG, we show that statistics derived from pooled responses explain a large amount of variance between single-image evoked potentials (ERPs) in individual subjects. Dissimilarity analysis on multi-electrode ERPs demonstrated that large differences between images in pooled response statistics are predictive of more dissimilar patterns of evoked activity, whereas images with little difference in statistics give rise to highly similar evoked activity patterns. In a separate behavioral experiment, images with large differences in statistics were judged as different categories, whereas images with little differences were confused. These findings suggest that statistics derived from low-level contrast responses can be extracted in early visual processing and can be relevant for rapid judgment of visual similarity. We compared our results with two other, well- known contrast statistics: Fourier power spectra and higher-order properties of contrast distributions (skewness and kurtosis). Interestingly, whereas these statistics allow for accurate image categorization, they do not predict ERP response patterns or behavioral categorization confusions. These converging computational, neural and behavioral results suggest that statistics of pooled contrast responses contain information that corresponds with perceived visual similarity in a rapid, low-level categorization task. PMID:23093921
NASA Astrophysics Data System (ADS)
Kumar, Jagadish; Ananthakrishna, G.
2018-01-01
Scale-invariant power-law distributions for acoustic emission signals are ubiquitous in several plastically deforming materials. However, power-law distributions for acoustic emission energies are reported in distinctly different plastically deforming situations such as hcp and fcc single and polycrystalline samples exhibiting smooth stress-strain curves and in dilute metallic alloys exhibiting discontinuous flow. This is surprising since the underlying dislocation mechanisms in these two types of deformations are very different. So far, there have been no models that predict the power-law statistics for discontinuous flow. Furthermore, the statistics of the acoustic emission signals in jerky flow is even more complex, requiring multifractal measures for a proper characterization. There has been no model that explains the complex statistics either. Here we address the problem of statistical characterization of the acoustic emission signals associated with the three types of the Portevin-Le Chatelier bands. Following our recently proposed general framework for calculating acoustic emission, we set up a wave equation for the elastic degrees of freedom with a plastic strain rate as a source term. The energy dissipated during acoustic emission is represented by the Rayleigh-dissipation function. Using the plastic strain rate obtained from the Ananthakrishna model for the Portevin-Le Chatelier effect, we compute the acoustic emission signals associated with the three Portevin-Le Chatelier bands and the Lüders-like band. The so-calculated acoustic emission signals are used for further statistical characterization. Our results show that the model predicts power-law statistics for all the acoustic emission signals associated with the three types of Portevin-Le Chatelier bands with the exponent values increasing with increasing strain rate. The calculated multifractal spectra corresponding to the acoustic emission signals associated with the three band types have a maximum spread for the type C bands and decreasing with types B and A. We further show that the acoustic emission signals associated with Lüders-like band also exhibit a power-law distribution and multifractality.
NASA Astrophysics Data System (ADS)
Lee, Richard; Chan, Elisa K.; Kosztyla, Robert; Liu, Mitchell; Moiseenko, Vitali
2012-12-01
The relationship between rectal dose distribution and the incidence of late rectal complications following external-beam radiotherapy has been previously studied using dose-volume histograms or dose-surface histograms. However, they do not account for the spatial dose distribution. This study proposes a metric based on both surface dose and distance that can predict the incidence of rectal bleeding in prostate cancer patients treated with radical radiotherapy. One hundred and forty-four patients treated with radical radiotherapy for prostate cancer were prospectively followed to record the incidence of grade ≥2 rectal bleeding. Radiotherapy plans were used to evaluate a dose-distance metric that accounts for the dose and its spatial distribution on the rectal surface, characterized by a logistic weighting function with slope a and inflection point d0. This was compared to the effective dose obtained from dose-surface histograms, characterized by the parameter n, which describes sensitivity to hot spots. The log-rank test was used to determine statistically significant (p < 0.05) cut-off values for the dose-distance metric and effective dose that predict the occurrence of rectal bleeding. For the dose-distance metric, only d0 = 25 and 30 mm combined with a > 5 led to statistically significant cut-offs. For the effective dose metric, only values of n in the range 0.07-0.35 led to statistically significant cut-offs. The proposed dose-distance metric is a predictor of rectal bleeding in prostate cancer patients treated with radiotherapy. Both the dose-distance metric and the effective dose metric indicate that the incidence of grade ≥2 rectal bleeding is sensitive to localized damage to the rectal surface.
NASA Technical Reports Server (NTRS)
Nguyen, Andrew; Gole, Alexander; Randall, Jarom; Dlott, Glade; Zhang, Sylvia; Alfaro, Brian; Schmidt, Cindy; Skiles, J. W.
2011-01-01
Mapping and predicting the spatial distribution of invasive plant species is central to habitat management, however difficult to implement at landscape and regional scales. Remote sensing techniques can reduce the impact field campaigns have on these ecologically sensitive areas and can provide a regional and multi-temporal view of invasive species spread. Invasive perennial pepperweed (Lepidium latifolium) is now widespread in fragmented estuaries of the South San Francisco Bay, and is shown to degrade native vegetation in estuaries and adjacent habitats, thereby reducing forage and shelter for wildlife. The purpose of this study is to map the present distribution of pepperweed in estuarine areas of the South San Francisco Bay Salt Pond Restoration Project (Alviso, CA), and create a habitat suitability model to predict future spread. Pepperweed reflectance data were collected in-situ with a GER 1500 spectroradiometer along with 88 corresponding pepperweed presence and absence points used for building the statistical models. The spectral angle mapper (SAM) classification algorithm was used to distinguish the reflectance spectrum of pepperweed and map its distribution using an image from EO-1 Hyperion. To map pepperweed, we performed a supervised classification on an ASTER image with a resulting classification accuracy of 71.8%. We generated a weighted overlay analysis model within a geographic information system (GIS) framework to predict areas in the study site most susceptible to pepperweed colonization. Variables for the model included propensity for disturbance, status of pond restoration, proximity to water channels, and terrain curvature. A Generalized Additive Model (GAM) was also used to generate a probability map and investigate the statistical probability that each variable contributed to predict pepperweed spread. Results from the GAM revealed distance to channels, distance to ponds and curvature were statistically significant (p < 0.01) in determining the locations of suitable pepperweed habitats.
Blind prediction of cyclohexane-water distribution coefficients from the SAMPL5 challenge.
Bannan, Caitlin C; Burley, Kalistyn H; Chiu, Michael; Shirts, Michael R; Gilson, Michael K; Mobley, David L
2016-11-01
In the recent SAMPL5 challenge, participants submitted predictions for cyclohexane/water distribution coefficients for a set of 53 small molecules. Distribution coefficients (log D) replace the hydration free energies that were a central part of the past five SAMPL challenges. A wide variety of computational methods were represented by the 76 submissions from 18 participating groups. Here, we analyze submissions by a variety of error metrics and provide details for a number of reference calculations we performed. As in the SAMPL4 challenge, we assessed the ability of participants to evaluate not just their statistical uncertainty, but their model uncertainty-how well they can predict the magnitude of their model or force field error for specific predictions. Unfortunately, this remains an area where prediction and analysis need improvement. In SAMPL4 the top performing submissions achieved a root-mean-squared error (RMSE) around 1.5 kcal/mol. If we anticipate accuracy in log D predictions to be similar to the hydration free energy predictions in SAMPL4, the expected error here would be around 1.54 log units. Only a few submissions had an RMSE below 2.5 log units in their predicted log D values. However, distribution coefficients introduced complexities not present in past SAMPL challenges, including tautomer enumeration, that are likely to be important in predicting biomolecular properties of interest to drug discovery, therefore some decrease in accuracy would be expected. Overall, the SAMPL5 distribution coefficient challenge provided great insight into the importance of modeling a variety of physical effects. We believe these types of measurements will be a promising source of data for future blind challenges, especially in view of the relatively straightforward nature of the experiments and the level of insight provided.
Piernas Sánchez, C M; Morales Falo, E M; Zamora Navarro, S; Garaulet Aza, M
2010-01-01
The excess of visceral abdominal adipose tissue is one of the major concerns in obesity and its clinical treatment. To apply the two-dimensional predictive equation proposed by Garaulet et al. to determine the abdominal fat distribution and to compare the results with the body composition obtained by multi-frequency bioelectrical impedance analysis (M-BIA). We studied 230 women, who underwent anthropometry and M-BIA. The predictive equation was applied. Multivariate lineal and partial correlation analyses were performed with control for BMI and % body fat, using SPSS 15.0 with statistical significance P < 0.05. Overall, women were considered as having subcutaneous distribution of abdominal fat. Truncal fat, regional fat and muscular mass were negatively associated with VA/SA(predicted), while the visceral index obtained by M-BIA was positively correlated with VA/SA(predicted). The predictive equation may be useful in the clinical practice to obtain an accurate, costless and safe classification of abdominal obesity.
NASA Astrophysics Data System (ADS)
Zhang, Yonggen; Schaap, Marcel G.
2017-04-01
Pedotransfer functions (PTFs) have been widely used to predict soil hydraulic parameters in lieu of expensive laboratory or field measurements. Rosetta (Schaap et al., 2001, denoted as Rosetta1) is one of many PTFs and is based on artificial neural network (ANN) analysis coupled with the bootstrap re-sampling method, which allows the estimation of van Genuchten water retention parameters (van Genuchten, 1980, abbreviated here as VG), saturated hydraulic conductivity (Ks), and their uncertainties. In this study, we present an improved set of hierarchical pedotransfer functions (Rosetta3) that unify the water retention and Ks submodels into one. Parameter uncertainty of the fit of the VG curve to the original retention data is used in the ANN calibration procedure to reduce bias of parameters predicted by the new PTF. One thousand bootstrap replicas were used to calibrate the new models, compared to 60 or 100 in Rosetta1, thus allowing the uni-variate and bi-variate probability distributions of predicted parameters to be quantified in greater detail. We determined the optimal weights for VG parameters and Ks, the optimal number of hidden nodes in the ANN, and the number of bootstrap replicas required for statistically stable estimates. Results show that matric potential-dependent bias was reduced significantly while root mean square error (RMSE) for water content was reduced modestly; RMSE for Ks was increased by 0.9% (H3w) to 3.3% (H5w) in the new models on the log scale of Ks compared with the Rosetta1 model. It was found that estimated distributions of parameters were mildly non-Gaussian and could instead be described rather well with heavy-tailed α-stable distributions. On the other hand, arithmetic means had only a small estimation bias for most textures when compared with the mean-like "shift" parameter of the α-stable distributions. Arithmetic means and (co-)variances are therefore still recommended as summary statistics of the estimated distributions. However, it may be necessary to parameterize the distributions in different ways if the new estimates are used in stochastic analyses of vadose zone flow and transport. Rosetta1 and Rosetta3 were implemented in the Python programming language, and the source code as well as additional documentation is available at: http://www.cals.arizona.edu/research/rosettav3.html.
NASA Astrophysics Data System (ADS)
Zhang, Pei; Barlow, Robert; Masri, Assaad; Wang, Haifeng
2016-11-01
The mixture fraction and progress variable are often used as independent variables for describing turbulent premixed and non-premixed flames. There is a growing interest in using these two variables for describing partially premixed flames. The joint statistical distribution of the mixture fraction and progress variable is of great interest in developing models for partially premixed flames. In this work, we conduct predictive studies of the joint statistics of mixture fraction and progress variable in a series of piloted methane jet flames with inhomogeneous inlet flows. The employed models combine large eddy simulations with the Monte Carlo probability density function (PDF) method. The joint PDFs and marginal PDFs are examined in detail by comparing the model predictions and the measurements. Different presumed shapes of the joint PDFs are also evaluated.
Statistical distributions of extreme dry spell in Peninsular Malaysia
NASA Astrophysics Data System (ADS)
Zin, Wan Zawiah Wan; Jemain, Abdul Aziz
2010-11-01
Statistical distributions of annual extreme (AE) series and partial duration (PD) series for dry-spell event are analyzed for a database of daily rainfall records of 50 rain-gauge stations in Peninsular Malaysia, with recording period extending from 1975 to 2004. The three-parameter generalized extreme value (GEV) and generalized Pareto (GP) distributions are considered to model both series. In both cases, the parameters of these two distributions are fitted by means of the L-moments method, which provides a robust estimation of them. The goodness-of-fit (GOF) between empirical data and theoretical distributions are then evaluated by means of the L-moment ratio diagram and several goodness-of-fit tests for each of the 50 stations. It is found that for the majority of stations, the AE and PD series are well fitted by the GEV and GP models, respectively. Based on the models that have been identified, we can reasonably predict the risks associated with extreme dry spells for various return periods.
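A brief sketch of the return-level calculation implied by a fitted GEV model, using SciPy on synthetic annual maxima; note the paper fits parameters by L-moments rather than maximum likelihood, and all numbers below are illustrative.

import numpy as np
from scipy import stats

# Illustrative annual-maximum dry-spell lengths (days) for a single station
rng = np.random.default_rng(11)
annual_max = stats.genextreme.rvs(c=-0.1, loc=25.0, scale=6.0, size=30, random_state=rng)

shape, loc, scale = stats.genextreme.fit(annual_max)    # maximum-likelihood fit
for T in (10, 50, 100):                                 # return periods in years
    level = stats.genextreme.isf(1.0 / T, shape, loc=loc, scale=scale)
    print(f"{T}-year dry spell: {level:.1f} days")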
Eddy, Sean R.
2008-01-01
Sequence database searches require accurate estimation of the statistical significance of scores. Optimal local sequence alignment scores follow Gumbel distributions, but determining an important parameter of the distribution (λ) requires time-consuming computational simulation. Moreover, optimal alignment scores are less powerful than probabilistic scores that integrate over alignment uncertainty (“Forward” scores), but the expected distribution of Forward scores remains unknown. Here, I conjecture that both expected score distributions have simple, predictable forms when full probabilistic modeling methods are used. For a probabilistic model of local sequence alignment, optimal alignment bit scores (“Viterbi” scores) are Gumbel-distributed with constant λ = log 2, and the high scoring tail of Forward scores is exponential with the same constant λ. Simulation studies support these conjectures over a wide range of profile/sequence comparisons, using 9,318 profile-hidden Markov models from the Pfam database. This enables efficient and accurate determination of expectation values (E-values) for both Viterbi and Forward scores for probabilistic local alignments. PMID:18516236
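A small sketch of the E-value calculation implied by these conjectures; the location parameters mu and tau are assumed, fitted offsets, and the database size is illustrative.

import numpy as np

LAMBDA = np.log(2.0)                 # conjectured slope for bit scores

def viterbi_pvalue(score, mu):
    # Gumbel tail conjectured for optimal-alignment (Viterbi) bit scores
    return 1.0 - np.exp(-np.exp(-LAMBDA * (score - mu)))

def forward_pvalue(score, tau):
    # Exponential high-score tail conjectured for Forward bit scores
    return min(1.0, np.exp(-LAMBDA * (score - tau)))

def e_value(pvalue, n_comparisons):
    return pvalue * n_comparisons

# A Forward bit score of 20 against a database of 1e7 sequences, with an assumed
# tail offset tau of 2 bits (in practice mu and tau are fitted per profile):
print(e_value(forward_pvalue(20.0, tau=2.0), 1e7))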
Results on three predictions for July 2012 federal elections in Mexico based on past regularities.
Hernández-Saldaña, H
2013-01-01
The Presidential Election in Mexico of July 2012 was the third time that the PREP, the Previous Electoral Results Program, has been used. PREP gives voting outcomes based on the electoral certificates of each polling station as they arrive at capture centers. In previous elections, some statistical regularities had been observed; three of them were selected to make predictions and were published in arXiv:1207.0078 [physics.soc-ph]. Using the database made public in July 2012, two of the predictions were completely fulfilled, while the third one was measured and confirmed using the database obtained upon request to the electoral authorities. The first two predictions confirmed by actual measures are: (ii) The Partido Revolucionario Institucional, PRI, is a sprinter and has a better performance in polling stations arriving late at capture centers during the process. (iii) The distribution of the vote for this party is well described by a smooth function named a Daisy model; a Gamma distribution, also compatible with a Daisy model, fits the distribution as well. The third prediction confirms that errare humanum est, since the error distributions of all the self-consistency variables appeared as a central power law with lateral lobes, as in the 2000 and 2006 electoral processes. The three measured regularities appeared regardless of the political environment.
NASA Astrophysics Data System (ADS)
Rings, Joerg; Vrugt, Jasper A.; Schoups, Gerrit; Huisman, Johan A.; Vereecken, Harry
2012-05-01
Bayesian model averaging (BMA) is a standard method for combining predictive distributions from different models. In recent years, this method has enjoyed widespread application and use in many fields of study to improve the spread-skill relationship of forecast ensembles. The BMA predictive probability density function (pdf) of any quantity of interest is a weighted average of pdfs centered around the individual (possibly bias-corrected) forecasts, where the weights are equal to posterior probabilities of the models generating the forecasts, and reflect the individual models' skill over a training (calibration) period. The original BMA approach presented by Raftery et al. (2005) assumes that the conditional pdf of each individual model is adequately described with a rather standard Gaussian or Gamma statistical distribution, possibly with a heteroscedastic variance. Here we analyze the advantages of using BMA with a flexible representation of the conditional pdf. A joint particle filtering and Gaussian mixture modeling framework is presented to derive analytically, as closely and consistently as possible, the evolving forecast density (conditional pdf) of each constituent ensemble member. The median forecasts and evolving conditional pdfs of the constituent models are subsequently combined using BMA to derive one overall predictive distribution. This paper introduces the theory and concepts of this new ensemble postprocessing method, and demonstrates its usefulness and applicability by numerical simulation of the rainfall-runoff transformation using discharge data from three different catchments in the contiguous United States. The revised BMA method achieves significantly lower prediction errors than the original default BMA method (due to filtering) with predictive uncertainty intervals that are substantially smaller but still statistically coherent (due to the use of a time-variant conditional pdf).
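For orientation, a minimal sketch of the standard Gaussian-kernel BMA predictive density is given below; all weights, spreads and forecasts are illustrative, and the revised method described in the paper replaces the fixed Gaussian kernels with conditional pdfs derived by particle filtering and Gaussian mixture modeling.

import numpy as np
from scipy import stats

def bma_predictive_pdf(y, forecasts, weights, sigmas, biases=None):
    # Gaussian-kernel BMA: a weighted average of normal pdfs centred on the
    # (optionally bias-corrected) forecasts; weights reflect each model's skill
    # over the training period.
    forecasts = np.asarray(forecasts, dtype=float)
    if biases is not None:
        forecasts = forecasts - np.asarray(biases, dtype=float)
    components = stats.norm.pdf(y, loc=forecasts, scale=np.asarray(sigmas, dtype=float))
    return float(np.dot(weights, components))

# Three model forecasts of discharge (m^3/s) with illustrative weights and spreads
print(bma_predictive_pdf(12.0, forecasts=[10.5, 13.2, 11.8],
                         weights=[0.5, 0.3, 0.2], sigmas=[1.5, 2.0, 1.8]))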
Effect of Individual Component Life Distribution on Engine Life Prediction
NASA Technical Reports Server (NTRS)
Zaretsky, Erwin V.; Hendricks, Robert C.; Soditus, Sherry M.
2003-01-01
The effect of individual engine component life distributions on engine life prediction was determined. A Weibull-based life and reliability analysis of the NASA Energy Efficient Engine was conducted. The engine's life at a 95 and 99.9 percent probability of survival was determined based upon the engine manufacturer's original life calculations and assumed values of each of the component's cumulative life distributions as represented by a Weibull slope. The lives of the high-pressure turbine (HPT) disks and blades were also evaluated individually and as a system in a similar manner. Knowing the statistical cumulative distribution of each engine component with reasonable engineering certainty is a condition precedent to predicting the life and reliability of an entire engine. The life of a system at a given reliability will be less than the lowest-lived component in the system at the same reliability (probability of survival). Where Weibull slopes of all the engine components are equal, the Weibull slope had a minimal effect on engine L0.1 life prediction. However, at a probability of survival of 95 percent (L5 life), life decreased with increasing Weibull slope.
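A minimal sketch of the weakest-link calculation underlying this kind of analysis: the system life at a given reliability for a set of two-parameter Weibull components. The slopes and characteristic lives below are illustrative assumptions, not the engine values used in the study.

import numpy as np
from scipy.optimize import brentq

def system_life(components, reliability):
    # Life at a given probability of survival for a series (weakest-link) system of
    # two-parameter Weibull components (beta, eta), each with survival
    # R_i(t) = exp(-(t / eta) ** beta); the system survival is the product of the R_i.
    target = -np.log(reliability)
    g = lambda t: sum((t / eta) ** beta for beta, eta in components) - target
    return brentq(g, 1e-9, 1e9)

# Illustrative (Weibull slope, characteristic life in hours) for three components
components = [(2.0, 30_000.0), (1.5, 50_000.0), (3.0, 40_000.0)]
for R in (0.95, 0.999):
    print(f"system life at {R:.1%} survival: {system_life(components, R):.0f} h")

# The system life sits below the lowest-lived component at the same reliability
print(min(eta * (-np.log(0.95)) ** (1.0 / beta) for beta, eta in components))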
Epping, Ruben; Panne, Ulrich; Falkenhagen, Jana
2017-02-07
Statistical ethylene oxide (EO) and propylene oxide (PO) copolymers of different monomer compositions and different average molar masses additionally containing two kinds of end groups (FTD) were investigated by ultra high pressure liquid chromatography under critical conditions (UP-LCCC) combined with electrospray ionization time-of-flight mass spectrometry (ESI-TOF-MS). Theoretical predictions of the existence of a critical adsorption point (CPA) for statistical copolymers with a given chemical and sequence distribution could be studied and confirmed. A fundamentally new approach to determine these critical conditions in a copolymer, alongside the inevitable chemical composition distribution (CCD), with mass spectrometric detection, is described. The shift of the critical eluent composition with the monomer composition of the polymers was determined. Due to the broad molar mass distribution (MMD) and the presumed existence of different end group functionalities as well as monomer sequence distribution (MSD), gradient separation only by CCD was not possible. Therefore, isocratic separation conditions at the CPA of definite CCD fractions were developed. Although the various present distributions partly superimposed the separation process, the goal of separation by end group functionality was still achieved on the basis of the additional dimension of ESI-TOF-MS. The existence of HO-H besides the desired allylO-H end group functionalities was confirmed and their amount estimated. Furthermore, indications for an MSD were found by UPLC/MS/MS measurements. This approach offers for the first time the possibility to obtain a fingerprint of a broadly distributed statistical copolymer including MMD, FTD, CCD, and MSD.
Statistical properties of cross-correlation in the Korean stock market
NASA Astrophysics Data System (ADS)
Oh, G.; Eom, C.; Wang, F.; Jung, W.-S.; Stanley, H. E.; Kim, S.
2011-01-01
We investigate the statistical properties of the cross-correlation matrix between individual stocks traded in the Korean stock market using random matrix theory (RMT) and observe how these affect the portfolio weights in Markowitz portfolio theory. We find that the distribution of the cross-correlation matrix elements is positively skewed and changes over time. We find that the eigenvalue distribution of the original cross-correlation matrix deviates from the eigenvalues predicted by the RMT, and the largest eigenvalue is 52 times larger than the maximum value among the eigenvalues predicted by the RMT. The β_{473} coefficient, which reflects the largest eigenvalue property, is 0.8, while one of the eigenvalues in the RMT is approximately zero. Notably, we show that the entropy function E(σ) of the portfolio risk σ for the original and filtered cross-correlation matrices is consistent with a power-law function, E(σ) ∼ σ^{-γ}, with exponent γ ≈ 2.92, and that the exponent decreases significantly for the Asian currency crisis period.
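A short sketch of the RMT comparison on synthetic return data: the largest eigenvalue of the sample correlation matrix is compared with the Marchenko-Pastur upper edge (1 + √(N/T))². The market-mode strength and matrix dimensions are illustrative, not the Korean-market values.

import numpy as np

# Illustrative returns: T observations of N weakly dependent stocks plus a common
# "market mode" that produces one large eigenvalue, as in the empirical data.
rng = np.random.default_rng(5)
N, T = 400, 1200
market = rng.normal(size=T)
returns = rng.normal(size=(T, N)) + 0.4 * market[:, None]

corr = np.corrcoef(returns, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)

Q = T / N
lam_max_rmt = (1.0 + np.sqrt(1.0 / Q)) ** 2     # Marchenko-Pastur upper edge
print(f"RMT upper edge: {lam_max_rmt:.2f}")
print(f"largest empirical eigenvalue: {eigvals[-1]:.2f}")  # far above the RMT edge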
UVPROM dosimetry, microdosimetry and applications to SEU and extreme value theory
NASA Astrophysics Data System (ADS)
Scheick, Leif Zebediah
A new method is described for characterizing a device in terms of the statistical distribution of first failures. The method is based on the erasure of a commercial Ultra-Violet erasable Programmable Read Only Memory (UVPROM). The method of readout would be used on a spacecraft or in other restrictive radiation environments. The measurement of the charge remaining on the floating gate is used to determine absorbed dose. The method of determining dose does not require the detector to be destroyed or erased, nor does it affect the ability to take further measurements. This is compared to extreme value theory applied to the statistical distributions that apply to this device. This technique predicts the threshold of Single Event Effects (SEE), such as anomalous changes in erasure time in programmable devices due to high microdose energy-deposition events. This technique also allows for advanced, non-destructive screening of single microelectronic devices for predictable response in stressful (i.e., radiation) environments.
Mitra, Rajib; Jordan, Michael I.; Dunbrack, Roland L.
2010-01-01
Distributions of the backbone dihedral angles of proteins have been studied for over 40 years. While many statistical analyses have been presented, only a handful of probability densities are publicly available for use in structure validation and structure prediction methods. The available distributions differ in a number of important ways, which determine their usefulness for various purposes. These include: 1) input data size and criteria for structure inclusion (resolution, R-factor, etc.); 2) filtering of suspect conformations and outliers using B-factors or other features; 3) secondary structure of input data (e.g., whether helix and sheet are included; whether beta turns are included); 4) the method used for determining probability densities ranging from simple histograms to modern nonparametric density estimation; and 5) whether they include nearest neighbor effects on the distribution of conformations in different regions of the Ramachandran map. In this work, Ramachandran probability distributions are presented for residues in protein loops from a high-resolution data set with filtering based on calculated electron densities. Distributions for all 20 amino acids (with cis and trans proline treated separately) have been determined, as well as 420 left-neighbor and 420 right-neighbor dependent distributions. The neighbor-independent and neighbor-dependent probability densities have been accurately estimated using Bayesian nonparametric statistical analysis based on the Dirichlet process. In particular, we used hierarchical Dirichlet process priors, which allow sharing of information between densities for a particular residue type and different neighbor residue types. The resulting distributions are tested in a loop modeling benchmark with the program Rosetta, and are shown to improve protein loop conformation prediction significantly. The distributions are available at http://dunbrack.fccc.edu/hdp. PMID:20442867
A statistical model for radar images of agricultural scenes
NASA Technical Reports Server (NTRS)
Frost, V. S.; Shanmugan, K. S.; Holtzman, J. C.; Stiles, J. A.
1982-01-01
The presently derived and validated statistical model for radar images containing many different homogeneous fields predicts the probability density functions of radar images of entire agricultural scenes, thereby allowing histograms of large scenes composed of a variety of crops to be described. Seasat-A SAR images of agricultural scenes are accurately predicted by the model on the basis of three assumptions: each field has the same SNR, all target classes cover approximately the same area, and the true reflectivity characterizing each individual target class is a uniformly distributed random variable. The model is expected to be useful in the design of data processing algorithms and for scene analysis using radar images.
Establishing the kinetics of ballistic-to-diffusive transition using directional statistics
NASA Astrophysics Data System (ADS)
Liu, Pai; Heinson, William R.; Sumlin, Benjamin J.; Shen, Kuan-Yu; Chakrabarty, Rajan K.
2018-04-01
We establish the kinetics of ballistic-to-diffusive (BD) transition observed in two-dimensional random walk using directional statistics. Directional correlation is parameterized using the walker's turning angle distribution, which follows the commonly adopted wrapped Cauchy distribution (WCD) function. During the BD transition, the concentration factor (ρ) governing the WCD shape is observed to decrease from its initial value. We next analytically derive the relationship between effective ρ and time, which essentially quantifies the BD transition rate. The prediction of our kinetic expression agrees well with the empirical datasets obtained from correlated random walk simulation. We further connect our formulation with the conventionally used scaling relationship between the walker's mean-square displacement and time.
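The following is a minimal sketch of a correlated random walk of the kind referred to above, with turning angles drawn from scipy's wrapped Cauchy distribution; the step length, number of steps, and concentration factor ρ are illustrative choices, not values from the study.

```python
import numpy as np
from scipy.stats import wrapcauchy

def correlated_random_walk(n_steps, rho, step_len=1.0, seed=0):
    """2-D correlated random walk whose turning angles follow a wrapped
    Cauchy distribution with concentration factor rho (0 < rho < 1)."""
    turns = wrapcauchy.rvs(rho, size=n_steps, random_state=seed)
    heading = np.cumsum(turns)          # heading accumulates the turning angles
    x = np.cumsum(step_len * np.cos(heading))
    y = np.cumsum(step_len * np.sin(heading))
    return x, y

# Strong directional persistence (rho close to 1) gives near-ballistic motion;
# rho -> 0 recovers an uncorrelated, diffusive walk.
x, y = correlated_random_walk(10_000, rho=0.9)
print(f"squared end-to-end distance: {x[-1]**2 + y[-1]**2:.1f}")
```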
Wang, Mingyu; Han, Lijuan; Liu, Shasha; Zhao, Xuebing; Yang, Jinghua; Loh, Soh Kheang; Sun, Xiaomin; Zhang, Chenxi; Fang, Xu
2015-09-01
Renewable energy from lignocellulosic biomass has been deemed an alternative to depleting fossil fuels. In order to improve this technology, we aim to develop robust mathematical models for the enzymatic lignocellulose degradation process. By analyzing 96 groups of previously published and newly obtained lignocellulose saccharification results and fitting them to the Weibull distribution, we discovered that Weibull statistics can accurately predict lignocellulose saccharification data, regardless of the type of substrates, enzymes and saccharification conditions. A mathematical model for enzymatic lignocellulose degradation was subsequently constructed based on Weibull statistics. Further analysis of the mathematical structure of the model and experimental saccharification data showed the significance of the two parameters in this model. In particular, the λ value, defined as the characteristic time, represents the overall performance of the saccharification system. This suggestion was further supported by statistical analysis of experimental saccharification data and analysis of the glucose production levels when λ and n values change. In conclusion, the constructed Weibull statistics-based model can accurately predict lignocellulose hydrolysis behavior and we can use the λ parameter to assess the overall performance of enzymatic lignocellulose degradation. Advantages and potential applications of the model and the λ value in saccharification performance assessment are discussed. Copyright © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
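A minimal sketch of fitting a Weibull-type saccharification curve of the kind described above; the time-course data, the specific functional form Y(t) = Y_max(1 − exp(−(t/λ)^n)), and the starting guesses are illustrative assumptions rather than values from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

def weibull_conversion(t, ymax, lam, n):
    """Weibull-type saccharification curve: lam is the characteristic time,
    n the shape parameter, ymax the final conversion level."""
    return ymax * (1.0 - np.exp(-(t / lam) ** n))

# Hypothetical glucose-yield time course (h, % of theoretical maximum).
t = np.array([2, 4, 8, 12, 24, 48, 72], dtype=float)
y = np.array([8, 15, 27, 36, 55, 70, 76], dtype=float)

popt, _ = curve_fit(weibull_conversion, t, y, p0=[80.0, 20.0, 1.0])
ymax, lam, n = popt
print(f"ymax = {ymax:.1f} %, lambda = {lam:.1f} h, n = {n:.2f}")
```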
Seeking Temporal Predictability in Speech: Comparing Statistical Approaches on 18 World Languages
Jadoul, Yannick; Ravignani, Andrea; Thompson, Bill; Filippi, Piera; de Boer, Bart
2016-01-01
Temporal regularities in speech, such as interdependencies in the timing of speech events, are thought to scaffold early acquisition of the building blocks in speech. By providing on-line clues to the location and duration of upcoming syllables, temporal structure may aid segmentation and clustering of continuous speech into separable units. This hypothesis tacitly assumes that learners exploit predictability in the temporal structure of speech. Existing measures of speech timing tend to focus on first-order regularities among adjacent units, and are overly sensitive to idiosyncrasies in the data they describe. Here, we compare several statistical methods on a sample of 18 languages, testing whether syllable occurrence is predictable over time. Rather than looking for differences between languages, we aim to find across languages (using clearly defined acoustic, rather than orthographic, measures), temporal predictability in the speech signal which could be exploited by a language learner. First, we analyse distributional regularities using two novel techniques: a Bayesian ideal learner analysis, and a simple distributional measure. Second, we model higher-order temporal structure—regularities arising in an ordered series of syllable timings—testing the hypothesis that non-adjacent temporal structures may explain the gap between subjectively-perceived temporal regularities, and the absence of universally-accepted lower-order objective measures. Together, our analyses provide limited evidence for predictability at different time scales, though higher-order predictability is difficult to reliably infer. We conclude that temporal predictability in speech may well arise from a combination of individually weak perceptual cues at multiple structural levels, but is challenging to pinpoint. PMID:27994544
Radiative PQ breaking and the Higgs boson mass
NASA Astrophysics Data System (ADS)
D'Eramo, Francesco; Hall, Lawrence J.; Pappadopulo, Duccio
2015-06-01
The small and negative value of the Standard Model Higgs quartic coupling at high scales can be understood in terms of anthropic selection on a landscape where large and negative values are favored: most universes have a very short-lived electroweak vacuum and typical observers are in universes close to the corresponding metastability boundary. We provide a simple example of such a landscape with a Peccei-Quinn symmetry breaking scale generated through dimensional transmutation and supersymmetry softly broken at an intermediate scale. Large and negative contributions to the Higgs quartic are typically generated on integrating out the saxion field. Cancellations among these contributions are forced by the anthropic requirement of a sufficiently long-lived electroweak vacuum, determining the multiverse distribution for the Higgs quartic in a similar way to that of the cosmological constant. This leads to a statistical prediction of the Higgs boson mass that, for a wide range of parameters, yields the observed value within the 1σ statistical uncertainty of ˜ 5 GeV originating from the multiverse distribution. The strong CP problem is solved and single-component axion dark matter is predicted, with an abundance that can be understood from environmental selection. A more general setting for the Higgs mass prediction is discussed.
NASA Astrophysics Data System (ADS)
Olsen, S.; Zaliapin, I.
2008-12-01
We establish positive correlation between the local spatio-temporal fluctuations of the earthquake magnitude distribution and the occurrence of regional earthquakes. In order to accomplish this goal, we develop a sequential Bayesian statistical estimation framework for the b-value (slope of the Gutenberg-Richter's exponential approximation to the observed magnitude distribution) and for the ratio a(t) between the earthquake intensities in two non-overlapping magnitude intervals. The time-dependent dynamics of these parameters is analyzed using Markov Chain Models (MCM). The main advantage of this approach over the traditional window-based estimation is its "soft" parameterization, which allows one to obtain stable results with realistically small samples. We furthermore discuss a statistical methodology for establishing lagged correlations between continuous and point processes. The developed methods are applied to the observed seismicity of California, Nevada, and Japan on different temporal and spatial scales. We report an oscillatory dynamics of the estimated parameters, and find that the detected oscillations are positively correlated with the occurrence of large regional earthquakes, as well as with small events with magnitudes as low as 2.5. The reported results have important implications for further development of earthquake prediction and seismic hazard assessment methods.
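As a simplified, non-sequential counterpart to the Bayesian b-value estimation described above, the sketch below uses a conjugate gamma posterior for β = b·ln(10), assuming magnitudes above the completeness level are exponentially distributed; the prior parameters and the synthetic catalogue are hypothetical, and the sequential Markov Chain Model machinery of the study is not reproduced.

```python
import numpy as np
from scipy.stats import gamma

def b_value_posterior(magnitudes, m_c, a0=1.0, b0=1.0):
    """Conjugate gamma posterior for beta = b*ln(10), assuming magnitudes
    above the completeness level m_c are exponentially distributed.
    a0 and b0 are the (hypothetical) gamma prior shape and rate."""
    m = np.asarray(magnitudes)
    m = m[m >= m_c]
    n = m.size
    s = np.sum(m - m_c)
    post = gamma(a=a0 + n, scale=1.0 / (b0 + s))   # posterior over beta
    b_mean = post.mean() / np.log(10.0)
    b_lo, b_hi = post.ppf([0.025, 0.975]) / np.log(10.0)
    return b_mean, (b_lo, b_hi)

# Hypothetical catalogue: exponential magnitudes above m_c = 2.5 with b close to 1.
rng = np.random.default_rng(1)
mags = 2.5 + rng.exponential(scale=1.0 / np.log(10.0), size=500)
print(b_value_posterior(mags, m_c=2.5))
```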
Spectral unfolding of fast neutron energy distributions
NASA Astrophysics Data System (ADS)
Mosby, Michelle; Jackman, Kevin; Engle, Jonathan
2015-10-01
The characterization of the energy distribution of a neutron flux is difficult in experiments with constrained geometry where techniques such as time of flight cannot be used to resolve the distribution. The measurement of neutron fluxes in reactors, which often present similar challenges, has been accomplished using radioactivation foils as an indirect probe. Spectral unfolding codes use statistical methods to adjust MCNP predictions of neutron energy distributions using quantified radioactive residuals produced in these foils. We have applied a modification of this established neutron flux characterization technique to experimentally characterize the neutron flux in the critical assemblies at the Nevada National Security Site (NNSS) and the spallation neutron flux at the Isotope Production Facility (IPF) at Los Alamos National Laboratory (LANL). Results of the unfolding procedure are presented and compared with a priori MCNP predictions, and the implications for measurements using the neutron fluxes at these facilities are discussed.
NASA Astrophysics Data System (ADS)
Huang, Haiyun; Zhang, Junping; Li, Yonghe
2018-05-01
Under the weight charge policy, weigh-in-motion data at a toll station on the Jing-Zhu Expressway were collected and a statistical analysis of the vehicle load data was carried out. To calculate the operating vehicle load effects on bridges, the Monte Carlo method was used to generate random traffic flow and the influence line loading method was applied, from which the maximum bending moment effects of simply supported beams were obtained. The extreme value type I distribution and the normal distribution were used to model the distribution of the maximum bending moment effect. By extrapolation with the Rice formula and the extreme value type I distribution, the predicted values of the maximum load effects were obtained. By comparing these with the vehicle load effect according to the current specification, some references were provided for the management of operating vehicles and the revision of the bridge specifications.
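A brief sketch of the extreme value type I (Gumbel) step described above, applied to hypothetical annual maxima of a bending-moment effect; the distribution parameters, sample size, and return period are illustrative, and the Rice-formula extrapolation is not reproduced here.

```python
import numpy as np
from scipy.stats import gumbel_r

# Hypothetical annual maxima of the mid-span bending moment (kN*m) obtained
# from simulated random traffic flow.
annual_max = gumbel_r.rvs(loc=3200.0, scale=250.0, size=30, random_state=7)

loc, scale = gumbel_r.fit(annual_max)

# Extrapolate the characteristic load effect for a 100-year return period.
return_period = 100.0
m100 = gumbel_r.ppf(1.0 - 1.0 / return_period, loc=loc, scale=scale)
print(f"loc = {loc:.0f}, scale = {scale:.0f}, 100-year effect = {m100:.0f} kN*m")
```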
Pooler, P.S.; Smith, D.R.
2005-01-01
We compared the ability of simple random sampling (SRS) and a variety of systematic sampling (SYS) designs to estimate abundance, quantify spatial clustering, and predict spatial distribution of freshwater mussels. Sampling simulations were conducted using data obtained from a census of freshwater mussels in a 40 × 33 m section of the Cacapon River near Capon Bridge, West Virginia, and from a simulated spatially random population generated to have the same abundance as the real population. Sampling units that were 0.25 m² gave more accurate and precise abundance estimates and generally better spatial predictions than 1-m² sampling units. Systematic sampling with ≥2 random starts was more efficient than SRS. Estimates of abundance based on SYS were more accurate when the distance between sampling units across the stream was less than or equal to the distance between sampling units along the stream. Three measures for quantifying spatial clustering were examined: the Hopkins Statistic, the Clumping Index, and Morisita's Index. Morisita's Index was the most reliable, and the Hopkins Statistic was prone to false rejection of complete spatial randomness. SYS designs with units spaced equally across and up stream provided the most accurate predictions when estimating the spatial distribution by kriging. Our research indicates that SYS designs with sampling units equally spaced both across and along the stream would be appropriate for sampling freshwater mussels even if no information about the true underlying spatial distribution of the population were available to guide the design choice. © 2005 by The North American Benthological Society.
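A minimal sketch of Morisita's Index, the clustering measure found most reliable above, computed from hypothetical quadrat counts:

```python
import numpy as np

def morisita_index(counts):
    """Morisita's index of dispersion from quadrat counts:
    ~1 for spatial randomness, >1 for clustering, <1 for regularity."""
    n = np.asarray(counts, dtype=float)
    q = n.size
    total = n.sum()
    return q * np.sum(n * (n - 1.0)) / (total * (total - 1.0))

# Hypothetical mussel counts in 0.25-m^2 quadrats.
clustered = [0, 0, 9, 1, 0, 12, 0, 2, 0, 8]
random_like = [3, 4, 2, 3, 5, 3, 4, 2, 3, 3]
print(morisita_index(clustered), morisita_index(random_like))
```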
Yuan, Ke-Hai; Tian, Yubin; Yanagihara, Hirokazu
2015-06-01
Survey data typically contain many variables. Structural equation modeling (SEM) is commonly used in analyzing such data. The most widely used statistic for evaluating the adequacy of a SEM model is T_ML, a slight modification to the likelihood ratio statistic. Under the normality assumption, T_ML approximately follows a chi-square distribution when the number of observations (N) is large and the number of items or variables (p) is small. However, in practice, p can be rather large while N is always limited due to not having enough participants. Even with a relatively large N, empirical results show that T_ML rejects the correct model too often when p is not too small. Various corrections to T_ML have been proposed, but they are mostly heuristic. Following the principle of the Bartlett correction, this paper proposes an empirical approach to correct T_ML so that the mean of the resulting statistic approximately equals the degrees of freedom of the nominal chi-square distribution. Results show that empirically corrected statistics follow the nominal chi-square distribution much more closely than previously proposed corrections to T_ML, and they control type I errors reasonably well whenever N ≥ max(50, 2p). The formulations of the empirically corrected statistics are further used to predict type I errors of T_ML as reported in the literature, and they perform well.
Lin, Chih-Tin; Meyhofer, Edgar; Kurabayashi, Katsuo
2010-01-01
Directional control of microtubule shuttles via microfabricated tracks is key to the development of controlled nanoscale mass transport by kinesin motor molecules. Here we develop and test a model to quantitatively predict the stochastic guiding behavior of microtubules when they mechanically collide with the sidewalls of lithographically patterned tracks. By taking into account appropriate probability distributions of microscopic states of the microtubule system, the model allows us to theoretically analyze the roles of collision conditions and kinesin surface densities in determining how the motion of microtubule shuttles is controlled. In addition, we experimentally observe the statistics of microtubule collision events and compare our theoretical prediction with experimental data to validate our model. The model will direct the design of future hybrid nanotechnology devices that integrate nanoscale transport systems powered by kinesin-driven molecular shuttles.
Burst Statistics Using the Lag-Luminosity Relationship
NASA Technical Reports Server (NTRS)
Band, D. L.; Norris, J. P.; Bonnell, J. T.
2003-01-01
Using the lag-luminosity relation and various BATSE catalogs we create a large catalog of burst redshifts, peak luminosities and emitted energies. These catalogs permit us to evaluate the lag-luminosity relation, and to study the burst energy distribution. We find that this distribution can be described as a power law with an index of alpha = 1.76 +/- 0.05 (95% confidence), close to the alpha = 2 predicted by the original quasi-universal jet model.
SGR-like behaviour of the repeating FRB 121102
NASA Astrophysics Data System (ADS)
Wang, F. Y.; Yu, H.
2017-03-01
Fast radio bursts (FRBs) are millisecond-duration radio signals occurring at cosmological distances. However, the physical origin of FRBs remains a mystery, and many models have been proposed. Here we study the frequency distributions of peak flux, fluence, duration, and waiting time for the repeating FRB 121102. The cumulative distributions of peak flux, fluence, and duration show power-law forms. The waiting time distribution also shows a power-law form and is consistent with a non-stationary Poisson process. These distributions are similar to those of soft gamma repeaters (SGRs). We also use the statistical results to test the proposed models for FRBs. These distributions are consistent with the predictions from avalanche models of slowly driven nonlinear dissipative systems.
Optimal predictions in everyday cognition: the wisdom of individuals or crowds?
Mozer, Michael C; Pashler, Harold; Homaei, Hadjar
2008-10-01
Griffiths and Tenenbaum (2006) asked individuals to make predictions about the duration or extent of everyday events (e.g., cake baking times), and reported that predictions were optimal, employing Bayesian inference based on veridical prior distributions. Although the predictions conformed strikingly to statistics of the world, they reflect averages over many individuals. On the conjecture that the accuracy of the group response is chiefly a consequence of aggregating across individuals, we constructed simple, heuristic approximations to the Bayesian model premised on the hypothesis that individuals have access merely to a sample of k instances drawn from the relevant distribution. The accuracy of the group response reported by Griffiths and Tenenbaum could be accounted for by supposing that individuals each utilize only two instances. Moreover, the variability of the group data is more consistent with this small-sample hypothesis than with the hypothesis that people utilize veridical or nearly veridical representations of the underlying prior distributions. Our analyses lead to a qualitatively different view of how individuals reason from past experience than the view espoused by Griffiths and Tenenbaum. 2008 Cognitive Science Society, Inc.
Luo, Mei; Wang, Hao; Lyu, Zhi
2017-12-01
Species distribution models (SDMs) are widely used by researchers and conservationists. Results of predictions from different models vary significantly, which makes it difficult for users to select a model. In this study, we evaluated the performance of two commonly used SDMs, Biomod2 and Maximum Entropy (MaxEnt), with real presence/absence data of the giant panda, and used three indicators, i.e., area under the ROC curve (AUC), true skill statistic (TSS), and Cohen's Kappa, to evaluate the accuracy of the two models' predictions. The results showed that both models could produce accurate predictions with adequate occurrence inputs and simulation repeats. Compared to MaxEnt, Biomod2 made more accurate predictions, especially when occurrence inputs were few. However, Biomod2 was more difficult to apply, required longer running time, and had less data processing capability. To choose the right model, users should refer to the error requirements of their objectives: MaxEnt should be considered if the error requirement is clear and can be met by both models; otherwise, we recommend using Biomod2 as much as possible.
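A short sketch of the three accuracy indicators mentioned above (AUC, TSS, Cohen's Kappa), computed with scikit-learn on hypothetical presence/absence data and predicted probabilities; the 0.5 presence threshold is an illustrative choice.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, cohen_kappa_score, confusion_matrix

# Hypothetical presence/absence observations and model-predicted probabilities.
y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1])
y_prob = np.array([0.9, 0.7, 0.3, 0.2, 0.6, 0.4, 0.8, 0.1, 0.5, 0.7, 0.2, 0.6])
y_pred = (y_prob >= 0.5).astype(int)

auc = roc_auc_score(y_true, y_prob)
kappa = cohen_kappa_score(y_true, y_pred)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
tss = sensitivity + specificity - 1.0   # true skill statistic

print(f"AUC = {auc:.2f}, TSS = {tss:.2f}, Kappa = {kappa:.2f}")
```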
Highway runoff quality models for the protection of environmentally sensitive areas
NASA Astrophysics Data System (ADS)
Trenouth, William R.; Gharabaghi, Bahram
2016-11-01
This paper presents novel highway runoff quality models using artificial neural networks (ANN) which take into account site-specific highway traffic and seasonal storm event meteorological factors to predict the event mean concentration (EMC) statistics and mean daily unit area load (MDUAL) statistics of common highway pollutants for the design of roadside ditch treatment systems (RDTS) to protect sensitive receiving environs. A dataset of 940 monitored highway runoff events from fourteen sites located in five countries (Canada, USA, Australia, New Zealand, and China) was compiled and used to develop ANN models for the prediction of highway runoff suspended solids (TSS) seasonal EMC statistical distribution parameters, as well as the MDUAL statistics for four different heavy metal species (Cu, Zn, Cr and Pb). TSS EMCs are needed to estimate the minimum required removal efficiency of the RDTS needed in order to improve highway runoff quality to meet applicable standards and MDUALs are needed to calculate the minimum required capacity of the RDTS to ensure performance longevity.
DEVELOPMENT OF RESIDENTIAL WOOD CONSUMPTION ESTIMATION MODELS
The report gives data on the distribution and usage of firewood, obtained from a pool of household wood use surveys. Based on a series of regression models developed using the STEPWISE procedure in the SAS statistical package, two variables appear to be most predictive of wood use...
Statistics of particle time-temperature histories.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hewson, John C.; Lignell, David O.; Sun, Guangyuan
2014-10-01
Particles in non-isothermal turbulent flow are subject to a stochastic environment that produces a distribution of particle time-temperature histories. This distribution is a function of the dispersion of the non-isothermal (continuous) gas phase and the distribution of particles relative to that gas phase. In this work we extend the one-dimensional turbulence (ODT) model to predict the joint dispersion of a dispersed particle phase and a continuous phase. The ODT model predicts the turbulent evolution of continuous scalar fields with a model for the cascade of fluctuations to smaller scales (the 'triplet map') at a rate that is a function of the fully resolved one-dimensional velocity field. Stochastic triplet maps also drive Lagrangian particle dispersion with finite Stokes numbers, including inertial and eddy trajectory-crossing effects. Two distinct approaches to this coupling between triplet maps and particle dispersion are developed and implemented, along with a hybrid approach. An 'instantaneous' particle displacement model matches the tracer particle limit and provides an accurate description of particle dispersion. A 'continuous' particle displacement model translates triplet maps into a continuous velocity field to which particles respond. Particles can alter the turbulence, and modifications to the stochastic rate expression are developed for two-way coupling between particles and the continuous phase. Each aspect of model development is evaluated in canonical flows (homogeneous turbulence, free-shear flows and wall-bounded flows) for which quality measurements are available. ODT simulations of non-isothermal flows provide statistics for particle heating. These simulations show the significance of accurately predicting the joint statistics of particle and fluid dispersion. Inhomogeneous turbulence coupled with the influence of the mean flow fields on particles of varying properties alters particle dispersion. The joint particle-temperature dispersion leads to a distribution of temperature histories predicted by the ODT. Predictions are shown for the lower moments and the full distributions of the particle positions, particle-observed gas temperatures and particle temperatures. An analysis of the time scales affecting particle-temperature interactions covers Lagrangian integral time scales based on temperature autocorrelations, rates of temperature change associated with particle motion relative to the temperature field and rates of diffusional change of temperatures. These latter two time scales have not been investigated previously; they are shown to be strongly intermittent, having peaked distributions with long tails. The logarithm of the absolute value of these time scales exhibits a distribution closer to normal. Acknowledgements: This work is supported by the Defense Threat Reduction Agency (DTRA) under their Counter-Weapons of Mass Destruction Basic Research Program in the area of Chemical and Biological Agent Defeat under award number HDTRA1-11-4503I to Sandia National Laboratories. The authors would like to express their appreciation for the guidance provided by Dr. Suhithi Peiris to this project and to the Science to Defeat Weapons of Mass Destruction program.
Near-surface wind speed statistical distribution: comparison between ECMWF System 4 and ERA-Interim
NASA Astrophysics Data System (ADS)
Marcos, Raül; Gonzalez-Reviriego, Nube; Torralba, Verónica; Cortesi, Nicola; Young, Doo; Doblas-Reyes, Francisco J.
2017-04-01
In the framework of seasonal forecast verification, knowing whether the characteristics of the climatological wind speed distribution simulated by the forecasting systems are similar to the observed ones is essential to guide the subsequent process of bias adjustment. To shed some light on this topic, this work assesses the properties of the statistical distributions of 10 m wind speed from both the ERA-Interim reanalysis and seasonal forecasts of ECMWF System 4. The 10 m wind speed distribution has been characterized in terms of the four main moments of the probability distribution (mean, standard deviation, skewness and kurtosis) together with the coefficient of variation and the Shapiro-Wilk goodness-of-fit test, allowing the identification of regions with higher wind variability and non-Gaussian behaviour at monthly time-scales. Also, the comparison of the predicted and observed 10 m wind speed distributions has been performed considering both inter-annual and intra-seasonal variability. Such a comparison is important in both the climate research and climate services communities because it provides useful climate information for decision-making processes and wind industry applications.
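A minimal sketch of the distributional characterization described above (four moments, coefficient of variation, Shapiro-Wilk test), applied to a hypothetical wind speed sample; the Weibull parameters used to generate the sample are illustrative only.

```python
import numpy as np
from scipy import stats

# Hypothetical monthly series of daily-mean 10 m wind speed (m/s) at one grid point.
wind = stats.weibull_min.rvs(2.0, scale=7.0, size=30, random_state=3)

mean = wind.mean()
std = wind.std(ddof=1)
skew = stats.skew(wind)
kurt = stats.kurtosis(wind)            # excess kurtosis
cv = std / mean                        # coefficient of variation
w_stat, p_value = stats.shapiro(wind)  # Shapiro-Wilk normality test

print(f"mean={mean:.2f} sd={std:.2f} skew={skew:.2f} kurt={kurt:.2f} cv={cv:.2f}")
print(f"Shapiro-Wilk p-value: {p_value:.3f} (small p suggests non-Gaussian behaviour)")
```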
Detecting failure of climate predictions
Runge, Michael C.; Stroeve, Julienne C.; Barrett, Andrew P.; McDonald-Madden, Eve
2016-01-01
The practical consequences of climate change challenge society to formulate responses that are more suited to achieving long-term objectives, even if those responses have to be made in the face of uncertainty [1, 2]. Such a decision-analytic focus uses the products of climate science as probabilistic predictions about the effects of management policies [3]. Here we present methods to detect when climate predictions are failing to capture the system dynamics. For a single model, we measure goodness of fit based on the empirical distribution function, and define failure when the distribution of observed values significantly diverges from the modelled distribution. For a set of models, the same statistic can be used to provide relative weights for the individual models, and we define failure when there is no linear weighting of the ensemble models that produces a satisfactory match to the observations. Early detection of failure of a set of predictions is important for improving model predictions and the decisions based on them. We show that these methods would have detected a range shift in northern pintail 20 years before it was actually discovered, and are increasingly giving more weight to those climate models that forecast a September ice-free Arctic by 2055.
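As a simplified stand-in for the single-model failure criterion described above, the sketch below compares the empirical distribution function of hypothetical observations with a modelled CDF using a one-sample Kolmogorov-Smirnov test; the modelled distribution, the observed sample, and the significance threshold are illustrative assumptions, not those of the study.

```python
import numpy as np
from scipy import stats

# Modelled (predicted) distribution of, e.g., standardized sea-ice extent
# anomalies, here taken as a standard normal with hypothetical parameters.
model = stats.norm(loc=0.0, scale=1.0)

# Hypothetical observations that have drifted away from the prediction.
rng = np.random.default_rng(11)
observed = rng.normal(loc=0.8, scale=1.0, size=25)

# One-sample Kolmogorov-Smirnov test: compares the empirical distribution
# function of the observations with the modelled CDF.
ks_stat, p_value = stats.kstest(observed, model.cdf)
if p_value < 0.05:
    print(f"Prediction flagged as failing (D={ks_stat:.2f}, p={p_value:.3f})")
else:
    print(f"No significant divergence detected (p={p_value:.3f})")
```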
Aalto, Juha; Harrison, Stephan; Luoto, Miska
2017-09-11
The periglacial realm is a major part of the cryosphere, covering a quarter of Earth's land surface. Cryogenic land surface processes (LSPs) control landscape development, ecosystem functioning and climate through biogeochemical feedbacks, but their response to contemporary climate change is unclear. Here, by statistically modelling the current and future distributions of four major LSPs unique to periglacial regions at fine scale, we show fundamental changes in the periglacial climate realm are inevitable with future climate change. Even with the most optimistic CO2 emissions scenario (Representative Concentration Pathway (RCP) 2.6) we predict a 72% reduction in the current periglacial climate realm by 2050 in our climatically sensitive northern Europe study area. These impacts are projected to be especially severe in high-latitude continental interiors. We further predict that by the end of the twenty-first century active periglacial LSPs will exist only at high elevations. These results forecast a future tipping point in the operation of cold-region LSPs, and predict fundamental landscape-level modifications in ground conditions and related atmospheric feedbacks. Cryogenic land surface processes characterise the periglacial realm and control landscape development and ecosystem functioning. Here, via statistical modelling, the authors predict a 72% reduction of the periglacial realm in Northern Europe by 2050, and almost complete disappearance by 2100.
A system for learning statistical motion patterns.
Hu, Weiming; Xiao, Xuejuan; Fu, Zhouyu; Xie, Dan; Tan, Tieniu; Maybank, Steve
2006-09-01
Analysis of motion patterns is an effective approach for anomaly detection and behavior prediction. Current approaches for the analysis of motion patterns depend on known scenes, where objects move in predefined ways. It is highly desirable to automatically construct object motion patterns which reflect the knowledge of the scene. In this paper, we present a system for automatically learning motion patterns for anomaly detection and behavior prediction based on a proposed algorithm for robustly tracking multiple objects. In the tracking algorithm, foreground pixels are clustered using a fast accurate fuzzy K-means algorithm. Growing and prediction of the cluster centroids of foreground pixels ensure that each cluster centroid is associated with a moving object in the scene. In the algorithm for learning motion patterns, trajectories are clustered hierarchically using spatial and temporal information and then each motion pattern is represented with a chain of Gaussian distributions. Based on the learned statistical motion patterns, statistical methods are used to detect anomalies and predict behaviors. Our system is tested using image sequences acquired, respectively, from a crowded real traffic scene and a model traffic scene. Experimental results show the robustness of the tracking algorithm, the efficiency of the algorithm for learning motion patterns, and the encouraging performance of algorithms for anomaly detection and behavior prediction.
On Acoustic Source Specification for Rotor-Stator Interaction Noise Prediction
NASA Technical Reports Server (NTRS)
Nark, Douglas M.; Envia, Edmane; Burley, Caesy L.
2010-01-01
This paper describes the use of measured source data to assess the effects of acoustic source specification on rotor-stator interaction noise predictions. Specifically, the acoustic propagation and radiation portions of a recently developed coupled computational approach are used to predict tonal rotor-stator interaction noise from a benchmark configuration. In addition to the use of full measured data, randomization of source mode relative phases is also considered for specification of the acoustic source within the computational approach. Comparisons with sideline noise measurements are performed to investigate the effects of various source descriptions on both inlet and exhaust predictions. The inclusion of additional modal source content is shown to have a much greater influence on the inlet results. Reasonable agreement between predicted and measured levels is achieved for the inlet, as well as the exhaust when shear layer effects are taken into account. For the number of trials considered, phase randomized predictions follow statistical distributions similar to those found in previous statistical source investigations. The shape of the predicted directivity pattern relative to measurements also improved with phase randomization, having predicted levels generally within one standard deviation of the measured levels.
Statistical Maps of Ground Magnetic Disturbance Derived from Global Geospace Models
NASA Astrophysics Data System (ADS)
Rigler, E. J.; Wiltberger, M. J.; Love, J. J.
2017-12-01
Electric currents in space are the principal driver of magnetic variations measured at Earth's surface. These in turn induce geoelectric fields that present a natural hazard for technological systems like high-voltage power distribution networks. Modern global geospace models can reasonably simulate large-scale geomagnetic response to solar wind variations, but they are less successful at deterministic predictions of intense localized geomagnetic activity that most impacts technological systems on the ground. Still, recent studies have shown that these models can accurately reproduce the spatial statistical distributions of geomagnetic activity, suggesting that their physics are largely correct. Since the magnetosphere is a largely externally driven system, most model-measurement discrepancies probably arise from uncertain boundary conditions. So, with realistic distributions of solar wind parameters to establish its boundary conditions, we use the Lyon-Fedder-Mobarry (LFM) geospace model to build a synthetic multivariate statistical model of gridded ground magnetic disturbance. From this, we analyze the spatial modes of geomagnetic response, regress on available measurements to fill in unsampled locations on the grid, and estimate the global probability distribution of extreme magnetic disturbance. The latter offers a prototype geomagnetic "hazard map", similar to those used to characterize better-known geophysical hazards like earthquakes and floods.
NASA Astrophysics Data System (ADS)
Massoudieh, A.; Dentz, M.; Le Borgne, T.
2017-12-01
In heterogeneous media, the velocity distribution and the spatial correlation structure of velocity for solute particles determine the breakthrough curves and how they evolve as one moves away from the solute source. The ability to predict such evolution can help relating the spatio-statistical hydraulic properties of the media to the transport behavior and travel time distributions. While commonly used non-local transport models such as anomalous dispersion and classical continuous time random walk (CTRW) can reproduce breakthrough curve successfully by adjusting the model parameter values, they lack the ability to relate model parameters to the spatio-statistical properties of the media. This in turns limits the transferability of these models. In the research to be presented, we express concentration or flux of solutes as a distribution over their velocity. We then derive an integrodifferential equation that governs the evolution of the particle distribution over velocity at given times and locations for a particle ensemble, based on a presumed velocity correlation structure and an ergodic cross-sectional velocity distribution. This way, the spatial evolution of breakthrough curves away from the source is predicted based on cross-sectional velocity distribution and the connectivity, which is expressed by the velocity transition probability density. The transition probability is specified via a copula function that can help construct a joint distribution with a given correlation and given marginal velocities. Using this approach, we analyze the breakthrough curves depending on the velocity distribution and correlation properties. The model shows how the solute transport behavior evolves from ballistic transport at small spatial scales to Fickian dispersion at large length scales relative to the velocity correlation length.
Baldi, Pierre
2010-01-01
As repositories of chemical molecules continue to expand and become more open, it becomes increasingly important to develop tools to search them efficiently and assess the statistical significance of chemical similarity scores. Here we develop a general framework for understanding, modeling, predicting, and approximating the distribution of chemical similarity scores and its extreme values in large databases. The framework can be applied to different chemical representations and similarity measures but is demonstrated here using the most common binary fingerprints with the Tanimoto similarity measure. After introducing several probabilistic models of fingerprints, including the Conditional Gaussian Uniform model, we show that the distribution of Tanimoto scores can be approximated by the distribution of the ratio of two correlated Normal random variables associated with the corresponding unions and intersections. This remains true also when the distribution of similarity scores is conditioned on the size of the query molecules in order to derive more fine-grained results and improve chemical retrieval. The corresponding extreme value distributions for the maximum scores are approximated by Weibull distributions. From these various distributions and their analytical forms, Z-scores, E-values, and p-values are derived to assess the significance of similarity scores. In addition, the framework allows one to predict also the value of standard chemical retrieval metrics, such as Sensitivity and Specificity at fixed thresholds, or ROC (Receiver Operating Characteristic) curves at multiple thresholds, and to detect outliers in the form of atypical molecules. Numerous and diverse experiments carried in part with large sets of molecules from the ChemDB show remarkable agreement between theory and empirical results. PMID:20540577
A meta-analysis and statistical modelling of nitrates in groundwater at the African scale
NASA Astrophysics Data System (ADS)
Ouedraogo, Issoufou; Vanclooster, Marnik
2016-06-01
Contamination of groundwater with nitrate poses a major health risk to millions of people around Africa. Assessing the space-time distribution of this contamination, as well as understanding the factors that explain this contamination, is important for managing sustainable drinking water at the regional scale. This study aims to assess the variables that contribute to nitrate pollution in groundwater at the African scale by statistical modelling. We compiled a literature database of nitrate concentration in groundwater (around 250 studies) and combined it with digital maps of physical attributes such as soil, geology, climate, hydrogeology, and anthropogenic data for statistical model development. The maximum, mean, and minimum observed nitrate concentrations were analysed. In total, 13 explanatory variables were screened to explain observed nitrate pollution in groundwater. For the mean nitrate concentration, four variables are retained in the statistical explanatory model: (1) depth to groundwater (shallow groundwater, typically < 50 m); (2) recharge rate; (3) aquifer type; and (4) population density. The first three variables represent the intrinsic vulnerability of groundwater systems to pollution, while the latter variable is a proxy for anthropogenic pollution pressure. The model explains 65 % of the variation of mean nitrate contamination in groundwater at the African scale. Using the same proxy information, we could develop a statistical model for the maximum nitrate concentrations that explains 42 % of the nitrate variation. For the maximum concentrations, other environmental attributes such as soil type, slope, rainfall, climate class, and region type improve the prediction of maximum nitrate concentrations at the African scale. For the minimum nitrate concentrations, the data do not satisfy normal distribution assumptions, so we do not develop a statistical model for these data. The data-based statistical model presented here represents an important step towards developing tools that will allow us to accurately predict nitrate distribution at the African scale and thus may support groundwater monitoring and water management that aims to protect groundwater systems. Yet it should be further refined and validated when more detailed and harmonized data become available and/or combined with more conceptual descriptions of the fate of nutrients in the hydrosystem.
Schulz, W.H.; Lidke, D.J.; Godt, J.W.
2008-01-01
Landslides in partially saturated colluvium on Seattle, WA, hillslopes have resulted in property damage and human casualties. We developed statistical models of colluvium and shallow-groundwater distributions to aid landslide hazard assessments. The models were developed using a geographic information system, digital geologic maps, digital topography, subsurface exploration results, the groundwater flow modeling software VS2DI and regression analyses. Input to the colluvium model includes slope, distance to a hillslope-crest escarpment, and escarpment slope and height. We developed different statistical relations for thickness of colluvium on four landforms. Groundwater model input includes colluvium basal slope and distance from the Fraser aquifer. This distance was used to estimate hydraulic conductivity based on the assumption that addition of finer-grained material from down-section would result in lower conductivity. Colluvial groundwater is perched so we estimated its saturated thickness. We used VS2DI to establish relations between saturated thickness and the hydraulic conductivity and basal slope of the colluvium. We developed different statistical relations for three groundwater flow regimes. All model results were validated using observational data that were excluded from calibration. Eighty percent of colluvium thickness predictions were within 25% of observed values and 88% of saturated thickness predictions were within 20% of observed values. The models are based on conditions common to many areas, so our method can provide accurate results for similar regions; relations in our statistical models require calibration for new regions. Our results suggest that Seattle landslides occur in native deposits and colluvium, ultimately in response to surface-water erosion of hillslope toes. Regional groundwater conditions do not appear to strongly affect the general distribution of Seattle landslides; historical landslides were equally dispersed within and outside of the area potentially affected by regional groundwater conditions.
Using System Dynamic Model and Neural Network Model to Analyse Water Scarcity in Sudan
NASA Astrophysics Data System (ADS)
Li, Y.; Tang, C.; Xu, L.; Ye, S.
2017-07-01
Many parts of the world are facing the problem of water scarcity. Analysing water scarcity quantitatively is an important step towards solving the problem. Water scarcity in a region is gauged by the WSI (water scarcity index), which incorporates water supply and water demand. To obtain the WSI, a Neural Network Model and an SDM (System Dynamic Model) are developed to depict how environmental and social factors affect water supply and demand. The uneven distribution of water resources and water demand across a region leads to an uneven distribution of WSI within this region. To predict the WSI for the future, a logistic model, Grey Prediction, and statistics are applied to predict the variables. Sudan suffers from a severe water scarcity problem, with a WSI of 1 in 2014 and unevenly distributed water resources. According to the results of the modified model, after the intervention, Sudan's water situation will become better.
Maximum Entropy, Word-Frequency, Chinese Characters, and Multiple Meanings
Yan, Xiaoyong; Minnhagen, Petter
2015-01-01
The word-frequency distribution of a text written by an author is well accounted for by a maximum entropy distribution, the RGF (random group formation)-prediction. The RGF-distribution is completely determined by the a priori values of the total number of words in the text (M), the number of distinct words (N) and the number of repetitions of the most common word (kmax). It is here shown that this maximum entropy prediction also describes a text written in Chinese characters. In particular it is shown that although the same Chinese text written in words and Chinese characters have quite differently shaped distributions, they are nevertheless both well predicted by their respective three a priori characteristic values. It is pointed out that this is analogous to the change in the shape of the distribution when translating a given text to another language. Another consequence of the RGF-prediction is that taking a part of a long text will change the input parameters (M, N, kmax) and consequently also the shape of the frequency distribution. This is explicitly confirmed for texts written in Chinese characters. Since the RGF-prediction has no system-specific information beyond the three a priori values (M, N, kmax), any specific language characteristic has to be sought in systematic deviations from the RGF-prediction and the measured frequencies. One such systematic deviation is identified and, through a statistical information theoretical argument and an extended RGF-model, it is proposed that this deviation is caused by multiple meanings of Chinese characters. The effect is stronger for Chinese characters than for Chinese words. The relation between Zipf’s law, the Simon-model for texts and the present results are discussed. PMID:25955175
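A short sketch of how the three a priori values (M, N, kmax) that determine the RGF prediction can be computed for any tokenized text; the snippet used here is a stand-in, not one of the texts analysed in the study.

```python
from collections import Counter

# Hypothetical snippet standing in for a full text; the three a priori values
# that determine the RGF prediction are computed the same way for any corpus.
text = "the quick brown fox jumps over the lazy dog the fox".split()
counts = Counter(text)

M = sum(counts.values())              # total number of words in the text
N = len(counts)                       # number of distinct words
k_max = counts.most_common(1)[0][1]   # repetitions of the most common word
print(M, N, k_max)
```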
Financial Stylized Facts in the Word of Mouth Model
NASA Astrophysics Data System (ADS)
Misawa, Tadanobu; Watanabe, Kyoko; Shimokawa, Tetsuya
Recently, we proposed an agent-based model called the word of mouth model to analyze the influence of an information transmission process on price formation in financial markets. In particular, the short-term predictability of asset returns was focused on, and an explanation from the viewpoint of information transmission was provided for the question of why the predictability was much more clearly observed in small-sized stocks. This paper, extending the previous study, demonstrates that the word of mouth model is also consistent with other important financial stylized facts. This strengthens the possibility that information transmission among investors plays a crucial role in price formation. Concretely, this paper addresses two famous statistical features of returns: the leptokurtic distribution of returns and the autocorrelation of return volatility. The reasons why these statistical facts receive special attention from researchers among financial stylized facts are their statistical robustness and practical importance, such as applications to derivative pricing problems.
NASA Astrophysics Data System (ADS)
Kovalevsky, Louis; Langley, Robin S.; Caro, Stephane
2016-05-01
Due to the high cost of experimental EMI measurements, significant attention has been focused on numerical simulation. Classical methods such as the Method of Moments or Finite Difference Time Domain are not well suited for this type of problem, as they require a fine discretisation of space and fail to take uncertainties into account. In this paper, the authors show that Statistical Energy Analysis (SEA) is well suited for this type of application. The SEA is a statistical approach employed to solve high-frequency problems of electromagnetically reverberant cavities at a reduced computational cost. The key aspects of this approach are (i) to consider an ensemble of systems that share the same gross parameters, and (ii) to avoid solving Maxwell's equations inside the cavity, using the power balance principle. The output is an estimate of the field magnitude distribution in each cavity. The method is applied to a typical aircraft structure.
Parameter Estimation in Astronomy with Poisson-Distributed Data. I. The χ²_γ Statistic
NASA Technical Reports Server (NTRS)
Mighell, Kenneth J.
1999-01-01
Applying the standard weighted mean formula, [Σ_i n_i σ_i^{-2}] / [Σ_i σ_i^{-2}], to determine the weighted mean of data, n_i, drawn from a Poisson distribution will, on average, underestimate the true mean by ~1 for all true mean values larger than ~3 when the common assumption is made that the error of the i-th observation is σ_i = max(√n_i, 1). This small, but statistically significant, offset explains the long-known observation that chi-square minimization techniques which use the modified Neyman's χ² statistic, χ²_N ≡ Σ_i (n_i - y_i)² / max(n_i, 1), to compare Poisson-distributed data with model values, y_i, will typically predict a total number of counts that underestimates the true total by about 1 count per bin. Based on my finding that the weighted mean of data drawn from a Poisson distribution can be determined using the formula [Σ_i (n_i + min(n_i, 1)) (n_i + 1)^{-1}] / [Σ_i (n_i + 1)^{-1}], I propose that a new χ² statistic, χ²_γ ≡ Σ_i (n_i + min(n_i, 1) - y_i)² / (n_i + 1), should always be used to analyze Poisson-distributed data in preference to the modified Neyman's χ² statistic. I demonstrate the power and usefulness of χ²_γ minimization by using two statistical fitting techniques and five χ² statistics to analyze simulated X-ray power-law 15-channel spectra with large and small numbers of counts per bin. I show that χ²_γ minimization with the Levenberg-Marquardt or Powell's method can produce excellent results (mean slope errors ≲ 3%) with spectra having as few as 25 total counts.
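A minimal numerical sketch of the two statistics defined above, comparing the constant level that minimizes the modified Neyman's χ²_N with the one that minimizes χ²_γ for simulated low-count Poisson data; the true mean, sample size, and search grid are illustrative choices.

```python
import numpy as np

def chi2_neyman(n, y):
    """Modified Neyman's chi-square for Poisson counts n and model value y."""
    n = np.asarray(n, dtype=float)
    return np.sum((n - y) ** 2 / np.maximum(n, 1.0))

def chi2_gamma(n, y):
    """Mighell's chi-square-gamma statistic for Poisson counts n and model value y."""
    n = np.asarray(n, dtype=float)
    return np.sum((n + np.minimum(n, 1.0) - y) ** 2 / (n + 1.0))

# Compare the fitted constant level that minimizes each statistic for
# low-count Poisson data with a true mean of 5 counts per bin.
rng = np.random.default_rng(0)
counts = rng.poisson(5.0, size=10_000)
levels = np.linspace(3.0, 7.0, 401)
best_neyman = levels[np.argmin([chi2_neyman(counts, y) for y in levels])]
best_gamma = levels[np.argmin([chi2_gamma(counts, y) for y in levels])]
print(f"true mean 5.0 | Neyman fit {best_neyman:.2f} | chi2_gamma fit {best_gamma:.2f}")
```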
NASA Astrophysics Data System (ADS)
Luke, Adam; Vrugt, Jasper A.; AghaKouchak, Amir; Matthew, Richard; Sanders, Brett F.
2017-07-01
Nonstationary extreme value analysis (NEVA) can improve the statistical representation of observed flood peak distributions compared to stationary (ST) analysis, but management of flood risk relies on predictions of out-of-sample distributions for which NEVA has not been comprehensively evaluated. In this study, we apply split-sample testing to 1250 annual maximum discharge records in the United States and compare the predictive capabilities of NEVA relative to ST extreme value analysis using a log-Pearson Type III (LPIII) distribution. The parameters of the LPIII distribution in the ST and nonstationary (NS) models are estimated from the first half of each record using Bayesian inference. The second half of each record is reserved to evaluate the predictions under the ST and NS models. The NS model is applied for prediction by (1) extrapolating the trend of the NS model parameters throughout the evaluation period and (2) using the NS model parameter values at the end of the fitting period to predict with an updated ST model (uST). Our analysis shows that the ST predictions are preferred, overall. NS model parameter extrapolation is rarely preferred. However, if fitting period discharges are influenced by physical changes in the watershed, for example from anthropogenic activity, the uST model is strongly preferred relative to ST and NS predictions. The uST model is therefore recommended for evaluation of current flood risk in watersheds that have undergone physical changes. Supporting information includes a MATLAB® program that estimates the (ST/NS/uST) LPIII parameters from annual peak discharge data through Bayesian inference.
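A brief sketch of a stationary log-Pearson Type III fit of the kind used above, here by the method of moments applied to log-transformed hypothetical annual peaks rather than the Bayesian inference used in the study; the synthetic record and return period are illustrative.

```python
import numpy as np
from scipy import stats

# Hypothetical annual peak discharges (m^3/s) from the fitting half of a record.
rng = np.random.default_rng(5)
peaks = rng.lognormal(mean=5.5, sigma=0.5, size=40)

# Stationary log-Pearson Type III fit: moments of log10(discharge) give the
# location (mean), scale (standard deviation) and shape (skew) parameters.
log_q = np.log10(peaks)
mu, sigma, g = log_q.mean(), log_q.std(ddof=1), stats.skew(log_q, bias=False)

# 100-year flood quantile (99th percentile of the annual-maximum distribution).
q100 = 10.0 ** stats.pearson3.ppf(0.99, g, loc=mu, scale=sigma)
print(f"estimated 100-year peak discharge: {q100:.0f} m^3/s")
```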
Distributional Effects and Individual Differences in L2 Morphology Learning
ERIC Educational Resources Information Center
Brooks, Patricia J.; Kwoka, Nicole; Kempe, Vera
2017-01-01
Second language (L2) learning outcomes may depend on the structure of the input and learners' cognitive abilities. This study tested whether less predictable input might facilitate learning and generalization of L2 morphology while evaluating contributions of statistical learning ability, nonverbal intelligence, phonological short-term memory, and…
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Parameter inference is a fundamental problem in data-driven modeling. Given observed data that is believed to be a realization of some parameterized model, the aim is to find parameter values that are able to explain the observed data. In many situations, the dominant sources of uncertainty must be included into the model for making reliable predictions. This naturally leads to stochastic models. Stochastic models render parameter inference much harder, as the aim then is to find a distribution of likely parameter values. In Bayesian statistics, which is a consistent framework for data-driven learning, this so-called posterior distribution can be used to make probabilistic predictions. We propose a novel, exact, and very efficient approach for generating posterior parameter distributions for stochastic differential equation models calibrated to measured time series. The algorithm is inspired by reinterpreting the posterior distribution as a statistical mechanics partition function of an object akin to a polymer, where the measurements are mapped on heavier beads compared to those of the simulated data. To arrive at distribution samples, we employ a Hamiltonian Monte Carlo approach combined with a multiple time-scale integration. A separation of time scales naturally arises if either the number of measurement points or the number of simulation points becomes large. Furthermore, at least for one-dimensional problems, we can decouple the harmonic modes between measurement points and solve the fastest part of their dynamics analytically. Our approach is applicable to a wide range of inference problems and is highly parallelizable.
Jet Noise Diagnostics Supporting Statistical Noise Prediction Methods
NASA Technical Reports Server (NTRS)
Bridges, James E.
2006-01-01
The primary focus of my presentation is the development of the jet noise prediction code JeNo, with most examples coming from the experimental work that drove the theoretical development and validation. JeNo is a statistical jet noise prediction code, based upon the Lilley acoustic analogy. Our approach uses time-averaged 2-D or 3-D mean and turbulent statistics of the flow as input. The output is source distributions and spectral directivity. NASA has been investing in the development of statistical jet noise prediction tools because these seem to fit the middle ground that allows enough flexibility and fidelity for jet noise source diagnostics while having reasonable computational requirements. These tools rely on Reynolds-averaged Navier-Stokes (RANS) computational fluid dynamics (CFD) solutions as input for computing far-field spectral directivity using an acoustic analogy. There are many ways acoustic analogies can be created, each with a series of assumptions and models, many often taken unknowingly. And the resulting prediction can be easily reverse-engineered by altering the models contained within. However, only an approach which is mathematically sound, with assumptions validated and modeled quantities checked against direct measurement, will give consistently correct answers. Many quantities are modeled in acoustic analogies precisely because they have been impossible to measure or calculate, making this requirement a difficult task. The NASA team has spent considerable effort identifying all the assumptions and models used to take the Navier-Stokes equations to the point of a statistical calculation via an acoustic analogy very similar to that proposed by Lilley. Assumptions have been identified and experiments have been developed to test these assumptions. In some cases this has resulted in assumptions being changed. Beginning with the CFD used as input to the acoustic analogy, models for turbulence closure used in RANS CFD codes have been explored and compared against measurements of mean and rms velocity statistics over a range of jet speeds and temperatures. Models for flow parameters used in the acoustic analogy, most notably the space-time correlations of velocity, have been compared against direct measurements, and modified to better fit the observed data. These measurements have been extremely challenging for hot, high speed jets, and represent a sizeable investment in instrumentation development. As an intermediate check that the analysis is predicting the physics intended, phased arrays have been employed to measure source distributions for a wide range of jet cases. And finally, careful far-field spectral directivity measurements have been taken for final validation of the prediction code. Examples of each of these experimental efforts will be presented. The main result of these efforts is a noise prediction code, named JeNo, which is in mid-development. JeNo is able to consistently predict spectral directivity, including aft angle directivity, for subsonic cold jets of most geometries. Current development on JeNo is focused on extending its capability to hot jets, requiring inclusion of a previously neglected second source associated with thermal fluctuations. A secondary result of the intensive experimentation is the archiving of various flow statistics applicable to other acoustic analogies and to the development of time-resolved prediction methods. These will be of lasting value as we look ahead at future challenges to the aeroacoustic experimentalist.
Inflationary tensor fossils in large-scale structure
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dimastrogiovanni, Emanuela; Fasiello, Matteo; Jeong, Donghui
Inflation models make specific predictions for a tensor-scalar-scalar three-point correlation, or bispectrum, between one gravitational-wave (tensor) mode and two density-perturbation (scalar) modes. This tensor-scalar-scalar correlation leads to a local power quadrupole, an apparent departure from statistical isotropy in our Universe, as well as characteristic four-point correlations in the current mass distribution in the Universe. So far, the predictions for these observables have been worked out only for single-clock models in which certain consistency conditions between the tensor-scalar-scalar correlation and tensor and scalar power spectra are satisfied. Here we review the requirements on inflation models for these consistency conditions to be satisfied. We then consider several examples of inflation models, such as non-attractor and solid-inflation models, in which these conditions are put to the test. In solid inflation the simplest consistency conditions are already violated whilst in the non-attractor model we find that, contrary to the standard scenario, the tensor-scalar-scalar correlator probes directly relevant model-dependent information. We work out the predictions for observables in these models. For non-attractor inflation we find an apparent local quadrupolar departure from statistical isotropy in large-scale structure but that this power quadrupole decreases very rapidly at smaller scales. The consistency of the CMB quadrupole with statistical isotropy then constrains the distance scale that corresponds to the transition from the non-attractor to attractor phase of inflation to be larger than the currently observable horizon. Solid inflation predicts clustering fossils signatures in the current galaxy distribution that may be large enough to be detectable with forthcoming, and possibly even current, galaxy surveys.
Vasilyev, M; Choi, S K; Kumar, P; D'Ariano, G M
1998-09-01
Photon-number distributions for parametric fluorescence from a nondegenerate optical parametric amplifier are measured with a novel self-homodyne technique. These distributions exhibit the thermal-state character predicted by theory. However, a difference between the fluorescence gain and the signal gain of the parametric amplifier is observed. We attribute this difference to a change in the signal-beam profile during the traveling-wave pulsed amplification process.
NASA Astrophysics Data System (ADS)
Baral, P.; Haq, M. A.; Mangan, P.
2017-12-01
The impacts of climate change on the extent of permafrost degradation in the Himalayas, and its effect upon the carbon cycle and ecosystem changes, are not well understood due to the lack of historical ground-based observations. We have used high resolution optical and satellite radar observations and applied empirical-statistical methods for the estimation of the spatial and altitudinal limits of permafrost distribution in the North-Western Himalayas. Visual interpretation of morphological characteristics using high resolution optical images has been used for mapping, identification and classification of distinctive geomorphological landforms. Subsequently, we have created a detailed inventory of different types of rock glaciers and studied the contribution of topo-climatic factors to their occurrence and distribution through Logistic Regression modelling. This model establishes the relationship between the presence of permafrost and topo-climatic factors such as Mean Annual Air Temperature (MAAT), Potential Incoming Solar Radiation (PISR), altitude, aspect and slope. This relationship has been used to estimate the distributed probability of permafrost occurrence within a GIS environment. The ability of the model to predict permafrost occurrence has been tested using the locations of mapped rock glaciers and the area under the Receiver Operating Characteristic (ROC) curve. Additionally, interferometric properties of Sentinel and ALOS PALSAR datasets are used for the identification and assessment of rock glacier activity in the region.
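A minimal sketch of this kind of presence-probability workflow is shown below; it is not the authors' code, and the predictor names (maat, pisr, altitude) and synthetic data are purely illustrative.

```python
# Minimal sketch (not the authors' code): logistic regression linking permafrost
# presence to topo-climatic predictors, evaluated with the area under the ROC curve.
# The predictors and synthetic "truth" below are illustrative assumptions only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 500
maat = rng.normal(-2.0, 3.0, n)          # mean annual air temperature (degC)
pisr = rng.normal(220.0, 40.0, n)        # potential incoming solar radiation (W/m^2)
altitude = rng.normal(4500.0, 400.0, n)  # metres

# Synthetic "truth": colder, higher, less irradiated cells are more likely frozen
logit = -0.8 * maat - 0.01 * (pisr - 220) + 0.002 * (altitude - 4500)
presence = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([maat, pisr, altitude])
X_tr, X_te, y_tr, y_te = train_test_split(X, presence, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"ROC AUC on held-out cells: {auc:.2f}")
```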
NASA Astrophysics Data System (ADS)
Krumholz, Mark R.; Ting, Yuan-Sen
2018-04-01
The distributions of a galaxy's gas and stars in chemical space encode a tremendous amount of information about that galaxy's physical properties and assembly history. However, present methods for extracting information from chemical distributions are based either on coarse averages measured over galactic scales (e.g. metallicity gradients) or on searching for clusters in chemical space that can be identified with individual star clusters or gas clouds on ˜1 pc scales. These approaches discard most of the information, because in galaxies gas and young stars are observed to be distributed fractally, with correlations on all scales, and the same is likely to be true of metals. In this paper we introduce a first theoretical model, based on stochastically forced diffusion, capable of predicting the multiscale statistics of metal fields. We derive the variance, correlation function, and power spectrum of the metal distribution from first principles, and determine how these quantities depend on elements' astrophysical origin sites and on the large-scale properties of galaxies. Among other results, we explain for the first time why the typical abundance scatter observed in the interstellar media of nearby galaxies is ≈0.1 dex, and we predict that this scatter will be correlated on spatial scales of ˜0.5-1 kpc, and over time-scales of ˜100-300 Myr. We discuss the implications of our results for future chemical tagging studies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Okabe, T.; Takeda, N.; Komotori, J.
1999-11-26
A new model is proposed for multiple matrix cracking in order to take into account the role of matrix-rich regions in the cross section in initiating crack growth. The model is used to predict the matrix cracking stress and the total number of matrix cracks. The model converts the matrix-rich regions into equivalent penny-shaped crack sizes and predicts the matrix cracking stress with a fracture mechanics crack-bridging model. The estimated distribution of matrix cracking stresses is used as statistical input to predict the number of matrix cracks. The results show good agreement with the experimental results obtained by replica observations. Therefore, it is found that the matrix cracking behavior mainly depends on the distribution of matrix-rich regions in the composite.
Statistical self-similarity of width function maxima with implications to floods
Veitzer, S.A.; Gupta, V.K.
2001-01-01
Recently a new theory of random self-similar river networks, called the RSN model, was introduced to explain empirical observations regarding the scaling properties of distributions of various topologic and geometric variables in natural basins. The RSN model predicts that such variables exhibit statistical simple scaling, when indexed by Horton-Strahler order. The average side tributary structure of RSN networks also exhibits Tokunaga-type self-similarity, which is widely observed in nature. We examine the scaling structure of distributions of the maximum of the width function for RSNs for nested, complete Strahler basins by performing ensemble simulations. The maximum of the width function exhibits distributional simple scaling, when indexed by Horton-Strahler order, for both RSNs and natural river networks extracted from digital elevation models (DEMs). We also test a power-law relationship between Horton ratios for the maximum of the width function and drainage areas. These results represent first steps in formulating a comprehensive physical statistical theory of floods at multiple space-time scales for RSNs as discrete hierarchical branching structures. © 2001 Published by Elsevier Science Ltd.
From entropy-maximization to equality-maximization: Gauss, Laplace, Pareto, and Subbotin
NASA Astrophysics Data System (ADS)
Eliazar, Iddo
2014-12-01
The entropy-maximization paradigm of statistical physics is well known to generate the omnipresent Gauss law. In this paper we establish an analogous socioeconomic model which maximizes social equality, rather than physical disorder, in the context of the distributions of income and wealth in human societies. We show that, on a logarithmic scale, the Laplace law is the socioeconomic equality-maximizing counterpart of the physical entropy-maximizing Gauss law, and that this law manifests an optimized balance between two opposing forces: (i) the rich and powerful, striving to amass ever more wealth, and thus to increase social inequality; and (ii) the masses, struggling to form more egalitarian societies, and thus to increase social equality. Our results lead from log-Gauss statistics to log-Laplace statistics, yield Paretian power-law tails of income and wealth distributions, and show how the emergence of a middle class depends on the underlying levels of socioeconomic inequality and variability. Also, in the context of asset prices with Laplace-distributed returns, our results imply that financial markets generate an optimized balance between risk and predictability.
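As a quick numerical illustration of the link between Laplace-distributed log-wealth and Paretian tails, the hedged sketch below (parameter values invented, not taken from the paper) samples log-Laplace wealth and recovers the tail exponent 1/b.

```python
# Minimal sketch (assumptions, not from the paper): if log-wealth follows a Laplace
# law with scale b, wealth itself has power-law (Paretian) tails with exponent 1/b.
# We sample log-Laplace variates and check the tail exponent by log-log regression.
import numpy as np

rng = np.random.default_rng(1)
mu, b = 10.0, 0.5                       # location and scale of log-wealth (illustrative)
log_w = rng.laplace(mu, b, size=200_000)
wealth = np.exp(log_w)

# Empirical survival function of the upper tail
tail = np.sort(wealth)[-20_000:]
surv = np.arange(len(tail), 0, -1) / len(wealth)
slope, _ = np.polyfit(np.log(tail), np.log(surv), 1)
print(f"fitted tail exponent ~ {-slope:.2f} (theory: 1/b = {1/b:.2f})")
```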
NASA Astrophysics Data System (ADS)
Alahmadi, F.; Rahman, N. A.; Abdulrazzak, M.
2014-09-01
Rainfall frequency analysis is an essential tool for the design of water-related infrastructure. It can be used to predict future flood magnitudes from the magnitude and frequency of extreme rainfall events. This study analyses the application of a rainfall partial duration series (PDS) in the fast-growing urban city of Madinah, located in the western part of Saudi Arabia. Different statistical distributions were applied (i.e. Normal, Log Normal, Extreme Value type I, Generalized Extreme Value, Pearson Type III, Log Pearson Type III) and their distribution parameters were estimated using L-moments methods. Also, different selection criteria models were applied, e.g. the Akaike Information Criterion (AIC), Corrected Akaike Information Criterion (AICc), Bayesian Information Criterion (BIC) and Anderson-Darling Criterion (ADC). The analysis indicated that the Generalized Extreme Value distribution provides the best fit to the Madinah partial duration daily rainfall series. The outcome of such an evaluation can contribute toward better design criteria for flood management, especially flood protection measures.
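A sketch of the fitting-and-ranking step is given below; it uses maximum likelihood and AIC in scipy rather than the paper's L-moments and additional criteria, and the rainfall values are placeholders.

```python
# Minimal sketch (assumed workflow; maximum likelihood instead of the paper's
# L-moments): fit candidate distributions to extreme daily rainfall and rank by AIC.
# The rainfall values below are synthetic placeholders.
import numpy as np
from scipy import stats

rain = np.array([22.5, 31.0, 18.4, 45.2, 27.8, 39.9, 24.1, 52.3,
                 29.6, 33.7, 41.5, 19.8, 36.2, 48.0, 26.4])  # mm/day, illustrative

candidates = {
    "GEV": stats.genextreme,
    "Gumbel (EV I)": stats.gumbel_r,
    "Log-normal": stats.lognorm,
    "Pearson III": stats.pearson3,
}

for name, dist in candidates.items():
    params = dist.fit(rain)
    loglik = np.sum(dist.logpdf(rain, *params))
    aic = 2 * len(params) - 2 * loglik
    print(f"{name:14s} AIC = {aic:6.1f}")
```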
Modeling and forecasting the distribution of Vibrio vulnificus in Chesapeake Bay.
Jacobs, J M; Rhodes, M; Brown, C W; Hood, R R; Leight, A; Long, W; Wood, R
2014-11-01
To construct statistical models to predict the presence, abundance and potential virulence of Vibrio vulnificus in surface waters of Chesapeake Bay for implementation in ecological forecasting systems. We evaluated and applied previously published qPCR assays to water samples (n = 1636) collected from Chesapeake Bay from 2007-2010 in conjunction with State water quality monitoring programmes. A variety of statistical techniques were used in concert to identify water quality parameters associated with V. vulnificus presence, abundance and virulence markers in the interest of developing strong predictive models for use in regional oceanographic modeling systems. A suite of models are provided to represent the best model fit and alternatives using environmental variables that allow them to be put to immediate use in current ecological forecasting efforts. Environmental parameters such as temperature, salinity and turbidity are capable of accurately predicting abundance and distribution of V. vulnificus in Chesapeake Bay. Forcing these empirical models with output from ocean modeling systems allows for spatially explicit forecasts for up to 48 h in the future. This study uses one of the largest data sets compiled to model Vibrio in an estuary, enhances our understanding of environmental correlates with abundance, distribution and presence of potentially virulent strains and offers a method to forecast these pathogens that may be replicated in other regions. This article has been contributed to by US Government employees and their work is in the public domain in the USA.
Texture metric that predicts target detection performance
NASA Astrophysics Data System (ADS)
Culpepper, Joanne B.
2015-12-01
Two texture metrics based on gray level co-occurrence error (GLCE) are used to predict probability of detection and mean search time. The two texture metrics are local clutter metrics and are based on the statistics of GLCE probability distributions. The degree of correlation between various clutter metrics and the target detection performance of the nine military vehicles in complex natural scenes found in the Search_2 dataset is presented. Comparison is also made with four other common clutter metrics found in the literature: root sum of squares, Doyle, statistical variance, and target structure similarity. The experimental results show that the GLCE energy metric is a better predictor of target detection performance when searching for targets in natural scenes than the other clutter metrics studied.
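For readers unfamiliar with co-occurrence statistics, the sketch below computes a standard grey-level co-occurrence "energy" feature with scikit-image; it is only a related illustration, since the paper's GLCE metric is a co-occurrence error measure rather than the plain GLCM statistic shown here.

```python
# Minimal sketch: a standard grey-level co-occurrence "energy" statistic via
# scikit-image (function names may differ in very old scikit-image versions).
# This is an illustrative stand-in, not the paper's GLCE clutter metric.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

rng = np.random.default_rng(2)
patch = rng.integers(0, 8, size=(64, 64), dtype=np.uint8)  # synthetic 8-level image patch

glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                    levels=8, symmetric=True, normed=True)
energy = graycoprops(glcm, "energy").mean()
print(f"mean GLCM energy of patch: {energy:.3f}")
```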
A statistical model of brittle fracture by transgranular cleavage
NASA Astrophysics Data System (ADS)
Lin, Tsann; Evans, A. G.; Ritchie, R. O.
A model for brittle fracture by transgranular cleavage cracking is presented based on the application of weakest-link statistics to the critical microstructural fracture mechanisms. The model permits prediction of the macroscopic fracture toughness, KIc, in single-phase microstructures containing a known distribution of particles, and defines the critical distance from the crack tip at which the initial cracking event is most probable. The model is developed for unstable fracture ahead of a sharp crack considering both linear elastic and nonlinear elastic ("elastic/plastic") crack tip stress fields. Predictions are evaluated by comparison with experimental results on the low temperature flow and fracture behavior of a low carbon mild steel with a simple ferrite/grain boundary carbide microstructure.
Towards Principled Experimental Study of Autonomous Mobile Robots
NASA Technical Reports Server (NTRS)
Gat, Erann
1995-01-01
We review the current state of research in autonomous mobile robots and conclude that there is an inadequate basis for predicting the reliability and behavior of robots operating in unengineered environments. We present a new approach to the study of autonomous mobile robot performance based on formal statistical analysis of independently reproducible experiments conducted on real robots. Simulators serve as models rather than experimental surrogates. We demonstrate three new results: 1) Two commonly used performance metrics (time and distance) are not as well correlated as is often tacitly assumed. 2) The probability distributions of these performance metrics are exponential rather than normal, and 3) a modular, object-oriented simulation accurately predicts the behavior of the real robot in a statistically significant manner.
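A hedged sketch of the distributional claim follows, using synthetic completion times rather than the study's logs: it compares exponential and normal fits to a performance metric with a Kolmogorov-Smirnov test.

```python
# Minimal sketch (synthetic data, not the study's experiments): compare exponential
# and normal fits to task-completion times, echoing the claim that such performance
# metrics are exponential rather than normal.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
times = rng.exponential(scale=40.0, size=200)   # seconds, illustrative

for name, dist in [("exponential", stats.expon), ("normal", stats.norm)]:
    params = dist.fit(times)
    ks = stats.kstest(times, dist.cdf, args=params)
    print(f"{name:11s} KS statistic = {ks.statistic:.3f}, p = {ks.pvalue:.3f}")
```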
Local backbone structure prediction of proteins
De Brevern, Alexandre G.; Benros, Cristina; Gautier, Romain; Valadié, Hélène; Hazout, Serge; Etchebest, Catherine
2004-01-01
Summary A statistical analysis of the PDB structures has led us to define a new set of small 3D structural prototypes called Protein Blocks (PBs). This structural alphabet includes 16 PBs, each one is defined by the (φ, Ψ) dihedral angles of 5 consecutive residues. The amino acid distributions observed in sequence windows encompassing these PBs are used to predict by a Bayesian approach the local 3D structure of proteins from the sole knowledge of their sequences. LocPred is a software which allows the users to submit a protein sequence and performs a prediction in terms of PBs. The prediction results are given both textually and graphically. PMID:15724288
NASA Astrophysics Data System (ADS)
Boyd-Lee, Ashley; King, Julia
1992-07-01
A discrete statistical model of fatigue crack growth in the nickel-base superalloy Waspaloy, quantitative from the start of the short-crack regime to failure, is presented. Instantaneous crack growth rate distributions and persistence-of-arrest distributions are used to compute fatigue lives and worst-case scenarios without extrapolation. The basis of the model is not material-specific, and it provides an improved method of analyzing crack growth rate data. For Waspaloy, the model shows the importance of good bulk fatigue crack growth resistance in resisting early short fatigue crack growth, and the importance of maximizing crack arrest both by the presence of a proportion of small grains and by maximizing grain boundary corrugation.
NASA Astrophysics Data System (ADS)
Wu, Y.; Chen, G. L.; Hui, X. D.; Liu, C. T.; Lin, Y.; Shang, X. C.; Lu, Z. P.
2009-10-01
Based on the mechanical instability of individual shear transformation zones (STZs), a quantitative link between microplastic instability and the macroscopic deformation behavior of metallic glasses was proposed. Our analysis confirms that macroscopic metallic glasses comprise a statistical distribution of STZ embryos with distributed values of activation energy, and that the microplastic instability of all the individual STZs dictates the macroscopic deformation behavior of amorphous solids. The statistical model presented in this paper can successfully reproduce the macroscopic stress-strain curves determined experimentally and can readily be used to predict strain-rate effects on the macroscopic responses, given the material parameters at a certain strain rate, offering new insights into the actual deformation mechanism in amorphous solids.
Science and Facebook: The same popularity law!
Néda, Zoltán; Varga, Levente; Biró, Tamás S
2017-01-01
The distribution of scientific citations for publications selected with different rules (author, topic, institution, country, journal, etc.) collapses onto a single curve if one plots the citations relative to their mean value. We find that the distribution of "shares" for Facebook posts rescales in the same manner onto the very same curve as the scientific citations. This finding suggests that citations are subject to the same growth mechanism as Facebook popularity measures, being influenced by a statistically similar social environment and selection mechanism. In a simple master-equation approach, the exponential growth of the number of publications and a preferential selection mechanism lead to a Tsallis-Pareto distribution offering an excellent description of the observed statistics. Based on our model and on data derived from PubMed, we predict that, according to the present trend, the average number of citations per scientific publication relaxes exponentially to about 4.
Statistical Modeling of Robotic Random Walks on Different Terrain
NASA Astrophysics Data System (ADS)
Naylor, Austin; Kinnaman, Laura
Issues of public safety, especially with crowd dynamics and pedestrian movement, have been modeled by physicists using methods from statistical mechanics over the last few years. Complex decision making of humans moving on different terrains can be modeled using random walks (RW) and correlated random walks (CRW). The effect of different terrains, such as a constant increasing slope, on RW and CRW was explored. LEGO robots were programmed to make RW and CRW with uniform step sizes. Level ground tests demonstrated that the robots had the expected step size distribution and correlation angles (for CRW). The mean square displacement was calculated for each RW and CRW on different terrains and matched expected trends. The step size distribution was determined to change based on the terrain; theoretical predictions for the step size distribution were made for various simple terrains.
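A minimal sketch of uniform-step RW and CRW trajectories and their mean square displacement is given below; it is not the course code, and the persistence parameter is an illustrative assumption.

```python
# Minimal sketch (illustrative parameters): 2-D random walk vs correlated random
# walk with uniform step length, and the resulting mean square displacement.
import numpy as np

def walk(n_steps, n_walkers, kappa=None, step=1.0, seed=0):
    """kappa=None -> uncorrelated RW; otherwise turning angles ~ normal(0, 1/sqrt(kappa))."""
    rng = np.random.default_rng(seed)
    if kappa is None:
        angles = rng.uniform(-np.pi, np.pi, (n_walkers, n_steps))
    else:
        turns = rng.normal(0.0, 1.0 / np.sqrt(kappa), (n_walkers, n_steps))
        angles = np.cumsum(turns, axis=1)          # heading persists between steps
    x = np.cumsum(step * np.cos(angles), axis=1)
    y = np.cumsum(step * np.sin(angles), axis=1)
    return x, y

for label, kappa in [("RW", None), ("CRW", 5.0)]:
    x, y = walk(500, 2000, kappa)
    msd = np.mean(x[:, -1] ** 2 + y[:, -1] ** 2)
    print(f"{label}: mean square displacement after 500 steps = {msd:.0f}")
```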
Algorithm of probabilistic assessment of fully-mechanized longwall downtime
NASA Astrophysics Data System (ADS)
Domrachev, A. N.; Rib, S. V.; Govorukhin, Yu M.; Krivopalov, V. G.
2017-09-01
The problem of increasing the load on a long fully-mechanized longwall face has several aspects, one of which is improving the efficiency of the available stoping equipment by increasing the machine operating time coefficient of the shearer and the other mining machines that form an integral part of the longwall set of equipment. The task of predicting the reliability indicators of stoping equipment is solved by statistical evaluation of the parameters of the exponential distributions of downtime and failure recovery. It is more difficult to account for downtime caused by accidents in the face workings and, despite the statistical data on accidents in mine workings, no solution has been found to date. The authors propose a variant of probabilistic assessment of workings caving using the Poisson distribution, and of the duration of their restoration using the normal distribution. The above results confirm the possibility of implementing the approach proposed by the authors.
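A hedged Monte Carlo sketch of the proposed combination follows; all parameter values are invented, not taken from the paper: Poisson-distributed caving events, normally distributed restoration durations, and exponentially distributed equipment downtime.

```python
# Minimal sketch of the proposed scheme (all parameter values are invented):
# caving events per month ~ Poisson, each restoration duration ~ normal,
# equipment failures handled by an exponential downtime model.
import numpy as np

rng = np.random.default_rng(4)
n_months = 10_000

cavings = rng.poisson(lam=0.3, size=n_months)                    # roof falls per month
restoration = np.array([rng.normal(48.0, 12.0, k).clip(min=0).sum()
                        for k in cavings])                       # hours lost to falls
equip_down = rng.exponential(scale=20.0, size=n_months)          # hours lost to failures

total_downtime = restoration + equip_down
print(f"mean monthly downtime: {total_downtime.mean():.1f} h, "
      f"95th percentile: {np.percentile(total_downtime, 95):.1f} h")
```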
Universal Quake Statistics: From Compressed Nanocrystals to Earthquakes.
Uhl, Jonathan T; Pathak, Shivesh; Schorlemmer, Danijel; Liu, Xin; Swindeman, Ryan; Brinkman, Braden A W; LeBlanc, Michael; Tsekenis, Georgios; Friedman, Nir; Behringer, Robert; Denisov, Dmitry; Schall, Peter; Gu, Xiaojun; Wright, Wendelin J; Hufnagel, Todd; Jennings, Andrew; Greer, Julia R; Liaw, P K; Becker, Thorsten; Dresen, Georg; Dahmen, Karin A
2015-11-17
Slowly-compressed single crystals, bulk metallic glasses (BMGs), rocks, granular materials, and the earth all deform via intermittent slips or "quakes". We find that although these systems span 12 decades in length scale, they all show the same scaling behavior for their slip size distributions and other statistical properties. Remarkably, the size distributions follow the same power law multiplied with the same exponential cutoff. The cutoff grows with applied force for materials spanning length scales from nanometers to kilometers. The tuneability of the cutoff with stress reflects "tuned critical" behavior, rather than self-organized criticality (SOC), which would imply stress-independence. A simple mean field model for avalanches of slipping weak spots explains the agreement across scales. It predicts the observed slip-size distributions and the observed stress-dependent cutoff function. The results enable extrapolations from one scale to another, and from one force to another, across different materials and structures, from nanocrystals to earthquakes.
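A minimal numerical sketch of the mean-field slip-size form, D(S) ∝ S^(-τ) exp(-S/S_max), is shown below to illustrate how a stress-dependent cutoff changes the probability of large slips; τ and the cutoff values are illustrative assumptions, not the paper's fitted values.

```python
# Minimal sketch (illustrative parameters): slip-size distribution
# D(S) ~ S^(-tau) * exp(-S/S_max), with the cutoff S_max growing with applied stress.
import numpy as np

tau = 1.5                                   # mean-field slip-size exponent (assumed)
S = np.logspace(0, 6, 2000)                 # slip sizes on a log grid
dS = np.diff(S, prepend=S[0])               # crude integration weights

def tail_probability(S_max, threshold=1e4):
    """P(slip size > threshold) for the truncated power law, normalised on the grid."""
    w = S ** (-tau) * np.exp(-S / S_max) * dS
    return w[S > threshold].sum() / w.sum()

for label, S_max in [("low stress", 1e3), ("high stress", 1e5)]:
    print(f"{label}: P(slip size > 1e4) = {tail_probability(S_max):.3e}")
```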
On the statistical properties of viral misinformation in online social media
NASA Astrophysics Data System (ADS)
Bessi, Alessandro
2017-03-01
The massive diffusion of online social media allows for the rapid and uncontrolled spreading of conspiracy theories, hoaxes, unsubstantiated claims, and false news. Such an impressive amount of misinformation can influence policy preferences and encourage behaviors strongly divergent from recommended practices. In this paper, we study the statistical properties of viral misinformation in online social media. By means of methods belonging to Extreme Value Theory, we show that the number of extremely viral posts over time follows a homogeneous Poisson process, and that the interarrival times between such posts are independent and identically distributed, following an exponential distribution. Moreover, we characterize the uncertainty around the rate parameter of the Poisson process through Bayesian methods. Finally, we are able to derive the predictive posterior probability distribution of the number of posts exceeding a certain threshold of shares over a finite interval of time.
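A hedged sketch of the conjugate Gamma-Poisson version of this inference is given below; the weekly counts and prior values are invented, and the paper's actual Bayesian treatment may differ in detail.

```python
# Minimal sketch (synthetic counts, conjugate Gamma prior assumed): posterior for
# the Poisson rate of extremely viral posts, and the posterior predictive
# probability of exceeding a threshold count in a future window.
import numpy as np
from scipy import stats

viral_per_week = np.array([2, 0, 3, 1, 2, 4, 1, 0, 2, 3])   # illustrative counts
a0, b0 = 1.0, 1.0                                            # Gamma(shape, rate) prior

a_post = a0 + viral_per_week.sum()
b_post = b0 + len(viral_per_week)

# Posterior predictive for one future week is negative binomial:
# n = a_post, p = b_post / (b_post + 1)
pred = stats.nbinom(a_post, b_post / (b_post + 1.0))
print(f"posterior mean rate: {a_post / b_post:.2f} posts/week")
print(f"P(next week's count > 5) = {1 - pred.cdf(5):.3f}")
```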
NASA Astrophysics Data System (ADS)
Spampinato, A.; Axinte, D. A.
2017-12-01
The mechanisms of interaction between bodies with statistically arranged features present characteristics common to different abrasive processes, such as dressing of abrasive tools. In contrast with the current empirical approach used to estimate the results of operations based on attritive interactions, the method we present in this paper allows us to predict the output forces and the topography of a simulated grinding wheel for a set of specific operational parameters (speed ratio and radial feed-rate), providing a thorough understanding of the complex mechanisms regulating these processes. In modelling the dressing mechanisms, the abrasive characteristics of both bodies (grain size, geometry, inter-space and protrusion) are first simulated; thus, their interaction is simulated in terms of grain collisions. Exploiting a specifically designed contact/impact evaluation algorithm, the model simulates the collisional effects of the dresser abrasives on the grinding wheel topography (grain fracture/break-out). The method has been tested for the case of a diamond rotary dresser, predicting output forces within less than 10% error and obtaining experimentally validated grinding wheel topographies. The study provides a fundamental understanding of the dressing operation, enabling the improvement of its performance in an industrial scenario, while being of general interest in modelling collision-based processes involving statistically distributed elements.
Predicting the geographical distribution of two invasive termite species from occurrence data.
Tonini, Francesco; Divino, Fabio; Lasinio, Giovanna Jona; Hochmair, Hartwig H; Scheffrahn, Rudolf H
2014-10-01
Predicting the potential habitat of species under both current and future climate change scenarios is crucial for monitoring invasive species and understanding a species' response to different environmental conditions. Frequently, the only data available on a species is the location of its occurrence (presence-only data). Using occurrence records only, two models were used to predict the geographical distribution of two destructive invasive termite species, Coptotermes gestroi (Wasmann) and Coptotermes formosanus Shiraki. The first model uses a Bayesian linear logistic regression approach adjusted for presence-only data while the second one is the widely used maximum entropy approach (Maxent). Results show that the predicted distributions of both C. gestroi and C. formosanus are strongly linked to urban development. The impact of future scenarios such as climate warming and population growth on the biotic distribution of both termite species was also assessed. Future climate warming seems to affect their projected probability of presence to a lesser extent than population growth. The Bayesian logistic approach outperformed Maxent consistently in all models according to evaluation criteria such as model sensitivity and ecological realism. The importance of further studies for an explicit treatment of residual spatial autocorrelation and a more comprehensive comparison between both statistical approaches is suggested.
SGR-like behaviour of the repeating FRB 121102
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, F.Y.; Yu, H., E-mail: fayinwang@nju.edu.cn, E-mail: yuhai@smail.nju.edu.cn
2017-03-01
Fast radio bursts (FRBs) are millisecond-duration radio signals occurring at cosmological distances. However, the physical origin of FRBs remains a mystery, and many models have been proposed. Here we study the frequency distributions of peak flux, fluence, duration and waiting time for the repeating FRB 121102. The cumulative distributions of peak flux, fluence and duration show power-law forms. The waiting time distribution also shows a power-law form, and is consistent with a non-stationary Poisson process. These distributions are similar to those of soft gamma repeaters (SGRs). We also use the statistical results to test the proposed models for FRBs. These distributions are consistent with the predictions from avalanche models of slowly driven nonlinear dissipative systems.
An order statistics approach to the halo model for galaxies
NASA Astrophysics Data System (ADS)
Paul, Niladri; Paranjape, Aseem; Sheth, Ravi K.
2017-04-01
We use the halo model to explore the implications of assuming that galaxy luminosities in groups are randomly drawn from an underlying luminosity function. We show that even the simplest of such order statistics models - one in which this luminosity function p(L) is universal - naturally produces a number of features associated with previous analyses based on the 'central plus Poisson satellites' hypothesis. These include the monotonic relation of mean central luminosity with halo mass, the lognormal distribution around this mean and the tight relation between the central and satellite mass scales. In stark contrast to observations of galaxy clustering, however, this model predicts no luminosity dependence of large-scale clustering. We then show that an extended version of this model, based on the order statistics of a halo mass dependent luminosity function p(L|m), is in much better agreement with the clustering data as well as satellite luminosities, but systematically underpredicts central luminosities. This brings into focus the idea that central galaxies constitute a distinct population that is affected by different physical processes than are the satellites. We model this physical difference as a statistical brightening of the central luminosities, over and above the order statistics prediction. The magnitude gap between the brightest and second brightest group galaxy is predicted as a by-product, and is also in good agreement with observations. We propose that this order statistics framework provides a useful language in which to compare the halo model for galaxies with more physically motivated galaxy formation models.
Puc, Małgorzata; Wolski, Tomasz
2013-01-01
The allergenic pollen content of the atmosphere varies according to climate, biogeography and vegetation. Minimisation of pollen allergy symptoms is related to the possibility of avoiding large doses of the allergen. Measurements performed in Szczecin over a period of 13 years (2000-2012 inclusive) permitted prediction of theoretical maximum concentrations of pollen grains, and their probability, for the pollen seasons of Poaceae, Artemisia and Ambrosia. Moreover, the probabilities were determined of a given date being the beginning of the pollen season, of the date of the maximum pollen count, of the Seasonal Pollen Index value and of the number of days with pollen counts above threshold values. Aerobiological monitoring was conducted using a Hirst volumetric trap (Lanzoni VPPS). A linear trend with the coefficient of determination (R^2) was calculated. A model for long-term forecasting was developed using a method based on Gumbel's distribution. A statistically significant negative correlation was determined between the duration of the pollen seasons of Poaceae and Artemisia and the Seasonal Pollen Index value. Seasonal total pollen counts of Artemisia and Ambrosia showed a strong and statistically significant decreasing tendency. On the basis of Gumbel's distribution, a model was proposed for Szczecin, allowing prediction of the probabilities of the maximum pollen count values that can appear once in e.g. 5, 10 or 100 years. Short pollen seasons are characterised by a higher intensity of pollination than long ones. Prediction of the maximum pollen count values, the dates of the pollen season beginning, and the number of days with pollen counts above the threshold, on the basis of Gumbel's distribution, is expected to lead to improvement in the prophylaxis and therapy of persons allergic to pollen.
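A minimal sketch of the Gumbel return-level calculation follows; the annual maxima below are synthetic, and the paper's fitting details may differ.

```python
# Minimal sketch (synthetic annual maxima): fit a Gumbel distribution to yearly
# maximum daily pollen counts and compute the level expected to be exceeded once
# in 5, 10 and 100 years.
import numpy as np
from scipy import stats

annual_max = np.array([310, 455, 290, 520, 380, 610, 340, 470,
                       295, 505, 430, 365, 580])   # grains/m^3, illustrative
loc, scale = stats.gumbel_r.fit(annual_max)

for T in (5, 10, 100):
    level = stats.gumbel_r.ppf(1 - 1 / T, loc, scale)
    print(f"{T:3d}-year return level ~ {level:.0f} grains/m^3")
```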
Why significant variables aren't automatically good predictors.
Lo, Adeline; Chernoff, Herman; Zheng, Tian; Lo, Shaw-Hwa
2015-11-10
Thus far, genome-wide association studies (GWAS) have been disappointing in the inability of investigators to use the results of identified, statistically significant variants in complex diseases to make predictions useful for personalized medicine. Why are significant variables not leading to good prediction of outcomes? We point out that this problem is prevalent in simple as well as complex data, in the sciences as well as the social sciences. We offer a brief explanation and some statistical insights on why higher significance cannot automatically imply stronger predictivity, and illustrate this through simulations and a real breast cancer example. We also demonstrate that highly predictive variables do not necessarily appear as highly significant, thus evading the researcher using significance-based methods. We point out that what makes variables good for prediction versus significance depends on different properties of the underlying distributions. If prediction is the goal, we must lay aside significance as the only selection standard. We suggest that progress in prediction requires efforts toward a new research agenda of searching for a novel criterion to retrieve highly predictive variables rather than highly significant variables. We offer an alternative approach that was not designed for significance, the partition retention method, which was very effective at prediction on a long-studied breast cancer data set, reducing the classification error rate from 30% to 8%.
The non-statistical dynamics of the 18O + 32O2 isotope exchange reaction at two energies
NASA Astrophysics Data System (ADS)
Van Wyngarden, Annalise L.; Mar, Kathleen A.; Quach, Jim; Nguyen, Anh P. Q.; Wiegel, Aaron A.; Lin, Shi-Ying; Lendvay, Gyorgy; Guo, Hua; Lin, Jim J.; Lee, Yuan T.; Boering, Kristie A.
2014-08-01
The dynamics of the 18O(3P) + 32O2 isotope exchange reaction were studied using crossed atomic and molecular beams at collision energies (Ecoll) of 5.7 and 7.3 kcal/mol, and experimental results were compared with quantum statistical (QS) and quasi-classical trajectory (QCT) calculations on the O3(X1A') potential energy surface (PES) of Babikov et al. [D. Babikov, B. K. Kendrick, R. B. Walker, R. T. Pack, P. Fleurat-Lesard, and R. Schinke, J. Chem. Phys. 118, 6298 (2003)]. In both QS and QCT calculations, agreement with experiment was markedly improved by performing calculations with the experimental distribution of collision energies instead of fixed at the average collision energy. At both collision energies, the scattering displayed a forward bias, with a smaller bias at the lower Ecoll. Comparisons with the QS calculations suggest that 34O2 is produced with a non-statistical rovibrational distribution that is hotter than predicted, and the discrepancy is larger at the lower Ecoll. If this underprediction of rovibrational excitation by the QS method is not due to PES errors and/or to non-adiabatic effects not included in the calculations, then this collision energy dependence is opposite to what might be expected based on collision complex lifetime arguments and opposite to that measured for the forward bias. While the QCT calculations captured the experimental product vibrational energy distribution better than the QS method, the QCT results underpredicted rotationally excited products, overpredicted forward-bias and predicted a trend in the strength of forward-bias with collision energy opposite to that measured, indicating that it does not completely capture the dynamic behavior measured in the experiment. Thus, these results further underscore the need for improvement in theoretical treatments of dynamics on the O3(X1A') PES and perhaps of the PES itself in order to better understand and predict non-statistical effects in this reaction and in the formation of ozone (in which the intermediate O3* complex is collisionally stabilized by a third body). The scattering data presented here at two different collision energies provide important benchmarks to guide these improvements.
Zipkin, Elise F.; Leirness, Jeffery B.; Kinlan, Brian P.; O'Connell, Allan F.; Silverman, Emily D.
2014-01-01
Determining appropriate statistical distributions for modeling animal count data is important for accurate estimation of abundance, distribution, and trends. In the case of sea ducks along the U.S. Atlantic coast, managers want to estimate local and regional abundance to detect and track population declines, to define areas of high and low use, and to predict the impact of future habitat change on populations. In this paper, we used a modified marked point process to model survey data that recorded flock sizes of Common eiders, Long-tailed ducks, and Black, Surf, and White-winged scoters. The data come from an experimental aerial survey, conducted by the United States Fish & Wildlife Service (USFWS) Division of Migratory Bird Management, during which east-west transects were flown along the Atlantic Coast from Maine to Florida during the winters of 2009–2011. To model the number of flocks per transect (the points), we compared the fit of four statistical distributions (zero-inflated Poisson, zero-inflated geometric, zero-inflated negative binomial and negative binomial) to data on the number of species-specific sea duck flocks that were recorded for each transect flown. To model the flock sizes (the marks), we compared the fit of flock size data for each species to seven statistical distributions: positive Poisson, positive negative binomial, positive geometric, logarithmic, discretized lognormal, zeta and Yule–Simon. Akaike’s Information Criterion and Vuong’s closeness tests indicated that the negative binomial and discretized lognormal were the best distributions for all species for the points and marks, respectively. These findings have important implications for estimating sea duck abundances as the discretized lognormal is a more skewed distribution than the Poisson and negative binomial, which are frequently used to model avian counts; the lognormal is also less heavy-tailed than the power law distributions (e.g., zeta and Yule–Simon), which are becoming increasingly popular for group size modeling. Choosing appropriate statistical distributions for modeling flock size data is fundamental to accurately estimating population summaries, determining required survey effort, and assessing and propagating uncertainty through decision-making processes.
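A hedged sketch of the count-model comparison step is shown below, using method-of-moments negative binomial estimates and AIC on synthetic transect counts; the study itself fits several zero-inflated and heavy-tailed alternatives not shown here.

```python
# Minimal sketch (synthetic counts): compare Poisson and negative binomial fits to
# the number of flocks per transect by AIC, using method-of-moments estimates for
# the negative binomial as a simple stand-in for full maximum-likelihood fitting.
import numpy as np
from scipy import stats

counts = np.array([0, 0, 2, 5, 1, 0, 9, 3, 0, 14, 2, 1, 0, 7, 4, 0, 0, 22, 3, 1])

# Poisson
lam = counts.mean()
ll_pois = stats.poisson.logpmf(counts, lam).sum()
aic_pois = 2 * 1 - 2 * ll_pois

# Negative binomial via method of moments (scipy: mean = r(1-p)/p, var = r(1-p)/p^2)
m, v = counts.mean(), counts.var(ddof=1)
p = m / v
r = m * p / (1 - p)
ll_nb = stats.nbinom.logpmf(counts, r, p).sum()
aic_nb = 2 * 2 - 2 * ll_nb

print(f"AIC Poisson = {aic_pois:.1f}, AIC negative binomial = {aic_nb:.1f}")
```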
Shirazi, Mohammadali; Dhavala, Soma Sekhar; Lord, Dominique; Geedipally, Srinivas Reddy
2017-10-01
Safety analysts usually use post-modeling methods, such as Goodness-of-Fit statistics or the Likelihood Ratio Test, to decide between two or more competing distributions or models. Such metrics require all competing distributions to be fitted to the data before any comparisons can be made. Given the continuing introduction of new statistical distributions, choosing the best one using such post-modeling methods is not a trivial task, on top of all the theoretical or numerical issues the analyst may face during the analysis. Furthermore, and most importantly, these measures or tests do not provide any intuition about why a specific distribution (or model) is preferred over another (Goodness-of-Logic). This paper addresses these issues by proposing a methodology for designing heuristics for model selection based on the characteristics of the data, in terms of descriptive summary statistics, before fitting the models. The proposed methodology employs two analytic tools, (1) Monte Carlo simulations and (2) machine learning classifiers, to design easy heuristics to predict the label of the 'most-likely-true' distribution for analyzing data. The proposed methodology was applied to investigate when the recently introduced Negative Binomial Lindley (NB-L) distribution is preferred over the Negative Binomial (NB) distribution. Heuristics were designed to select the 'most-likely-true' distribution between these two distributions, given a set of prescribed summary statistics of the data. The proposed heuristics were successfully compared against classical tests for several real or observed datasets. Not only are they easy to use and free of any post-modeling inputs, but, using these heuristics, the analyst can also gain useful information about why the NB-L is preferred over the NB, or vice versa, when modeling data. Copyright © 2017 Elsevier Ltd. All rights reserved.
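A minimal sketch of the general simulate-summarize-classify workflow is given below; it uses stand-in distributions rather than the paper's NB versus NB-Lindley pair, since the latter requires a custom simulator not shown here.

```python
# Minimal sketch of the workflow only (stand-in distributions, not NB vs NB-Lindley):
# simulate counts from two candidate models, summarise each sample, and train a
# classifier to predict the "true" distribution from summary statistics alone.
import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)

def summarize(x):
    return [x.mean(), x.var(ddof=1), stats.skew(x), (x == 0).mean()]

X, y = [], []
for _ in range(2000):
    if rng.random() < 0.5:                                   # label 0: negative binomial
        sample = rng.negative_binomial(n=1.2, p=0.25, size=100)
        y.append(0)
    else:                                                    # label 1: heavier-tailed stand-in
        sample = rng.poisson(rng.lognormal(1.0, 0.9, size=100))
        y.append(1)
    X.append(summarize(sample))

clf = RandomForestClassifier(n_estimators=200, random_state=0)
acc = cross_val_score(clf, np.array(X), np.array(y), cv=5).mean()
print(f"cross-validated accuracy of the heuristic classifier: {acc:.2f}")
```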
NASA Astrophysics Data System (ADS)
Deshayes, Yannick; Verdier, Frederic; Bechou, Laurent; Tregon, Bernard; Danto, Yves; Laffitte, Dominique; Goudard, Jean Luc
2004-09-01
High performance and high reliability are two of the most important goals driving the penetration of optical transmission into telecommunication systems ranging from 880 nm to 1550 nm. Lifetime prediction, defined as the time at which a parameter reaches its maximum acceptable shift, remains the main result of reliability estimation for a technology. For optoelectronic emissive components, selection tests and life testing are specifically used for reliability evaluation according to Telcordia GR-468 CORE requirements. This approach is based on the extrapolation of degradation laws, based on physics of failure and electrical or optical parameters, allowing both strong test time reduction and long-term reliability prediction. Unfortunately, in the case of a mature technology, there is growing complexity in calculating average lifetimes and failure rates (FITs) using ageing tests, in particular due to extremely low failure rates. For present laser diode technologies, times to failure tend to be around 10^6 hours under typical conditions (Popt = 10 mW and T = 80°C). Ageing tests must therefore be performed on more than 100 components aged for 10,000 hours, mixing different temperature and drive-current conditions and leading to acceleration factors above 300-400. These conditions are costly, time consuming and cannot give a complete distribution of times to failure. A new approach uses statistical computations to extrapolate the lifetime distribution and failure rates under operating conditions from the physical parameters of experimental degradation laws. In this paper, Distributed Feedback single-mode laser diodes (DFB-LD) used in 1550 nm telecommunication networks operating at a 2.5 Gbit/s transfer rate are studied. Electrical and optical parameters have been measured before and after ageing tests performed at constant current, according to Telcordia GR-468 requirements. Cumulative failure rates and lifetime distributions are computed using statistical calculations and equations of drift mechanisms versus time fitted from experimental measurements.
Predicting objective function weights from patient anatomy in prostate IMRT treatment planning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Taewoo, E-mail: taewoo.lee@utoronto.ca; Hammad, Muhannad; Chan, Timothy C. Y.
2013-12-15
Purpose: Intensity-modulated radiation therapy (IMRT) treatment planning typically combines multiple criteria into a single objective function by taking a weighted sum. The authors propose a statistical model that predicts objective function weights from patient anatomy for prostate IMRT treatment planning. This study provides a proof of concept for geometry-driven weight determination. Methods: A previously developed inverse optimization method (IOM) was used to generate optimal objective function weights for 24 patients using their historical treatment plans (i.e., dose distributions). These IOM weights were around 1% for each of the femoral heads, while bladder and rectum weights varied greatly between patients. A regression model was developed to predict a patient's rectum weight using the ratio of the overlap volume of the rectum and bladder with the planning target volume at a 1 cm expansion as the independent variable. The femoral head weights were fixed to 1% each and the bladder weight was calculated as one minus the rectum and femoral head weights. The model was validated using leave-one-out cross validation. Objective values and dose distributions generated through inverse planning using the predicted weights were compared to those generated using the original IOM weights, as well as an average of the IOM weights across all patients. Results: The IOM weight vectors were on average six times closer to the predicted weight vectors than to the average weight vector, using the l2 distance. Likewise, the bladder and rectum objective values achieved by the predicted weights were more similar to the objective values achieved by the IOM weights. The difference in objective value performance between the predicted and average weights was statistically significant according to a one-sided sign test. For all patients, the difference in rectum V54.3 Gy, rectum V70.0 Gy, bladder V54.3 Gy, and bladder V70.0 Gy values between the dose distributions generated by the predicted weights and IOM weights was less than 5 percentage points. Similarly, the difference in femoral head V54.3 Gy values between the two dose distributions was less than 5 percentage points for all but one patient. Conclusions: This study demonstrates a proof of concept that patient anatomy can be used to predict appropriate objective function weights for treatment planning. In the long term, such geometry-driven weights may serve as a starting point for iterative treatment plan design or may provide information about the most clinically relevant region of the Pareto surface to explore.
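A hedged sketch of the regression-with-leave-one-out step on a synthetic geometry feature is given below; the coefficients and data are invented, not the study's.

```python
# Minimal sketch (synthetic data, not the study's): predict a patient's rectum
# weight from the rectum/bladder overlap-volume ratio with linear regression and
# leave-one-out cross validation, mirroring the study's validation scheme.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(6)
overlap_ratio = rng.uniform(0.2, 2.5, 24)                        # illustrative geometry feature
rectum_weight = 0.15 + 0.25 * overlap_ratio + rng.normal(0, 0.05, 24)

X = overlap_ratio.reshape(-1, 1)
errors = []
for train, test in LeaveOneOut().split(X):
    model = LinearRegression().fit(X[train], rectum_weight[train])
    errors.append(abs(model.predict(X[test])[0] - rectum_weight[test][0]))

print(f"mean leave-one-out absolute error in rectum weight: {np.mean(errors):.3f}")
```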
Modeling potential habitats for alien species Dreissena polymorpha in continental USA
Mingyang, Li; Yunwei, Ju; Kumar, Sunil; Stohlgren, Thomas J.
2008-01-01
An effective measure to minimize the damage caused by invasive species is to prevent potential invaders from entering suitable areas. 1864 occurrence points with GPS coordinates and 34 environmental variables from Daymet datasets were gathered, and 4 modeling methods, i.e., Logistic Regression (LR), Classification and Regression Trees (CART), Genetic Algorithm for Rule-Set Prediction (GARP), and the maximum entropy method (Maxent), were used to generate potential geographic distributions for the invasive species Dreissena polymorpha in the continental USA. Then 3 statistical criteria, the area under the Receiver Operating Characteristic curve (AUC), Pearson correlation (COR) and the Kappa value, were calculated to evaluate the performance of the models, followed by analyses of the major contributing variables. Results showed that, in terms of the 3 statistical criteria, the predictions of the 4 ecological niche models were either excellent or outstanding, with Maxent outperforming the others in 3 respects: predicting current distribution habitats, selecting major contributing factors, and quantifying the influence of environmental variables on habitats. Distance to water, elevation, frequency of precipitation and solar radiation were the 4 major environmental forcing factors. The method suggested in this paper can serve as a reference for modeling habitats of alien species in China and provide guidance for preventing Mytilopsis sallei along the Chinese coastline.
Quintela-del-Río, Alejandro; Francisco-Fernández, Mario
2011-02-01
The study of extreme values and prediction of ozone data is an important topic of research when dealing with environmental problems. Classical extreme value theory is usually used in air-pollution studies. It consists in fitting a parametric generalised extreme value (GEV) distribution to a data set of extreme values, and using the estimated distribution to compute return levels and other quantities of interest. Here, we propose to estimate these values using nonparametric functional data methods. Functional data analysis is a relatively new statistical methodology that generally deals with data consisting of curves or multi-dimensional variables. In this paper, we use this technique, jointly with nonparametric curve estimation, to provide alternatives to the usual parametric statistical tools. The nonparametric estimators are applied to real samples of maximum ozone values obtained from several monitoring stations belonging to the Automatic Urban and Rural Network (AURN) in the UK. The results show that nonparametric estimators work satisfactorily, outperforming the behaviour of classical parametric estimators. Functional data analysis is also used to predict stratospheric ozone concentrations. We show an application, using the data set of mean monthly ozone concentrations in Arosa, Switzerland, and the results are compared with those obtained by classical time series (ARIMA) analysis. Copyright © 2010 Elsevier Ltd. All rights reserved.
Zhang, X; Patel, L A; Beckwith, O; Schneider, R; Weeden, C J; Kindt, J T
2017-11-14
Micelle cluster distributions from molecular dynamics simulations of a solvent-free coarse-grained model of sodium octyl sulfate (SOS) were analyzed using an improved method to extract equilibrium association constants from small-system simulations containing one or two micelle clusters at equilibrium with free surfactants and counterions. The statistical-thermodynamic and mathematical foundations of this partition-enabled analysis of cluster histograms (PEACH) approach are presented. A dramatic reduction in computational time for analysis was achieved through a strategy similar to the selector variable method to circumvent the need for exhaustive enumeration of the possible partitions of surfactants and counterions into clusters. Using statistics from a set of small-system (up to 60 SOS molecules) simulations as input, equilibrium association constants for micelle clusters were obtained as a function of both number of surfactants and number of associated counterions through a global fitting procedure. The resulting free energies were able to accurately predict micelle size and charge distributions in a large (560 molecule) system. The evolution of micelle size and charge with SOS concentration as predicted by the PEACH-derived free energies and by a phenomenological four-parameter model fit, along with the sensitivity of these predictions to variations in cluster definitions, are analyzed and discussed.
We have applied a statistical stream network (SSN) model to predict stream thermal metrics (summer monthly medians, growing season maximum magnitude and timing, and daily rates of change) across New England nontidal streams and rivers, excluding northern Maine watersheds that ext...
Outliers: A Potential Data Problem.
ERIC Educational Resources Information Center
Douzenis, Cordelia; Rakow, Ernest A.
Outliers, extreme data values relative to others in a sample, may distort statistics that assume interval levels of measurement and normal distribution. The outlier may be a valid value or an error. Several procedures are available for identifying outliers, and each may be applied to errors of prediction from the regression lines for utility in a…
The distribution of density in supersonic turbulence
NASA Astrophysics Data System (ADS)
Squire, Jonathan; Hopkins, Philip F.
2017-11-01
We propose a model for the statistics of the mass density in supersonic turbulence, which plays a crucial role in star formation and the physics of the interstellar medium (ISM). The model is derived by considering the density to be arranged as a collection of strong shocks of width ˜ M^{-2}, where M is the turbulent Mach number. With two physically motivated parameters, the model predicts all density statistics for M>1 turbulence: the density probability distribution and its intermittency (deviation from lognormality), the density variance-Mach number relation, power spectra and structure functions. For the proposed model parameters, reasonable agreement is seen between model predictions and numerical simulations, albeit within the large uncertainties associated with current simulation results. More generally, the model could provide a useful framework for more detailed analysis of future simulations and observational data. Due to the simple physical motivations for the model in terms of shocks, it is straightforward to generalize to more complex physical processes, which will be helpful in future more detailed applications to the ISM. We see good qualitative agreement between such extensions and recent simulations of non-isothermal turbulence.
NASA Technical Reports Server (NTRS)
Gyekenyesi, J. P.
1985-01-01
A computer program was developed for calculating the statistical fast fracture reliability and failure probability of ceramic components. The program includes the two-parameter Weibull material fracture strength distribution model, using the principle of independent action for polyaxial stress states and Batdorf's shear-sensitive as well as shear-insensitive crack theories, all for volume distributed flaws in macroscopically isotropic solids. Both penny-shaped cracks and Griffith cracks are included in the Batdorf shear-sensitive crack response calculations, using Griffith's maximum tensile stress or critical coplanar strain energy release rate criteria to predict mixed mode fracture. Weibull material parameters can also be calculated from modulus of rupture bar tests, using the least squares method with known specimen geometry and fracture data. The reliability prediction analysis uses MSC/NASTRAN stress, temperature and volume output, obtained from the use of three-dimensional, quadratic, isoparametric, or axisymmetric finite elements. The statistical fast fracture theories employed, along with selected input and output formats and options, are summarized. An example problem to demonstrate various features of the program is included.
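The simplest volume-flaw building block behind such a code is the two-parameter Weibull failure probability; the sketch below evaluates it for a hypothetical three-element stress and volume output, and does not reproduce the Batdorf shear-sensitive treatment.

```python
import numpy as np

# Two-parameter Weibull volume-flaw failure probability.  Element stresses and
# volumes would normally come from the finite-element output; the values below
# are placeholders.
def weibull_failure_probability(stress, volume, m, sigma_0):
    """P_f = 1 - exp(-sum_i V_i * (sigma_i / sigma_0)^m) over tensile elements."""
    stress = np.asarray(stress, dtype=float)
    volume = np.asarray(volume, dtype=float)
    risk = np.sum(volume * np.where(stress > 0.0, stress / sigma_0, 0.0) ** m)
    return 1.0 - np.exp(-risk)

# Hypothetical three-element component: stresses in MPa, volumes in mm^3,
# Weibull modulus m, and scale parameter sigma_0 in MPa*mm^(3/m).
pf = weibull_failure_probability([300.0, 250.0, 120.0], [10.0, 20.0, 40.0],
                                 m=10.0, sigma_0=600.0)
print(f"predicted fast-fracture failure probability: {pf:.3e}")
```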
Statistical distributions of earthquake numbers: consequence of branching process
NASA Astrophysics Data System (ADS)
Kagan, Yan Y.
2010-03-01
We discuss various statistical distributions of earthquake numbers. Previously, we derived several discrete distributions to describe earthquake numbers for the branching model of earthquake occurrence: these distributions are the Poisson, geometric, logarithmic and the negative binomial (NBD). The theoretical model is the `birth and immigration' population process. The first three distributions above can be considered special cases of the NBD. In particular, a point branching process along the magnitude (or log seismic moment) axis with independent events (immigrants) explains the magnitude/moment-frequency relation and the NBD of earthquake counts in large time/space windows, as well as the dependence of the NBD parameters on the magnitude threshold (magnitude of an earthquake catalogue completeness). We discuss applying these distributions, especially the NBD, to approximate event numbers in earthquake catalogues. There are many different representations of the NBD. Most can be traced either to the Pascal distribution or to the mixture of the Poisson distribution with the gamma law. We discuss advantages and drawbacks of both representations for statistical analysis of earthquake catalogues. We also consider applying the NBD to earthquake forecasts and describe the limits of the application for the given equations. In contrast to the one-parameter Poisson distribution so widely used to describe earthquake occurrence, the NBD has two parameters. The second parameter can be used to characterize clustering or overdispersion of a process. We determine the parameter values and their uncertainties for several local and global catalogues, and their subdivisions in various time intervals, magnitude thresholds, spatial windows, and tectonic categories. The theoretical model of how the clustering parameter depends on the corner (maximum) magnitude can be used to predict future earthquake number distribution in regions where very large earthquakes have not yet occurred.
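The gamma-mixed-Poisson representation mentioned above can be checked numerically; in the sketch below the gamma shape and scale are illustrative values, not parameters fitted to any catalogue.

```python
import numpy as np
from scipy import stats

# Negative binomial distribution as a gamma mixture of Poissons: counts per
# window are Poisson with a rate that is itself gamma distributed.
rng = np.random.default_rng(0)
tau, theta = 2.0, 5.0                      # gamma shape and scale (hypothetical)
rates = rng.gamma(tau, theta, size=100_000)
counts = rng.poisson(rates)                # NBD sample via the mixture

# Equivalent direct NBD parameterization: "successes" tau, probability p = 1/(1 + theta).
p = 1.0 / (1.0 + theta)
k = np.arange(counts.max() + 1)
print("empirical variance/mean (overdispersion):", counts.var() / counts.mean())
print("max |empirical - NBD pmf|:",
      np.max(np.abs(np.bincount(counts, minlength=k.size) / counts.size
                    - stats.nbinom.pmf(k, tau, p))))
```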
A statistical approach to nuclear fuel design and performance
NASA Astrophysics Data System (ADS)
Cunning, Travis Andrew
As CANDU fuel failures can have significant economic and operational consequences for the Canadian nuclear power industry, it is essential that factors impacting fuel performance are adequately understood. Current industrial practice relies on deterministic safety analysis and the highly conservative "limit of operating envelope" approach, where all parameters are assumed to be at their limits simultaneously. This results in a conservative prediction of event consequences with little consideration given to the high quality and precision of current manufacturing processes. This study employs a novel approach to the prediction of CANDU fuel reliability. Probability distributions are fitted to actual fuel manufacturing datasets provided by Cameco Fuel Manufacturing, Inc. They are used to form input for two industry-standard fuel performance codes: ELESTRES for the steady-state case and ELOCA for the transient case, a hypothesized 80% reactor outlet header break loss-of-coolant accident. Using a Monte Carlo technique for input generation, 10^5 independent trials are conducted and probability distributions are fitted to key model output quantities. Comparing model output against recognized industrial acceptance criteria, no fuel failures are predicted for either case. Output distributions are well removed from failure limit values, implying that margin exists in current fuel manufacturing and design. To validate the results and attempt to reduce the simulation burden of the methodology, two dimensional-reduction methods are assessed. Using just 36 trials, both methods are able to produce output distributions that agree strongly with those obtained via the brute-force Monte Carlo method, often to a relative discrepancy of less than 0.3% when predicting the first statistical moment, and a relative discrepancy of less than 5% when predicting the second statistical moment. In terms of global sensitivity, pellet density proves to have the greatest impact on fuel performance, with an average sensitivity index of 48.93% on key output quantities. Pellet grain size and dish depth are also significant contributors, at 31.53% and 13.46%, respectively. A traditional limit of operating envelope case is also evaluated. This case produces output values that exceed the maximum values observed during the 10^5 Monte Carlo trials for all output quantities of interest. In many cases the difference between the predictions of the two methods is very prominent, and the highly conservative nature of the deterministic approach is demonstrated. A reliability analysis of CANDU fuel manufacturing parametric data, specifically pertaining to the quantification of fuel performance margins, has not been conducted previously. Key Words: CANDU, nuclear fuel, Cameco, fuel manufacturing, fuel modelling, fuel performance, fuel reliability, ELESTRES, ELOCA, dimensional reduction methods, global sensitivity analysis, deterministic safety analysis, probabilistic safety analysis.
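A generic sketch of the Monte Carlo input-generation step is given below; the manufacturing-parameter distributions and the surrogate performance model are placeholders, standing in for the fitted Cameco distributions and the ELESTRES/ELOCA codes.

```python
import numpy as np

# Generic Monte Carlo propagation of manufacturing-parameter distributions.
# All distributions, the surrogate model, and the acceptance limit are invented
# for illustration.
rng = np.random.default_rng(42)
n_trials = 100_000

pellet_density = rng.normal(10.6, 0.05, n_trials)   # g/cm^3, hypothetical
grain_size     = rng.normal(12.0, 1.5,  n_trials)   # um, hypothetical
dish_depth     = rng.normal(0.30, 0.02, n_trials)   # mm, hypothetical

def surrogate_performance_model(rho, grain, dish):
    """Stand-in for a fuel performance code: returns a notional output metric."""
    return 1.0e3 * (10.97 - rho) + 5.0 * grain - 40.0 * dish

output = surrogate_performance_model(pellet_density, grain_size, dish_depth)
print("mean, std of output:", output.mean(), output.std())
print("fraction exceeding a notional acceptance limit of 500:",
      np.mean(output > 500.0))
```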
Scheid, Anika; Nebel, Markus E
2012-07-09
Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst-case - without sacrificing much of the accuracy of the results. Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms.
2012-01-01
Background Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. Results In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst-case – without sacrificing much of the accuracy of the results. Conclusions Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms. PMID:22776037
Lee II, Henry; Reusser, Deborah A.; Frazier, Melanie R; McCoy, Lee M; Clinton, Patrick J.; Clough, Jonathan S.
2014-01-01
The “Sea‐Level Affecting Marshes Model” (SLAMM) is a moderate resolution model used to predict the effects of sea level rise on marsh habitats (Craft et al. 2009). SLAMM has been used extensively on both the west coast (e.g., Glick et al., 2007) and east coast (e.g., Geselbracht et al., 2011) of the United States to evaluate potential changes in the distribution and extent of tidal marsh habitats. However, a limitation of the current version of SLAMM, (Version 6.2) is that it lacks the ability to model distribution changes in seagrass habitat resulting from sea level rise. Because of the ecological importance of SAV habitats, U.S. EPA, USGS, and USDA partnered with Warren Pinnacle Consulting to enhance the SLAMM modeling software to include new functionality in order to predict changes in Zostera marina distribution within Pacific Northwest estuaries in response to sea level rise. Specifically, the objective was to develop a SAV model that used generally available GIS data and parameters that were predictive and that could be customized for other estuaries that have GIS layers of existing SAV distribution. This report describes the procedure used to develop the SAV model for the Yaquina Bay Estuary, Oregon, appends a statistical script based on the open source R software to generate a similar SAV model for other estuaries that have data layers of existing SAV, and describes how to incorporate the model coefficients from the site‐specific SAV model into SLAMM to predict the effects of sea level rise on Zostera marina distributions. To demonstrate the applicability of the R tools, we utilize them to develop model coefficients for Willapa Bay, Washington using site‐specific SAV data.
Universal characteristics of fractal fluctuations in prime number distribution
NASA Astrophysics Data System (ADS)
Selvam, A. M.
2014-11-01
The frequency of occurrence of prime numbers at unit number spacing intervals exhibits self-similar fractal fluctuations concomitant with inverse power law form for power spectrum generic to dynamical systems in nature such as fluid flows, stock market fluctuations and population dynamics. The physics of long-range correlations exhibited by fractals is not yet identified. A recently developed general systems theory visualizes the eddy continuum underlying fractals to result from the growth of large eddies as the integrated mean of enclosed small scale eddies, thereby generating a hierarchy of eddy circulations or an inter-connected network with associated long-range correlations. The model predictions are as follows: (1) The probability distribution and power spectrum of fractals follow the same inverse power law which is a function of the golden mean. The predicted inverse power law distribution is very close to the statistical normal distribution for fluctuations within two standard deviations from the mean of the distribution. (2) Fractals signify quantum-like chaos since variance spectrum represents probability density distribution, a characteristic of quantum systems such as electron or photon. (3) Fractal fluctuations of frequency distribution of prime numbers signify spontaneous organization of underlying continuum number field into the ordered pattern of the quasiperiodic Penrose tiling pattern. The model predictions are in agreement with the probability distributions and power spectra for different sets of frequency of occurrence of prime numbers at unit number interval for successive 1000 numbers. Prime numbers in the first 10 million numbers were used for the study.
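The data preparation described above (prime counts in successive blocks of 1000 integers and the power spectrum of their fluctuations) can be reproduced with a short script; the range is reduced to the first million integers here to keep the sieve fast.

```python
import numpy as np

# Frequency of primes per 1000-integer block, followed by a power-spectrum
# estimate of the fluctuations about the mean count.
N = 1_000_000
is_prime = np.ones(N + 1, dtype=bool)
is_prime[:2] = False
for i in range(2, int(N ** 0.5) + 1):      # simple sieve of Eratosthenes
    if is_prime[i]:
        is_prime[i * i::i] = False

counts = is_prime[: (N // 1000) * 1000].reshape(-1, 1000).sum(axis=1)
fluct = counts - counts.mean()

power = np.abs(np.fft.rfft(fluct)) ** 2    # power spectrum of the fluctuations
freq = np.fft.rfftfreq(fluct.size, d=1.0)  # frequency in cycles per block

print("number of blocks:", counts.size)
print("mean primes per 1000:", counts.mean())
print("mean power at low vs high frequencies:", power[1:11].mean(), power[-10:].mean())
```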
NASA Astrophysics Data System (ADS)
ten Veldhuis, Marie-Claire; Schleiss, Marc
2017-04-01
Urban catchments are typically characterised by a more flashy nature of the hydrological response compared to natural catchments. Predicting flow changes associated with urbanisation is not straightforward, as they are influenced by interactions between impervious cover, basin size, drainage connectivity and stormwater management infrastructure. In this study, we present an alternative approach to statistical analysis of hydrological response variability and basin flashiness, based on the distribution of inter-amount times. We analyse inter-amount time distributions of high-resolution streamflow time series for 17 (semi-)urbanised basins in North Carolina, USA, ranging from 13 to 238 km2 in size. We show that in the inter-amount-time framework, sampling frequency is tuned to the local variability of the flow pattern, resulting in a different representation and weighting of high and low flow periods in the statistical distribution. This leads to important differences in the way the distribution quantiles, mean, coefficient of variation and skewness vary across scales and results in lower mean intermittency and improved scaling. Moreover, we show that inter-amount-time distributions can be used to detect regulation effects on flow patterns, identify critical sampling scales and characterise flashiness of hydrological response. The possibility to use both the classical approach and the inter-amount-time framework to identify minimum observable scales and analyse flow data opens up interesting areas for future research.
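A minimal sketch of the inter-amount-time construction, the time needed to accumulate each successive fixed flow amount, is given below with a synthetic discharge series standing in for the high-resolution records used in the study; the number of amount classes is an arbitrary choice.

```python
import numpy as np

# Inter-amount times from a discharge series: the time needed to accumulate
# each successive fixed flow amount.  The flashy synthetic "streamflow" and the
# choice of 500 amount classes are placeholders.
rng = np.random.default_rng(1)
dt_minutes = 15.0
flow = rng.gamma(0.3, 2.0, size=20_000)                       # synthetic discharge series

cum = np.concatenate([[0.0], np.cumsum(flow * dt_minutes)])   # cumulative amount
time = np.arange(cum.size) * dt_minutes

amount_increment = cum[-1] / 500.0
targets = np.arange(1, 501) * amount_increment
crossing_times = np.interp(targets, cum, time)                # time each amount is reached
inter_amount_times = np.diff(np.concatenate([[0.0], crossing_times]))

print("mean / median inter-amount time (minutes):",
      inter_amount_times.mean(), np.median(inter_amount_times))
print("coefficient of variation:", inter_amount_times.std() / inter_amount_times.mean())
```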
Genomic-Enabled Prediction of Ordinal Data with Bayesian Logistic Ordinal Regression.
Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Burgueño, Juan; Eskridge, Kent
2015-08-18
Most genomic-enabled prediction models developed so far assume that the response variable is continuous and normally distributed. The exception is the probit model, developed for ordered categorical phenotypes. In statistical applications, because of the easy implementation of the Bayesian probit ordinal regression (BPOR) model, Bayesian logistic ordinal regression (BLOR) is implemented rarely in the context of genomic-enabled prediction [sample size (n) is much smaller than the number of parameters (p)]. For this reason, in this paper we propose a BLOR model using the Pólya-Gamma data augmentation approach that produces a Gibbs sampler with similar full conditional distributions of the BPOR model and with the advantage that the BPOR model is a particular case of the BLOR model. We evaluated the proposed model by using simulation and two real data sets. Results indicate that our BLOR model is a good alternative for analyzing ordinal data in the context of genomic-enabled prediction with the probit or logit link. Copyright © 2015 Montesinos-López et al.
Smeers, Inge; Decorte, Ronny; Van de Voorde, Wim; Bekaert, Bram
2018-05-01
DNA methylation is a promising biomarker for forensic age prediction. A challenge that has emerged in recent studies is the fact that prediction errors become larger with increasing age due to interindividual differences in epigenetic ageing rates. This phenomenon of non-constant variance or heteroscedasticity violates an assumption of the often used method of ordinary least squares (OLS) regression. The aim of this study was to evaluate alternative statistical methods that do take heteroscedasticity into account in order to provide more accurate, age-dependent prediction intervals. A weighted least squares (WLS) regression is proposed as well as a quantile regression model. Their performances were compared against an OLS regression model based on the same dataset. Both models provided age-dependent prediction intervals which account for the increasing variance with age, but WLS regression performed better in terms of success rate in the current dataset. However, quantile regression might be a preferred method when dealing with a variance that is not only non-constant, but also not normally distributed. Ultimately the choice of which model to use should depend on the observed characteristics of the data. Copyright © 2018 Elsevier B.V. All rights reserved.
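As an illustration of the weighted least squares alternative, the sketch below fits a closed-form WLS regression with an assumed age-dependent variance model on synthetic methylation data; the weight function and all values are invented, whereas the study estimated its weights from its own training data.

```python
import numpy as np

# Weighted least squares with an age-dependent (heteroscedastic) variance model.
# Methylation values, the noise model, and the weights are synthetic assumptions.
rng = np.random.default_rng(7)
age = rng.uniform(18, 80, 300)
meth = 0.2 + 0.006 * age + rng.normal(0, 0.01 + 0.0008 * age)   # noise grows with age

X = np.column_stack([np.ones_like(meth), meth])                 # predict age from methylation
w = 1.0 / (1.0 + 0.05 * age) ** 2                               # assumed inverse-variance weights

# Closed-form WLS: beta = (X^T W X)^{-1} X^T W y
W = np.diag(w)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ age)
residuals = age - X @ beta
print("WLS coefficients (intercept, slope):", beta)
print("weighted RMSE:", np.sqrt(np.average(residuals ** 2, weights=w)))
```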
Gaussian covariance graph models accounting for correlated marker effects in genome-wide prediction.
Martínez, C A; Khare, K; Rahman, S; Elzo, M A
2017-10-01
Several statistical models used in genome-wide prediction assume uncorrelated marker allele substitution effects, but it is known that these effects may be correlated. In statistics, graphical models have been identified as a useful tool for covariance estimation in high-dimensional problems, an area that has recently expanded greatly. In Gaussian covariance graph models (GCovGM), the joint distribution of a set of random variables is assumed to be Gaussian and the pattern of zeros of the covariance matrix is encoded in terms of an undirected graph G. In this study, methods adapting the theory of GCovGM to genome-wide prediction were developed (Bayes GCov, Bayes GCov-KR and Bayes GCov-H). In simulated data sets, improvements in correlation between phenotypes and predicted breeding values and in accuracies of predicted breeding values were found. Our models account for correlation of marker effects and make it possible to accommodate general covariance structures, as opposed to models proposed in previous studies, which consider spatial correlation only. In addition, they allow incorporation of biological information in the prediction process through its use when constructing graph G, and their extension to the multi-allelic loci case is straightforward. © 2017 Blackwell Verlag GmbH.
Koelsch, Stefan; Busch, Tobias; Jentschke, Sebastian; Rohrmeier, Martin
2016-02-02
Within the framework of statistical learning, many behavioural studies investigated the processing of unpredicted events. However, surprisingly few neurophysiological studies are available on this topic, and no statistical learning experiment has investigated electroencephalographic (EEG) correlates of processing events with different transition probabilities. We carried out an EEG study with a novel variant of the established statistical learning paradigm. Timbres were presented in isochronous sequences of triplets. The first two sounds of all triplets were equiprobable, while the third sound occurred with either low (10%), intermediate (30%), or high (60%) probability. Thus, the occurrence probability of the third item of each triplet (given the first two items) was varied. Compared to high-probability triplet endings, endings with low and intermediate probability elicited an early anterior negativity that had an onset around 100 ms and was maximal at around 180 ms. This effect was larger for events with low than for events with intermediate probability. Our results reveal that, when predictions are based on statistical learning, events that do not match a prediction evoke an early anterior negativity, with the amplitude of this mismatch response being inversely related to the probability of such events. Thus, we report a statistical mismatch negativity (sMMN) that reflects statistical learning of transitional probability distributions that go beyond auditory sensory memory capabilities.
NASA Technical Reports Server (NTRS)
Goldhirsh, J.
1978-01-01
Yearly, monthly, and time-of-day fade statistics are presented and characterized. A 19.04 GHz yearly fade distribution, corresponding to a second COMSTAR beacon frequency, is predicted using the concept of effective path length, disdrometer, and rain rate results. The yearly attenuation and rain rate distributions follow, to good approximation, lognormal variations for most fade and rain rate levels. Attenuations were exceeded for the longest and shortest periods of time for all fades in August and February, respectively. The eight-hour time periods showing the maximum and minimum number of minutes over the year for which fades exceeded 12 dB were approximately 1600 to 2400 and 0400 to 1200 hours, respectively. In employing the predictive method for obtaining the 19.04 GHz fade distribution, it is demonstrated theoretically that the ratio of attenuations at two frequencies depends only minimally on the raindrop size distribution, provided these frequencies are not widely separated.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aaboud, M.; Aad, G.; Abbott, B.
We present single lepton and dilepton kinematic distributions measured in dileptonic $t\bar{t}$ events produced in 20.2 fb^-1 of √s = 8 TeV pp collisions recorded by the ATLAS experiment at the LHC. Both absolute and normalised differential cross-sections are measured, using events with an opposite-charge eμ pair and one or two b-tagged jets. Furthermore, the cross-sections are measured in a fiducial region corresponding to the detector acceptance for leptons, and are compared to the predictions from a variety of Monte Carlo event generators, as well as fixed-order QCD calculations, exploring the sensitivity of the cross-sections to the gluon parton distribution function. Some of the distributions are also sensitive to the top quark pole mass; a combined fit of NLO fixed-order predictions to all the measured distributions yields a top quark mass value of m_t^pole = 173.2 ± 0.9 ± 0.8 ± 1.2 GeV, where the three uncertainties arise from data statistics, experimental systematics, and theoretical sources.
Kappa Distribution in a Homogeneous Medium: Adiabatic Limit of a Super-diffusive Process?
NASA Astrophysics Data System (ADS)
Roth, I.
2015-12-01
Classical statistical theory predicts that an ergodic, weakly interacting system such as charged particles in the presence of electromagnetic fields, performing Brownian motions (characterized by small-range deviations in phase space and short-term microscopic memory), converges to Gibbs-Boltzmann statistics. Observation of distributions with kappa power-law tails in homogeneous systems contradicts this prediction and necessitates a renewed analysis of the basic axioms of the diffusion process: the characteristics of the transition probability density function (pdf) for a single interaction, with the possibility of a non-Markovian process and non-local interactions. The non-local, Levy-walk deviation is related to the non-extensive statistical framework. Particles bouncing along the (solar) magnetic field with evolving pitch angles, phases and velocities, as they interact resonantly with waves, undergo energy changes at undetermined time intervals, satisfying these postulates. The dynamic evolution of a general continuous-time random walk is determined by the pdfs of jumps and waiting times, resulting in a fractional Fokker-Planck equation with non-integer derivatives whose solution is given by a Fox H-function. The resulting procedure involves fractional calculus, well established although not frequently used in physics, while the local, Markovian process recasts the evolution into the standard Fokker-Planck equation. Solution of the fractional Fokker-Planck equation with the help of the Mellin transform and evaluation of its residues at the poles of its Gamma functions results in a slowly converging sum with power laws. It is suggested that these tails form the kappa function. Gradual vs impulsive solar electron distributions serve as prototypes of this description.
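For orientation, the sketch below compares a kappa (generalized Lorentzian) distribution with the Maxwell-Boltzmann form it approaches as kappa grows; the functional form used is the common textbook parameterization and the parameter values are illustrative, not taken from the paper.

```python
import numpy as np

# Kappa distribution vs the Maxwell-Boltzmann limit; theta is a thermal speed
# scale and the parameter values are illustrative.
def kappa_pdf(v, theta, kappa):
    """Unnormalized kappa distribution f(v) ~ (1 + v^2/(kappa*theta^2))^-(kappa+1)."""
    return (1.0 + v ** 2 / (kappa * theta ** 2)) ** (-(kappa + 1.0))

v = np.linspace(-10.0, 10.0, 2001)
dv = v[1] - v[0]
theta = 1.0
i5 = int(np.searchsorted(v, 5.0))          # index of v = 5 theta

maxwell = np.exp(-v ** 2 / theta ** 2)
maxwell /= maxwell.sum() * dv
for kappa in (2.0, 5.0, 50.0):
    f = kappa_pdf(v, theta, kappa)
    f /= f.sum() * dv
    # power-law tails: the kappa PDF exceeds the Maxwellian far out in the tail
    print(f"kappa = {kappa:5.1f}: f(5*theta) / maxwell(5*theta) = {f[i5] / maxwell[i5]:.2e}")
```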
Modeling of fiber orientation in viscous fluid flow with application to self-compacting concrete
NASA Astrophysics Data System (ADS)
Kolařík, Filip; Patzák, Bořek
2013-10-01
In recent years, unconventional concrete reinforcement has grown in popularity. Fiber reinforcement in particular is widely used in high-performance concretes such as "Self-Compacting Concrete" (SCC). The design of advanced tailor-made structures made of SCC can take advantage of anisotropic fiber orientation. Tools for predicting fiber orientation can contribute to the design of tailor-made structures and help develop casting procedures that achieve the desired fiber distribution and orientation. This paper deals with the development and implementation of a suitable tool for predicting fiber orientation in a fluid from knowledge of the velocity field. A statistical approach is employed: fiber orientation is described by a probability distribution of the fiber angle.
Geologic characteristics of benthic habitats in Glacier Bay, southeast Alaska
Harney, Jodi N.; Cochrane, Guy R.; Etherington, Lisa L.; Dartnell, Pete; Golden, Nadine E.; Chezar, Hank
2006-01-01
In April 2004, more than 40 hours of georeferenced submarine digital video were collected in water depths of 15-370 m in Glacier Bay to (1) ground-truth existing geophysical data (bathymetry and acoustic reflectance), (2) examine and record geologic characteristics of the sea floor, (3) investigate the relation between substrate types and benthic communities, and (4) construct predictive maps of seafloor geomorphology and habitat distribution. Common substrates observed include rock, boulders, cobbles, rippled sand, bioturbated mud, and extensive beds of living horse mussels and scallops. Four principal sea-floor geomorphic types are distinguished using video observations. Their distribution in lower and central Glacier Bay is predicted using a supervised, hierarchical decision-tree statistical classification of geophysical data.
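A supervised decision-tree classification of substrate type from geophysical layers, in the spirit of the approach described above, can be sketched as follows; the feature layers, class labels, and the rule generating them are synthetic stand-ins for the video ground-truth data.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Decision-tree classification of sea-floor type from synthetic geophysical
# layers (depth, acoustic reflectance, rugosity).  Features, labels, and the
# labelling rule are invented placeholders for the video ground truth.
rng = np.random.default_rng(3)
n = 2000
depth = rng.uniform(15, 370, n)
reflectance = rng.normal(0, 1, n)
rugosity = rng.gamma(2.0, 0.5, n)
X = np.column_stack([depth, reflectance, rugosity])

# Toy rule generating "observed" substrate classes: 0 = mud, 1 = sand, 2 = rock
y = np.where(rugosity > 2.0, 2, np.where(reflectance > 0.3, 1, 0))

clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
print("training accuracy:", clf.score(X, y))
print("feature importances (depth, reflectance, rugosity):", clf.feature_importances_)
```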
Predicting spiral wave patterns from cell properties in a model of biological self-organization.
Geberth, Daniel; Hütt, Marc-Thorsten
2008-09-01
In many biological systems, biological variability (i.e., systematic differences between the system components) can be expected to outrank statistical fluctuations in the shaping of self-organized patterns. In principle, the distribution of single-element properties should thus allow predicting features of such patterns. For a mathematical model of a paradigmatic and well-studied pattern formation process, spiral waves of cAMP signaling in colonies of the slime mold Dictyostelium discoideum, we explore this possibility and observe a pronounced anticorrelation between spiral waves and cell properties (namely, the firing rate) and particularly a clustering of spiral wave tips in regions devoid of spontaneously firing (pacemaker) cells. Furthermore, we observe local inhomogeneities in the distribution of spiral chiralities, again induced by the pacemaker distribution. We show that these findings can be explained by a simple geometrical model of spiral wave generation.
Statistical mechanics in the context of special relativity. II.
Kaniadakis, G
2005-09-01
The special relativity laws emerge as one-parameter (light speed) generalizations of the corresponding laws of classical physics. These generalizations, imposed by the Lorentz transformations, affect both the definition of the various physical observables (e.g., momentum, energy, etc.), as well as the mathematical apparatus of the theory. Here, following the general lines of [Phys. Rev. E 66, 056125 (2002)], we show that the Lorentz transformations impose also a proper one-parameter generalization of the classical Boltzmann-Gibbs-Shannon entropy. The obtained relativistic entropy permits us to construct a coherent and self-consistent relativistic statistical theory, preserving the main features of the ordinary statistical theory, which is recovered in the classical limit. The predicted distribution function is a one-parameter continuous deformation of the classical Maxwell-Boltzmann distribution and has a simple analytic form, showing power law tails in accordance with the experimental evidence. Furthermore, this statistical mechanics can be obtained as the stationary case of a generalized kinetic theory governed by an evolution equation obeying the H theorem and reproducing the Boltzmann equation of the ordinary kinetics in the classical limit.
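The one-parameter deformation at the heart of this framework is commonly written as the kappa-deformed exponential exp_k(x) = (sqrt(1 + k^2 x^2) + k x)^(1/k), which recovers the ordinary exponential as k -> 0; the sketch below evaluates the corresponding deformed Boltzmann factor for a few illustrative energies and deformation parameters (the values are assumptions for illustration).

```python
import numpy as np

# Kappa-deformed exponential: reduces to exp(x) as k -> 0 and gives power-law
# tails for k > 0.  The "energies" below are illustrative.
def kappa_exp(x, k):
    if k == 0.0:
        return np.exp(x)
    return (np.sqrt(1.0 + (k * x) ** 2) + k * x) ** (1.0 / k)

E = np.linspace(0.0, 20.0, 6)               # energy in units of kT
for k in (0.0, 0.3, 0.6):
    boltzmann_like = kappa_exp(-E, k)       # deformed Boltzmann factor exp_k(-E/kT)
    print(f"k = {k:.1f}:", np.array2string(boltzmann_like, precision=4))
```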
Naro, Daniel; Rummel, Christian; Schindler, Kaspar; Andrzejak, Ralph G
2014-09-01
The rank-based nonlinear predictability score was recently introduced as a test for determinism in point processes. We here adapt this measure to time series sampled from time-continuous flows. We use noisy Lorenz signals to compare this approach against a classical amplitude-based nonlinear prediction error. Both measures show an almost identical robustness against Gaussian white noise. In contrast, when the amplitude distribution of the noise has a narrower central peak and heavier tails than the normal distribution, the rank-based nonlinear predictability score outperforms the amplitude-based nonlinear prediction error. For this type of noise, the nonlinear predictability score has a higher sensitivity for deterministic structure in noisy signals. It also yields a higher statistical power in a surrogate test of the null hypothesis of linear stochastic correlated signals. We show the high relevance of this improved performance in an application to electroencephalographic (EEG) recordings from epilepsy patients. Here the nonlinear predictability score again appears of higher sensitivity to nonrandomness. Importantly, it yields an improved contrast between signals recorded from brain areas where the first ictal EEG signal changes were detected (focal EEG signals) versus signals recorded from brain areas that were not involved at seizure onset (nonfocal EEG signals).
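A minimal version of the amplitude-based nonlinear prediction error used as the comparison measure can be sketched as follows: delay-embed the signal, predict each point one step ahead from its nearest neighbour in embedding space, and report the normalized RMS error. The embedding parameters, the Theiler exclusion window, and the test signals are illustrative choices, and the rank-based predictability score itself is not reproduced here.

```python
import numpy as np

# Amplitude-based nonlinear prediction error via delay embedding and
# nearest-neighbour prediction.  Parameters and test signals are illustrative.
def nonlinear_prediction_error(x, dim=3, delay=1, horizon=1, exclude=10):
    n = len(x) - (dim - 1) * delay - horizon
    emb = np.column_stack([x[i * delay: i * delay + n] for i in range(dim)])
    future = x[(dim - 1) * delay + horizon:(dim - 1) * delay + horizon + n]
    errors = []
    for i in range(n):
        d = np.linalg.norm(emb - emb[i], axis=1)
        d[max(0, i - exclude): i + exclude + 1] = np.inf   # Theiler exclusion window
        j = int(np.argmin(d))
        errors.append(future[i] - future[j])
    return np.sqrt(np.mean(np.square(errors))) / np.std(x)

rng = np.random.default_rng(5)
t = np.arange(0, 200, 0.1)
deterministic = np.sin(t) + 0.5 * np.sin(2.3 * t)       # proxy for a deterministic flow signal
noise = rng.standard_normal(t.size)
print("prediction error, deterministic signal:", nonlinear_prediction_error(deterministic))
print("prediction error, white noise:         ", nonlinear_prediction_error(noise))
```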
NASA Astrophysics Data System (ADS)
Choi, W.; Ho, C. H.
2015-12-01
Intense tropical cyclones (TCs), accompanied by heavy rainfall and destructive wind gusts, sometimes cause severe socio-economic damage in the regions near their landfall. This study aims to analyze intense TC activity in the North Atlantic (NA) and western North Pacific (WNP) basins and to develop a seasonal prediction model for their track propensity. Because the number of TCs in the NA basin is much smaller than that in the WNP basin, different intensity criteria are used: category 1 and above for the NA and category 3 and above for the WNP, based on the Saffir-Simpson hurricane wind scale. Using a fuzzy clustering method, intense TC tracks in the NA and WNP basins are classified into two and three representative patterns, respectively. Each pattern shows empirical relationships with climate variability such as the sea surface temperature distribution associated with El Niño/La Niña or the Atlantic Meridional Mode, the Pacific decadal oscillation, upper- and lower-level zonal winds, and the strength of the subtropical high. A hybrid statistical-dynamical method is used to develop the seasonal prediction model for each pattern based on statistical relationships between intense TC activity and seasonally averaged key predictors. The model performance is statistically assessed by cross validation for the training period (1982-2013), and the model has been applied to the 2014 and 2015 predictions. This study suggests that the model is applicable to operational prediction and provides a starting point for intense TC prediction.
Zounemat-Kermani, Mohammad; Ramezani-Charmahineh, Abdollah; Adamowski, Jan; Kisi, Ozgur
2018-06-13
Chlorination, the basic treatment utilized for drinking water sources, is widely used for water disinfection and pathogen elimination in water distribution networks. Proper prediction of chlorine consumption is therefore of great importance for water distribution network performance. In this respect, data mining techniques, which can discover the relationship between dependent and independent variables, can be considered as alternatives to conventional methods (e.g., numerical methods). This study examines the applicability of three key methods, based on the data mining approach, for predicting chlorine levels in four water distribution networks. ANNs (artificial neural networks, including the multi-layer perceptron neural network, MLPNN, and the radial basis function neural network, RBFNN), SVM (support vector machine), and CART (classification and regression tree) methods were used to estimate the concentration of residual chlorine in distribution networks for three villages in Kerman Province, Iran. Produced water (flow), chlorine consumption, and residual chlorine were collected daily for 3 years. An assessment of the studied models using several statistical criteria (NSC, RMSE, R2, and SEP) indicated that, in general, MLPNN has the greatest capability for predicting chlorine levels, followed by CART, SVM, and RBFNN. The weaker performance of the data-driven methods in some of the water distribution networks could be attributed to improper chlorination management rather than to the methods' capability.
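The goodness-of-fit criteria quoted above can be computed as in the sketch below; RMSE and R^2 follow their standard definitions, while NSC is taken here to be the Nash-Sutcliffe coefficient and SEP the standard error of prediction expressed as a percentage of the observed mean, both of which are assumptions about the abbreviations. The observed and predicted chlorine values are illustrative.

```python
import numpy as np

# Evaluation metrics for observed vs predicted residual-chlorine values.
# NSC and SEP definitions are assumptions; data are illustrative.
def evaluation_metrics(obs, pred):
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    resid = obs - pred
    rmse = np.sqrt(np.mean(resid ** 2))
    nsc = 1.0 - np.sum(resid ** 2) / np.sum((obs - obs.mean()) ** 2)
    r2 = np.corrcoef(obs, pred)[0, 1] ** 2
    sep = 100.0 * rmse / obs.mean()
    return {"RMSE": rmse, "NSC": nsc, "R2": r2, "SEP_percent": sep}

obs = np.array([0.52, 0.61, 0.48, 0.70, 0.55, 0.66, 0.43, 0.58])   # mg/L, illustrative
pred = np.array([0.50, 0.65, 0.45, 0.68, 0.60, 0.62, 0.47, 0.55])
print(evaluation_metrics(obs, pred))
```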
Teaching the principles of statistical dynamics
Ghosh, Kingshuk; Dill, Ken A.; Inamdar, Mandar M.; Seitaridou, Effrosyni; Phillips, Rob
2012-01-01
We describe a simple framework for teaching the principles that underlie the dynamical laws of transport: Fick’s law of diffusion, Fourier’s law of heat flow, the Newtonian viscosity law, and the mass-action laws of chemical kinetics. In analogy with the way that the maximization of entropy over microstates leads to the Boltzmann distribution and predictions about equilibria, maximizing a quantity that E. T. Jaynes called “caliber” over all the possible microtrajectories leads to these dynamical laws. The principle of maximum caliber also leads to dynamical distribution functions that characterize the relative probabilities of different microtrajectories. A great source of recent interest in statistical dynamics has resulted from a new generation of single-particle and single-molecule experiments that make it possible to observe dynamics one trajectory at a time. PMID:23585693
Teaching the principles of statistical dynamics.
Ghosh, Kingshuk; Dill, Ken A; Inamdar, Mandar M; Seitaridou, Effrosyni; Phillips, Rob
2006-02-01
We describe a simple framework for teaching the principles that underlie the dynamical laws of transport: Fick's law of diffusion, Fourier's law of heat flow, the Newtonian viscosity law, and the mass-action laws of chemical kinetics. In analogy with the way that the maximization of entropy over microstates leads to the Boltzmann distribution and predictions about equilibria, maximizing a quantity that E. T. Jaynes called "caliber" over all the possible microtrajectories leads to these dynamical laws. The principle of maximum caliber also leads to dynamical distribution functions that characterize the relative probabilities of different microtrajectories. A great source of recent interest in statistical dynamics has resulted from a new generation of single-particle and single-molecule experiments that make it possible to observe dynamics one trajectory at a time.
Statistical properties of Charney-Hasegawa-Mima zonal flows
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, Johan, E-mail: anderson.johan@gmail.com; Botha, G. J. J.
2015-05-15
A theoretical interpretation of numerically generated probability density functions (PDFs) of intermittent plasma transport events in unforced zonal flows is provided within the Charney-Hasegawa-Mima (CHM) model. The governing equation is solved numerically with various prescribed density gradients that are designed to produce different configurations of parallel and anti-parallel streams. Long-lasting vortices form whose flow is governed by the zonal streams. It is found that the numerically generated PDFs can be matched with analytical predictions of PDFs based on the instanton method by removing the autocorrelations from the time series. In many instances, the statistics generated by the CHM dynamics relaxes to Gaussian distributions for both the electrostatic and vorticity perturbations, whereas in areas with strong nonlinear interactions it is found that the PDFs are exponentially distributed.
Hydrological responses to dynamically and statistically downscaled climate model output
Wilby, R.L.; Hay, L.E.; Gutowski, W.J.; Arritt, R.W.; Takle, E.S.; Pan, Z.; Leavesley, G.H.; Clark, M.P.
2000-01-01
Daily rainfall and surface temperature series were simulated for the Animas River basin, Colorado using dynamically and statistically downscaled output from the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) re-analysis. A distributed hydrological model was then applied to the downscaled data. Relative to raw NCEP output, downscaled climate variables provided more realistic simulations of basin-scale hydrology. However, the results highlight the sensitivity of modeled processes to the choice of downscaling technique, and point to the need for caution when interpreting future hydrological scenarios.
Collins, Á B; Grant, J; Barrett, D; Doherty, M L; Hallinan, A; Mee, J F
2017-08-01
Schmallenberg virus (SBV) is transmitted by Culicoides spp. biting midges and can cause abortions and congenital malformations in ruminants and milk drop in dairy cattle. Estimating true within-herd seroprevalence is an essential component of efficient and cost-effective SBV surveillance programs. The objectives of this study were: (1) determine the correlation between bulk-tank milk (BTM)-ELISA results and within-herd seroprevalence, (2) evaluate the ability of BTM-ELISA results to predict within-herd seroprevalence, and (3) explore the distributions of individual animal serology results using novel statistical methodology. BTM samples (n=24) and blood samples (n=4019) collected from all lactating cows contributing to the BTM in 26 Irish dairy herds (58-444 cows/herd) in 2014, located in a region exposed to SBV in 2012/2013, were analysed for SBV-specific antibodies using IDVet® ELISA kits. The correlation between BTM-ELISA results and within-herd seroprevalence was determined by calculating Pearson's correlation coefficient. Linear regression models were used to assess the ability of BTM-ELISA results to predict within-herd seroprevalence. The distributions of individual animal serology results were explored by determining the empirical distribution functions (EDF) of the individual animal serum ELISA results in each herd. EDFs were compared pairwise across herds, using the Kolmogorov-Smirnov statistical test. Herds with similar BTM-ELISA results, herds with similar within-herd seroprevalence and herds with similar mean-herd serology ELISA results were stratified in order to explore their respective paired-herd EDF comparisons. Statistical significance was set at p<0.05. Twenty-two herds were BTM-ELISA-positive (within-herd seroprevalence 30.6-100%) and two herds were BTM-ELISA-negative (within-herd seroprevalence 10.7 and 16.2%), indicating BTM-ELISA-negative herds can have seropositive animals present. BTM-ELISA results were highly correlated (r=0.807, p<0.0001) with, and predictive (R2=0.832, p<0.0001) of, within-herd seroprevalence. Predictions were most accurate for upper-range BTM-ELISA antibody titres, while they were less accurate at higher and lower antibody titres. This is likely a result of the overall high within-herd seroprevalence. In herds with similar BTM-ELISA results, 82% of the paired-herd EDF comparisons were significantly different. In herds with similar within-herd seroprevalence and in herds with similar mean-herd serology ELISA results, 46% and 47% of the paired-herd EDF comparisons were significantly different, respectively. These results demonstrate that BTM antibody titres are highly predictive of within-herd seroprevalence in an SBV exposed region. Furthermore, exploring the serum EDFs revealed that the variation observed in the predicted within-herd seroprevalence in the regression models is likely a result of individual animal variation in serum antibody titres in these herds. Copyright © 2017 Elsevier B.V. All rights reserved.
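The pairwise comparison of per-herd empirical distribution functions can be sketched with the two-sample Kolmogorov-Smirnov test as below; the per-herd serology values are synthetic placeholders rather than the study's data.

```python
import numpy as np
from scipy.stats import ks_2samp
from itertools import combinations

# Pairwise Kolmogorov-Smirnov comparison of per-herd EDFs of individual-animal
# serology results.  Herd names and values are synthetic placeholders.
rng = np.random.default_rng(11)
herds = {f"herd_{i}": rng.beta(a, b, size=n) * 100          # S/P-like percentages
         for i, (a, b, n) in enumerate([(5, 2, 120), (4, 3, 80), (2, 5, 200)])}

alpha = 0.05
for (name_a, a_vals), (name_b, b_vals) in combinations(herds.items(), 2):
    stat, p = ks_2samp(a_vals, b_vals)
    verdict = "different" if p < alpha else "not distinguishable"
    print(f"{name_a} vs {name_b}: D = {stat:.3f}, p = {p:.3g} -> EDFs {verdict}")
```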
NASA Astrophysics Data System (ADS)
Singer, Anja; Millat, Gerald; Staneva, Joanna; Kröncke, Ingrid
2017-03-01
Small-scale spatial distribution patterns of seven macrofauna species, seagrass beds and mixed mussel/oyster reefs were modelled for the Jade Bay (North Sea, Germany) in response to climatic and environmental scenarios (representing 2050). For the species distribution models four presence-absence modelling methods were merged within the ensemble forecasting platform 'biomod2'. The present spatial distribution (representing 2009) was modelled by statistically related species presences, true species absences and six high-resolution environmental grids. The future spatial distribution was then predicted in response to expected climate change-induced ongoing (1) sea-level rise and (2) water temperature increase. Between 2009 and 2050, the present and future prediction maps revealed a significant range gain for two macrofauna species (Macoma balthica, Tubificoides benedii), whereas the species' range sizes of five macrofauna species remained relatively stable across space and time. The predicted probability of occurrence (PO) of two macrofauna species (Cerastoderma edule, Scoloplos armiger) decreased significantly under the potential future habitat conditions. In addition, a clear seagrass bed extension (Zostera noltii) on the lower intertidal flats (mixed sediments) and a decrease in the PO of mixed Mytilus edulis/Crassostrea gigas reefs was predicted for 2050. Until the mid-21st century, our future climatic and environmental scenario revealed significant changes in the range sizes (gains-losses) and/or the PO (increases-decreases) for seven of the 10 modelled species at the study site.
Analytical performance evaluation of SAR ATR with inaccurate or estimated models
NASA Astrophysics Data System (ADS)
DeVore, Michael D.
2004-09-01
Hypothesis testing algorithms for automatic target recognition (ATR) are often formulated in terms of some assumed distribution family. The parameter values corresponding to a particular target class together with the distribution family constitute a model for the target's signature. In practice such models exhibit inaccuracy because of incorrect assumptions about the distribution family and/or because of errors in the assumed parameter values, which are often determined experimentally. Model inaccuracy can have a significant impact on performance predictions for target recognition systems. Such inaccuracy often causes model-based predictions that ignore the difference between assumed and actual distributions to be overly optimistic. This paper reports on research to quantify the effect of inaccurate models on performance prediction and to estimate the effect using only trained parameters. We demonstrate that for large observation vectors the class-conditional probabilities of error can be expressed as a simple function of the difference between two relative entropies. These relative entropies quantify the discrepancies between the actual and assumed distributions and can be used to express the difference between actual and predicted error rates. Focusing on the problem of ATR from synthetic aperture radar (SAR) imagery, we present estimators of the probabilities of error in both ideal and plug-in tests expressed in terms of the trained model parameters. These estimators are defined in terms of unbiased estimates for the first two moments of the sample statistic. We present an analytical treatment of these results and include demonstrations from simulated radar data.
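A univariate Gaussian toy version of this relative-entropy bookkeeping is sketched below: the actual class distribution differs from the assumed (trained) models, and the difference of the two relative entropies indicates which assumed model the data favour on average. The distribution parameters are illustrative and the closed-form KL expression is the standard one for normals, not a formula quoted from the paper.

```python
import numpy as np

# Relative entropies between the actual class distribution and two assumed
# (trained) models, for a univariate Gaussian example with illustrative values.
def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL divergence D(p || q) between two univariate normal distributions."""
    return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)

# Actual distribution of class-A observations vs the (slightly wrong) assumed models.
mu_actual, var_actual = 1.0, 1.2
mu_assumed_A, var_assumed_A = 0.8, 1.0     # trained model for the true class
mu_assumed_B, var_assumed_B = 3.0, 1.0     # trained model for the competing class

d_own = gaussian_kl(mu_actual, var_actual, mu_assumed_A, var_assumed_A)
d_other = gaussian_kl(mu_actual, var_actual, mu_assumed_B, var_assumed_B)
print("D(actual || assumed own)   =", round(d_own, 4))
print("D(actual || assumed other) =", round(d_other, 4))
print("difference (positive favours correct decisions on average):",
      round(d_other - d_own, 4))
```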
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hudson, W.G.
Scapteriscus vicinus is the most important pest of turf and pasture grasses in Florida. This study develops a method of correlating sample results with true population density and provides the first quantitative information on spatial distribution and movement patterns of mole crickets. Three basic techniques for sampling mole crickets were compared: soil flushes, soil corer, and pitfall trapping. No statistical difference was found between the soil corer and soil flushing. Soil flushing was shown to be more sensitive to changes in population density than pitfall trapping. No technique was effective for sampling adults. Regression analysis provided a means of adjusting for the effects of soil moisture and showed soil temperature to be unimportant in predicting efficiency of flush sampling. Cesium-137 was used to label females for subsequent location underground. Comparison of mean distance to nearest neighbor with the distance predicted by a random distribution model showed that the observed distance in the spring was significantly greater than hypothesized (Student's T-test, p < 0.05). Fall adult nearest neighbor distance was not different than predicted by the random distribution hypothesis.
NASA Astrophysics Data System (ADS)
Durand, Marc; Kraynik, Andrew M.; van Swol, Frank; Käfer, Jos; Quilliet, Catherine; Cox, Simon; Ataei Talebi, Shirin; Graner, François
2014-06-01
Bubble monolayers are model systems for experiments and simulations of two-dimensional packing problems of deformable objects. We explore the relation between the distributions of the number of bubble sides (topology) and the bubble areas (geometry) in the low liquid fraction limit. We use a statistical model [M. Durand, Europhys. Lett. 90, 60002 (2010), 10.1209/0295-5075/90/60002] which takes into account Plateau laws. We predict the correlation between geometrical disorder (bubble size dispersity) and topological disorder (width of bubble side number distribution) over an extended range of bubble size dispersities. Extensive data sets arising from shuffled foam experiments, surface evolver simulations, and cellular Potts model simulations all collapse surprisingly well and coincide with the model predictions, even at extremely high size dispersity. At moderate size dispersity, we recover our earlier approximate predictions [M. Durand, J. Kafer, C. Quilliet, S. Cox, S. A. Talebi, and F. Graner, Phys. Rev. Lett. 107, 168304 (2011), 10.1103/PhysRevLett.107.168304]. At extremely low dispersity, when approaching the perfectly regular honeycomb pattern, we study how both geometrical and topological disorders vanish. We identify a crystallization mechanism and explore it quantitatively in the case of bidisperse foams. Due to the deformability of the bubbles, foams can crystallize over a larger range of size dispersities than hard disks. The model predicts that the crystallization transition occurs when the ratio of largest to smallest bubble radii is 1.4.
Parra-Henao, Gabriel; Quirós-Gómez, Oscar; Jaramillo-O, Nicolas; Cardona, Ángela Segura
2016-04-01
Triatoma dimidiata (Hemiptera: Reduviidae) is a secondary vector of Trypanosoma cruzi in Colombia and represents an important epidemiological risk mainly in the central and oriental regions of the country where it occupies sylvatic, peridomestic, and intradomestic ecotopes, and because of this complex distribution, its distribution and abundance could be conditioned by environmental factors. In this work, we explored the relationship between T. dimidiata distribution and environmental factors in the northwest, northeast, and central zones of Colombia and developed predictive models of infestation in the country. The associations between the presence of T. dimidiata and environmental variables were studied using logistic regression models and ecological niche modeling for a sample of villages in Colombia. The analysis was based on the information collected in field about the presence of T. dimidiata and the environmental data for each village extracted from remote sensing images. The presence of Triatoma dimidiata (Latreille, 1811) was found to be significantly associated with the maximum vegetation index, minimum land surface temperature (LST), and the digital elevation for the statistical model. Temperature seasonality, annual precipitation, and vegetation index were the variables that most influenced the ecological niche model of T. dimidiata distribution. The logistic regression model showed a good fit and predicted suitable habitats in the Andean and Caribbean regions, which agrees with the known distribution of the species, but predicted suitable habitats in the Pacific and Orinoco regions proposing new areas of research. Improved models to predict suitable habitats for T. dimidiata hold promise for spatial targeting of integrated vector management. © The American Society of Tropical Medicine and Hygiene.
Parra-Henao, Gabriel; Quirós-Gómez, Oscar; Jaramillo-O, Nicolas; Cardona, Ángela Segura
2016-01-01
Triatoma dimidiata (Hemiptera: Reduviidae) is a secondary vector of Trypanosoma cruzi in Colombia and represents an important epidemiological risk mainly in the central and oriental regions of the country where it occupies sylvatic, peridomestic, and intradomestic ecotopes, and because of this complex distribution, its distribution and abundance could be conditioned by environmental factors. In this work, we explored the relationship between T. dimidiata distribution and environmental factors in the northwest, northeast, and central zones of Colombia and developed predictive models of infestation in the country. The associations between the presence of T. dimidiata and environmental variables were studied using logistic regression models and ecological niche modeling for a sample of villages in Colombia. The analysis was based on the information collected in field about the presence of T. dimidiata and the environmental data for each village extracted from remote sensing images. The presence of Triatoma dimidiata (Latreille, 1811) was found to be significantly associated with the maximum vegetation index, minimum land surface temperature (LST), and the digital elevation for the statistical model. Temperature seasonality, annual precipitation, and vegetation index were the variables that most influenced the ecological niche model of T. dimidiata distribution. The logistic regression model showed a good fit and predicted suitable habitats in the Andean and Caribbean regions, which agrees with the known distribution of the species, but predicted suitable habitats in the Pacific and Orinoco regions proposing new areas of research. Improved models to predict suitable habitats for T. dimidiata hold promise for spatial targeting of integrated vector management. PMID:26856910
Sando, Roy; Chase, Katherine J.
2017-03-23
A common statistical procedure for estimating streamflow statistics at ungaged locations is to develop a relational model between streamflow and drainage basin characteristics at gaged locations using least squares regression analysis; however, least squares regression methods are parametric and make constraining assumptions about the data distribution. The random forest regression method provides an alternative nonparametric method for estimating streamflow characteristics at ungaged sites and requires that the data meet fewer statistical conditions than least squares regression methods. Random forest regression analysis was used to develop predictive models for 89 streamflow characteristics using Precipitation-Runoff Modeling System simulated streamflow data and drainage basin characteristics at 179 sites in central and eastern Montana. The predictive models were developed from streamflow data simulated for current (baseline, water years 1982–99) conditions and three future periods (water years 2021–38, 2046–63, and 2071–88) under three different climate-change scenarios. These predictive models were then used to predict streamflow characteristics for baseline conditions and three future periods at 1,707 fish sampling sites in central and eastern Montana. The average root mean square error for all predictive models was about 50 percent. When streamflow predictions at 23 fish sampling sites were compared to nearby locations with simulated data, the mean relative percent difference was about 43 percent. When predictions were compared to streamflow data recorded at 21 U.S. Geological Survey streamflow-gaging stations outside of the calibration basins, the average mean absolute percent error was about 73 percent.
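The nonparametric workflow described above can be sketched with off-the-shelf tools. The snippet below is a minimal illustration, not the USGS implementation: it trains a random forest on synthetic basin characteristics for 179 "gaged" (here, simulated) sites and predicts a streamflow characteristic at 1,707 hypothetical ungaged sites; all predictor choices and numbers are placeholders.

```python
# Minimal sketch (not the USGS implementation): predict a streamflow
# characteristic at ungaged sites from drainage-basin characteristics
# with a nonparametric random forest, as an alternative to least squares.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Hypothetical training table: basin characteristics at 179 simulated sites
# (e.g. drainage area, mean elevation, precipitation, forest cover).
X = rng.random((179, 4))
y = 10.0 * X[:, 0] + 5.0 * X[:, 2] + rng.normal(0, 1, 179)   # e.g. a flow statistic

model = RandomForestRegressor(n_estimators=500, min_samples_leaf=3, random_state=0)
scores = cross_val_score(model, X, y, scoring="neg_root_mean_squared_error", cv=5)
print("cross-validated RMSE:", -scores.mean())

model.fit(X, y)
X_ungaged = rng.random((1707, 4))   # basin characteristics at the prediction sites
y_pred = model.predict(X_ungaged)
```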
Kalan, Katja; Ivovic, Vladimir; Glasnovic, Peter; Buzan, Elena
2017-11-07
In Slovenia, two invasive mosquito species are present, Aedes albopictus (Skuse, 1895) (Diptera: Culicidae) and Aedes japonicus (Theobald, 1901) (Diptera: Culicidae). In this study, we examined their actual distribution and suitable habitats for new colonizations. Data from surveys of species presence in 2013 and 2015, bioclimatic variables, and altitude were used for the construction of predictive maps. We produced various models in Maxent software and tested two bioclimatic variable sets, WorldClim and CHELSA. For the variable selection of A. albopictus modeling we used a statistical and expert knowledge-based approach, whereas for A. j. japonicus we used only a statistically based approach. The best performing models for both species were chosen according to AIC score-based evaluation. In 2 yr of sampling, A. albopictus was largely confined to the western half of Slovenia, whereas A. j. japonicus spread significantly and can be considered an established species in a large part of the country. Comparison of models with WorldClim and CHELSA variables for both species showed that models with CHELSA variables were a better tool for prediction. Finally, we validated the models' performance in predicting the distribution of the species against collected field data. Our study confirms that both species are co-occurring and are sympatric in a large part of the country. The tested models could be used for the future prevention of invasive mosquito spread in other countries with similar bioclimatic conditions. © The Authors 2017. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Xiaoying; Liu, Chongxuan; Hu, Bill X.
This study statistically analyzed a grain-size based additivity model that has been proposed to scale reaction rates and parameters from laboratory to field. The additivity model assumed that reaction properties in a sediment including surface area, reactive site concentration, reaction rate, and extent can be predicted from field-scale grain size distribution by linearly adding reaction properties for individual grain size fractions. This study focused on the statistical analysis of the additivity model with respect to reaction rate constants using multi-rate uranyl (U(VI)) surface complexation reactions in a contaminated sediment as an example. Experimental data of rate-limited U(VI) desorption in a stirred flow-cell reactor were used to estimate the statistical properties of multi-rate parameters for individual grain size fractions. The statistical properties of the rate constants for the individual grain size fractions were then used to analyze the statistical properties of the additivity model to predict rate-limited U(VI) desorption in the composite sediment, and to evaluate the relative importance of individual grain size fractions to the overall U(VI) desorption. The results indicated that the additivity model provided a good prediction of the U(VI) desorption in the composite sediment. However, the rate constants were not directly scalable using the additivity model, and U(VI) desorption in individual grain size fractions has to be simulated in order to apply the additivity model. An approximate additivity model for directly scaling rate constants was subsequently proposed and evaluated. The results showed that the approximate model provided a good prediction of the experimental results within statistical uncertainty. This study also found that a gravel size fraction (2-8 mm), which is often ignored in modeling U(VI) sorption and desorption, is statistically significant to the U(VI) desorption in the sediment.
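As a simple illustration of the additivity idea (not the authors' code), the sketch below predicts a composite-sediment rate constant as the mass-fraction-weighted sum of hypothetical fraction-scale rate constants and reports each fraction's relative contribution; all numbers are invented.

```python
# Illustrative sketch of the grain-size additivity idea: a composite-sediment
# reaction property is estimated as the mass-fraction-weighted sum of the
# property measured on individual grain-size fractions.
import numpy as np

# Hypothetical mass fractions of grain-size classes (must sum to 1), e.g.
# three <2 mm classes plus the 2-8 mm gravel fraction.
mass_fraction = np.array([0.35, 0.30, 0.15, 0.20])
# Hypothetical U(VI) desorption rate constants (1/h) fitted per fraction.
rate_constant = np.array([0.12, 0.08, 0.05, 0.01])

# Additivity model: linear mixing of fraction-scale properties.
composite_rate = np.sum(mass_fraction * rate_constant)
print(f"predicted composite rate constant: {composite_rate:.3f} 1/h")

# Relative contribution of each fraction to the composite response,
# e.g. to judge whether the gravel fraction is negligible.
contribution = mass_fraction * rate_constant / composite_rate
print("fractional contributions:", np.round(contribution, 3))
```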
Type I and II β-turns prediction using NMR chemical shifts.
Wang, Ching-Cheng; Lai, Wen-Chung; Chuang, Woei-Jer
2014-07-01
A method for predicting type I and II β-turns using nuclear magnetic resonance (NMR) chemical shifts is proposed. Isolated β-turn chemical-shift data were collected from 1,798 protein chains. One-dimensional statistical analyses on chemical-shift data of three classes of β-turns (types I, II, and VIII) showed different distributions at four positions, (i) to (i + 3). Considering the central two residues of type I β-turns, the mean values of Cο, Cα, H(N), and N(H) chemical shifts were generally (i + 1) > (i + 2). The mean values of Cβ and Hα chemical shifts were (i + 1) < (i + 2). The distributions of the central two residues in type II and VIII β-turns were also distinguishable by trends of chemical shift values. Two-dimensional cluster analyses on chemical-shift data showed positional distributions more clearly. Based on these chemical-shift propensities classified as a function of position, rules were derived using scoring matrices for four consecutive residues to predict type I and II β-turns. The proposed method achieves an overall prediction accuracy of 83.2 and 84.2% with Matthews correlation coefficient values of 0.317 and 0.632 for type I and II β-turns, respectively, indicating higher accuracy for type II turn prediction. The results show that it is feasible to use NMR chemical shifts to predict the β-turn types in proteins. The proposed method can be incorporated into other chemical-shift based protein secondary structure prediction methods.
Radioactivity Registered With a Small Number of Events
NASA Astrophysics Data System (ADS)
Zlokazov, Victor; Utyonkov, Vladimir
2018-02-01
The synthesis of superheavy elements requires the analysis of low-statistics experimental data, presumably obeying an unknown exponential distribution, and a decision on whether they originate from one source or contain admixtures. Here we analyze predictions following from non-parametric methods, employing only such fundamental sample properties as the sample mean, the median and the mode.
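One assumption-light check in this spirit (a sketch, not the authors' procedure) uses the fact that a single exponential distribution has median = mean·ln 2, so a strong disagreement between the sample median and mean·ln 2 in a handful of measured lifetimes hints at an admixture; the lifetimes below are invented.

```python
# Minimal sketch: for a single exponential decay the population median equals
# mean*ln(2); with only a few observed lifetimes, comparing the sample median
# with mean*ln(2) gives a crude indication of admixture from a second source.
import numpy as np

lifetimes = np.array([0.8, 1.1, 1.9, 2.4, 14.0])   # hypothetical decay times (s)

mean = lifetimes.mean()
median = np.median(lifetimes)
expected_median = mean * np.log(2.0)

print(f"sample mean   : {mean:.2f} s")
print(f"sample median : {median:.2f} s (expected {expected_median:.2f} s for one exponential)")
if median < 0.5 * expected_median or median > 2.0 * expected_median:
    print("large mean/median discrepancy -> possible admixture")
```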
FAST COGNITIVE AND TASK ORIENTED, ITERATIVE DATA DISPLAY (FACTOID)
2017-06-01
approaches. As a result, the following assumptions guided our efforts in developing modeling and descriptive metrics for evaluation purposes... Application Evaluation. Our analytic workflow for evaluation is to first provide descriptive statistics about applications across metrics (performance... distributions for evaluation purposes because the goal of evaluation is accurate description, not inference (e.g., prediction). Outliers depicted
Rain Rate Statistics in Southern New Mexico
NASA Technical Reports Server (NTRS)
Paulic, Frank J., Jr.; Horan, Stephen
1997-01-01
The methodology used in determining empirical rain-rate distributions for Southern New Mexico in the vicinity of the White Sands APT site is discussed. The hardware and the software developed to extract rain rate from the rain accumulation data collected at the White Sands APT site are described. The accuracy of Crane's Global Model for rain rate predictions is analyzed.
Single photon counting linear mode avalanche photodiode technologies
NASA Astrophysics Data System (ADS)
Williams, George M.; Huntington, Andrew S.
2011-10-01
The false count rate of a single-photon-sensitive photoreceiver consisting of a high-gain, low-excess-noise linear-mode InGaAs avalanche photodiode (APD) and a high-bandwidth transimpedance amplifier (TIA) is fit to a statistical model. The peak height distribution of the APD's multiplied dark current is approximated by the weighted sum of McIntyre distributions, each characterizing dark current generated at a different location within the APD's junction. The peak height distribution approximated in this way is convolved with a Gaussian distribution representing the input-referred noise of the TIA to generate the statistical distribution of the uncorrelated sum. The cumulative distribution function (CDF) representing count probability as a function of detection threshold is computed, and the CDF model fit to empirical false count data. It is found that only k=0 McIntyre distributions fit the empirically measured CDF at high detection threshold, and that false count rate drops faster than photon count rate as detection threshold is raised. Once fit to empirical false count data, the model predicts the improvement of the false count rate to be expected from reductions in TIA noise and APD dark current. Improvement by at least three orders of magnitude is thought feasible with further manufacturing development and a capacitive-feedback TIA (CTIA).
May, Eric F; Lim, Vincent W; Metaxas, Peter J; Du, Jianwei; Stanwix, Paul L; Rowland, Darren; Johns, Michael L; Haandrikman, Gert; Crosby, Daniel; Aman, Zachary M
2018-03-13
Gas hydrate formation is a stochastic phenomenon of considerable significance for any risk-based approach to flow assurance in the oil and gas industry. In principle, well-established results from nucleation theory offer the prospect of predictive models for hydrate formation probability in industrial production systems. In practice, however, heuristics are relied on when estimating formation risk for a given flowline subcooling or when quantifying kinetic hydrate inhibitor (KHI) performance. Here, we present statistically significant measurements of formation probability distributions for natural gas hydrate systems under shear, which are quantitatively compared with theoretical predictions. Distributions with over 100 points were generated using low-mass, Peltier-cooled pressure cells, cycled in temperature between 40 and -5 °C at up to 2 K·min⁻¹ and analyzed with robust algorithms that automatically identify hydrate formation and initial growth rates from dynamic pressure data. The application of shear had a significant influence on the measured distributions: at 700 rpm mass-transfer limitations were minimal, as demonstrated by the kinetic growth rates observed. The formation probability distributions measured at this shear rate had mean subcoolings consistent with theoretical predictions and steel-hydrate-water contact angles of 14-26°. However, the experimental distributions were substantially wider than predicted, suggesting that phenomena acting on macroscopic length scales are responsible for much of the observed stochastic formation. Performance tests of a KHI provided new insights into how such chemicals can reduce the risk of hydrate blockage in flowlines. Our data demonstrate that the KHI not only reduces the probability of formation (by both shifting and sharpening the distribution) but also reduces hydrate growth rates by a factor of 2.
NASA Astrophysics Data System (ADS)
Olurotimi, E. O.; Sokoya, O.; Ojo, J. S.; Owolawi, P. A.
2018-03-01
Rain height is one of the significant parameters for the prediction of rain attenuation on Earth-space telecommunication links, especially those operating at frequencies above 10 GHz. This study examines the three-parameter Dagum distribution of the rain height over Durban, South Africa. Five years of data were used to study the monthly, seasonal, and annual variations using parameters estimated by maximum likelihood. The performance of the distribution was assessed using statistical goodness-of-fit measures. The three-parameter Dagum distribution proves appropriate for modeling rain height over Durban, with a root mean square error of 0.26. The shape and scale parameters of the distribution show a wide variation. The time-exceedance probability at 0.01% indicates a high probability of rain attenuation at higher frequencies.
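A maximum-likelihood fit of the three-parameter Dagum distribution can be sketched as follows (illustrative only; the data are synthetic, and the parameterization F(x) = (1 + (x/b)^-a)^-p with shape a, scale b and shape p is assumed):

```python
# Sketch of fitting a three-parameter Dagum distribution to rain-height data
# by maximum likelihood, with F(x) = (1 + (x/b)**-a)**-p.
import numpy as np
from scipy.optimize import minimize

def dagum_logpdf(x, a, b, p):
    # pdf: (a*p/x) * (x/b)**(a*p) / (1 + (x/b)**a)**(p+1)
    z = x / b
    return (np.log(a * p) - np.log(x) + a * p * np.log(z)
            - (p + 1.0) * np.log1p(z**a))

def neg_loglik(theta, x):
    a, b, p = np.exp(theta)            # work in log-space to keep parameters > 0
    return -np.sum(dagum_logpdf(x, a, b, p))

# Hypothetical rain heights (km); real input would be the radiosonde/TRMM data.
rain_height_km = np.random.default_rng(1).gamma(20.0, 0.22, size=365)
res = minimize(neg_loglik, x0=np.log([2.0, 4.0, 1.0]), args=(rain_height_km,),
               method="Nelder-Mead")
a_hat, b_hat, p_hat = np.exp(res.x)
print("MLE shape a, scale b, shape p:", a_hat, b_hat, p_hat)
```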
NASA Astrophysics Data System (ADS)
Lin, Shu; Wang, Rui; Xia, Ning; Li, Yongdong; Liu, Chunliang
2018-01-01
Statistical multipactor theories are critical prediction approaches for multipactor breakdown determination. However, these approaches still require a negotiation between the calculation efficiency and accuracy. This paper presents an improved stationary statistical theory for efficient threshold analysis of two-surface multipactor. A general integral equation over the distribution function of the electron emission phase with both the single-sided and double-sided impacts considered is formulated. The modeling results indicate that the improved stationary statistical theory can not only obtain equally good accuracy of multipactor threshold calculation as the nonstationary statistical theory, but also achieve high calculation efficiency concurrently. By using this improved stationary statistical theory, the total time consumption in calculating full multipactor susceptibility zones of parallel plates can be decreased by as much as a factor of four relative to the nonstationary statistical theory. It also shows that the effect of single-sided impacts is indispensable for accurate multipactor prediction of coaxial lines and also more significant for the high order multipactor. Finally, the influence of secondary emission yield (SEY) properties on the multipactor threshold is further investigated. It is observed that the first cross energy and the energy range between the first cross and the SEY maximum both play a significant role in determining the multipactor threshold, which agrees with the numerical simulation results in the literature.
Bayesian inference for the spatio-temporal invasion of alien species.
Cook, Alex; Marion, Glenn; Butler, Adam; Gibson, Gavin
2007-08-01
In this paper we develop a Bayesian approach to parameter estimation in a stochastic spatio-temporal model of the spread of invasive species across a landscape. To date, statistical techniques, such as logistic and autologistic regression, have outstripped stochastic spatio-temporal models in their ability to handle large numbers of covariates. Here we seek to address this problem by making use of a range of covariates describing the bio-geographical features of the landscape. Relative to regression techniques, stochastic spatio-temporal models are more transparent in their representation of biological processes. They also explicitly model temporal change, and therefore do not require the assumption that the species' distribution (or other spatial pattern) has already reached equilibrium as is often the case with standard statistical approaches. In order to illustrate the use of such techniques we apply them to the analysis of data detailing the spread of an invasive plant, Heracleum mantegazzianum, across Britain in the 20th Century using geo-referenced covariate information describing local temperature, elevation and habitat type. The use of Markov chain Monte Carlo sampling within a Bayesian framework facilitates statistical assessments of differences in the suitability of different habitat classes for H. mantegazzianum, and enables predictions of future spread to account for parametric uncertainty and system variability. Our results show that ignoring such covariate information may lead to biased estimates of key processes and implausible predictions of future distributions.
Poisson Statistics of Combinatorial Library Sampling Predict False Discovery Rates of Screening
2017-01-01
Microfluidic droplet-based screening of DNA-encoded one-bead-one-compound combinatorial libraries is a miniaturized, potentially widely distributable approach to small molecule discovery. In these screens, a microfluidic circuit distributes library beads into droplets of activity assay reagent, photochemically cleaves the compound from the bead, then incubates and sorts the droplets based on assay result for subsequent DNA sequencing-based hit compound structure elucidation. Pilot experimental studies revealed that Poisson statistics describe nearly all aspects of such screens, prompting the development of simulations to understand system behavior. Monte Carlo screening simulation data showed that increasing mean library sampling (ε), mean droplet occupancy, or library hit rate all increase the false discovery rate (FDR). Compounds identified as hits on k > 1 beads (the replicate k class) were much more likely to be authentic hits than singletons (k = 1), in agreement with previous findings. Here, we explain this observation by deriving an equation for authenticity, which reduces to the product of a library sampling bias term (exponential in k) and a sampling saturation term (exponential in ε) setting a threshold that the k-dependent bias must overcome. The equation thus quantitatively describes why each hit structure’s FDR is based on its k class, and further predicts the feasibility of intentionally populating droplets with multiple library beads, assaying the micromixtures for function, and identifying the active members by statistical deconvolution. PMID:28682059
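A minimal Monte Carlo sketch of this picture (not the published simulation code; all rates are hypothetical) Poisson-loads beads into droplets, calls every compound in a positive droplet a hit, and tabulates the false discovery rate by replicate class k:

```python
# Monte Carlo sketch of a droplet screen: beads are Poisson-loaded into
# droplets, a droplet sorts positive if it holds at least one truly active
# bead, and every compound in a positive droplet is called a hit, so
# co-encapsulation inflates the false discovery rate (FDR).
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(7)
n_compounds, hit_rate = 10_000, 0.01
epsilon = 3.0                       # mean library sampling (beads per compound)
mean_occupancy = 0.3                # mean beads per droplet

active = rng.random(n_compounds) < hit_rate
beads = rng.integers(0, n_compounds, size=int(epsilon * n_compounds))  # compound id per bead
droplet_of_bead = rng.integers(0, int(len(beads) / mean_occupancy), size=len(beads))

# Droplets containing at least one truly active bead sort as positives.
positive_droplets = set(droplet_of_bead[active[beads]])

# Count, per compound, how many of its beads ended up in positive droplets (its k class).
k_count = defaultdict(int)
for compound, droplet in zip(beads, droplet_of_bead):
    if droplet in positive_droplets:
        k_count[compound] += 1

for k in (1, 2, 3):
    called = [c for c, n in k_count.items() if n == k]
    if called:
        fdr = np.mean([not active[c] for c in called])
        print(f"k = {k}: {len(called)} compounds called, FDR ≈ {fdr:.2f}")
```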
NASA Astrophysics Data System (ADS)
Neri, Izaak; Roldán, Édgar; Jülicher, Frank
2017-01-01
We study the statistics of infima, stopping times, and passage probabilities of entropy production in nonequilibrium steady states, and we show that they are universal. We consider two examples of stopping times: first-passage times of entropy production and waiting times of stochastic processes, which are the times when a system reaches a given state for the first time. Our main results are as follows: (i) The distribution of the global infimum of entropy production is exponential with mean equal to minus Boltzmann's constant; (ii) we find exact expressions for the passage probabilities of entropy production; (iii) we derive a fluctuation theorem for stopping-time distributions of entropy production. These results have interesting implications for stochastic processes that can be discussed in simple colloidal systems and in active molecular processes. In particular, we show that the timing and statistics of discrete chemical transitions of molecular processes, such as the steps of molecular motors, are governed by the statistics of entropy production. We also show that the extreme-value statistics of active molecular processes are governed by entropy production; for example, we derive a relation between the maximal excursion of a molecular motor against the direction of an external force and the infimum of the corresponding entropy-production fluctuations. Using this relation, we make predictions for the distribution of the maximum backtrack depth of RNA polymerases, which follow from our universal results for entropy-production infima.
The landscape of W± and Z bosons produced in pp collisions up to LHC energies
NASA Astrophysics Data System (ADS)
Basso, Eduardo; Bourrely, Claude; Pasechnik, Roman; Soffer, Jacques
2017-10-01
We consider a selection of recent experimental results on electroweak W±, Z gauge boson production in pp collisions at BNL RHIC and CERN LHC energies in comparison to predictions of perturbative QCD calculations based on different sets of NLO parton distribution functions including the statistical PDF model known from fits to the DIS data. We show that the current statistical PDF parametrization (fitted to the DIS data only) underestimates the LHC data on W±, Z gauge boson production cross sections at the NLO by about 20%. This suggests that there is a need to refit the parameters of the statistical PDF including the latest LHC data.
NASA Astrophysics Data System (ADS)
Scheuerer, Michael; Hamill, Thomas M.; Whitin, Brett; He, Minxue; Henkel, Arthur
2017-04-01
Hydrological forecasts strongly rely on predictions of precipitation amounts and temperature as meteorological inputs to hydrological models. Ensemble weather predictions provide a number of different scenarios that reflect the uncertainty about these meteorological inputs, but are often biased and underdispersive, and therefore require statistical postprocessing. In hydrological applications it is crucial that spatial and temporal (i.e. between different forecast lead times) dependencies as well as dependence between the two weather variables are adequately represented by the recalibrated forecasts. We present a study with temperature and precipitation forecasts over four river basins in California that are postprocessed with a variant of the nonhomogeneous Gaussian regression method (Gneiting et al., 2005) and the censored, shifted gamma distribution approach (Scheuerer and Hamill, 2015), respectively. For modelling spatial, temporal and inter-variable dependence we propose a variant of the Schaake Shuffle (Clark et al., 2004) that uses spatio-temporal trajectories of observed temperature and precipitation as a dependence template, and chooses the historic dates in such a way that the divergence between the marginal distributions of these trajectories and the univariate forecast distributions is minimized. For the four river basins considered in our study, this new multivariate modelling technique consistently improves upon the Schaake Shuffle and yields reliable spatio-temporal forecast trajectories of temperature and precipitation that can be used to force hydrological forecast systems. References: Clark, M., Gangopadhyay, S., Hay, L., Rajagopalan, B., Wilby, R., 2004. The Schaake Shuffle: A method for reconstructing space-time variability in forecasted precipitation and temperature fields. Journal of Hydrometeorology, 5, pp.243-262. Gneiting, T., Raftery, A.E., Westveld, A.H., Goldman, T., 2005. Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS. Monthly Weather Review, 133, pp.1098-1118. Scheuerer, M., Hamill, T.M., 2015. Statistical postprocessing of ensemble precipitation forecasts by fitting censored, shifted gamma distributions. Monthly Weather Review, 143, pp.4578-4596. Scheuerer, M., Hamill, T.M., Whitin, B., He, M., and Henkel, A., 2016: A method for preferential selection of dates in the Schaake shuffle approach to constructing spatio-temporal forecast fields of temperature and precipitation. Water Resources Research, submitted.
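For reference, the basic Schaake Shuffle step can be sketched compactly (illustrative only; the preferential selection of historical dates proposed in the abstract is omitted): postprocessed ensemble members at each site/lead-time margin are reordered so that their ranks reproduce the ranks of historical observed trajectories.

```python
# Compact sketch of the basic Schaake Shuffle (Clark et al., 2004): calibrated
# ensemble members at each site/lead-time slot are reordered so their rank
# structure matches that of a set of historical observed trajectories,
# restoring space-time (and inter-variable) dependence.
import numpy as np

def schaake_shuffle(forecast, historical):
    """forecast, historical: arrays of shape (n_members, n_site_time_slots)."""
    shuffled = np.empty_like(forecast)
    for j in range(forecast.shape[1]):
        ranks = np.argsort(np.argsort(historical[:, j]))   # ranks of historical values
        shuffled[:, j] = np.sort(forecast[:, j])[ranks]     # give sorted forecasts those ranks
    return shuffled

rng = np.random.default_rng(3)
fcst = rng.gamma(2.0, 5.0, size=(20, 6))    # hypothetical calibrated members, 6 site/lead slots
hist = rng.gamma(2.0, 5.0, size=(20, 6))    # observed trajectories from 20 historical dates
print(schaake_shuffle(fcst, hist).shape)
```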
Extreme statistics and index distribution in the classical 1d Coulomb gas
NASA Astrophysics Data System (ADS)
Dhar, Abhishek; Kundu, Anupam; Majumdar, Satya N.; Sabhapandit, Sanjib; Schehr, Grégory
2018-07-01
We consider a 1D gas of N charged particles confined by an external harmonic potential and interacting via the 1D Coulomb potential. For this system we show that in equilibrium the charges settle, on average, uniformly and symmetrically on a finite region centred around the origin. We study the statistics of the position of the rightmost particle and show that the limiting distribution describing its typical fluctuations is different from the Tracy–Widom distribution found in the 1D log-gas. We also compute the large deviation functions which characterise the atypical fluctuations of the rightmost particle's position far away from its mean value. In addition, we study the gap between the two rightmost particles as well as the index N_+, i.e. the number of particles on the positive semi-axis. We compute the limiting distributions associated to the typical fluctuations of these observables as well as the corresponding large deviation functions. We provide numerical support for our analytical predictions. Part of these results were announced in a recent letter, Dhar et al (2017 Phys. Rev. Lett. 119 060601).
Stepaniak, Pieter S; Heij, Christiaan; Mannaerts, Guido H H; de Quelerij, Marcel; de Vries, Guus
2009-10-01
Gains in operating room (OR) scheduling may be obtained by using accurate statistical models to predict surgical and procedure times. The 3 main contributions of this article are the following: (i) the validation of Strum's results on the statistical distribution of case durations, including surgeon effects, using OR databases of 2 European hospitals, (ii) the use of expert prior expectations to predict durations of rarely observed cases, and (iii) the application of the proposed methods to predict case durations, with an analysis of the resulting increase in OR efficiency. We retrospectively reviewed all recorded surgical cases of 2 large European teaching hospitals from 2005 to 2008, involving 85,312 cases and 92,099 h in total. Surgical times tended to be skewed and bounded by some minimally required time. We compared the fit of the normal distribution with that of 2- and 3-parameter lognormal distributions for case durations of a range of Current Procedural Terminology (CPT)-anesthesia combinations, including possible surgeon effects. For cases with very few observations, we investigated whether supplementing the data information with surgeons' prior guesses helps to obtain better duration estimates. Finally, we used best-fitting duration distributions to simulate the potential efficiency gains in OR scheduling. The 3-parameter lognormal distribution provides the best results for the case durations of CPT-anesthesia (surgeon) combinations, with an acceptable fit for almost 90% of the CPTs when segmented by the factor surgeon. The fit is best for surgical times and somewhat less for total procedure times. Surgeons' prior guesses are helpful for OR management to improve duration estimates of CPTs with very few (<10) observations. Compared with the standard way of case scheduling, using the mean of the 3-parameter lognormal distribution for case scheduling reduces the mean overreserved OR time per case by up to 11.9 (11.8-12.0) min (55.6%) and the mean underreserved OR time per case by up to 16.7 (16.5-16.8) min (53.1%). When scheduling cases using the 4-parameter lognormal model, the mean overutilized OR time is up to 20.0 (19.7-20.3) min per OR per day lower than for the standard method and 11.6 (11.3-12.0) min per OR per day lower as compared with the bias-corrected mean. OR case scheduling can be improved by using the 3-parameter lognormal model with surgeon effects and by using surgeons' prior guesses for rarely observed CPTs. Using the 3-parameter lognormal model for case-duration prediction and scheduling significantly reduces both the prediction error and OR inefficiency.
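In scipy, a shifted lognormal with a free location parameter plays the role of the 3-parameter lognormal; the sketch below (synthetic durations, not hospital data) fits it and derives a per-case reservation from the fitted mean.

```python
# Sketch: fit a shifted (3-parameter) lognormal to surgical case durations with
# scipy; the nonzero `loc` plays the role of the minimally required time.
# The data below are synthetic placeholders, not hospital records.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
minimum_time = 25.0                                    # minutes
durations = minimum_time + rng.lognormal(mean=3.5, sigma=0.5, size=400)

shape, loc, scale = stats.lognorm.fit(durations)       # loc ~ lower bound of durations
print(f"sigma={shape:.2f}, shift={loc:.1f} min, median above shift={scale:.1f} min")

# A simple scheduling estimate: reserve the fitted mean duration per case.
reserved = stats.lognorm.mean(shape, loc=loc, scale=scale)
print(f"reserve per case ≈ {reserved:.0f} min")
```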
Valero Juan, L F; Sáenz González, M C
1997-11-01
The maternal mortality evolution in Spain during the 1980-1992 period is reported. The influence of birth distribution according to maternal age is analyzed. The information was gathered from vital statistics published by Instituto Nacional de Estadística. The mortality rates have stabilized since 1985 (4.8 per 10⁵ for 1992) associated with the increase in the proportion of births in women aged ≥30 years (40.6% for 1992). Birth distributions according to maternal age account for 13.1% of the deaths observed. The predictions point to an increase in maternal mortality for the year 2000.
Lee, L.; Helsel, D.
2007-01-01
Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data, perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and computation of related confidence limits. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
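A common way to apply Kaplan-Meier machinery to left-censored concentrations (a sketch of the standard "flipping" trick, not the authors' S-language routines) is to subtract the data from a constant larger than the maximum so nondetects become right-censored, fit K-M, and map the survival curve back to an empirical CDF; the values below are invented.

```python
# Sketch of the "flipping" trick for left-censored concentrations: subtract
# each value from a constant larger than the data so nondetects become
# right-censored, run Kaplan-Meier, then flip back to obtain the ECDF.
import numpy as np
from lifelines import KaplanMeierFitter

conc = np.array([0.5, 0.5, 1.0, 1.2, 2.3, 3.1, 5.0, 1.0])   # measured values or detection limits
detected = np.array([False, False, True, True, True, True, True, False])  # False = "<DL"

M = conc.max() + 1.0
flipped = M - conc                     # nondetects are now right-censored at M - DL

kmf = KaplanMeierFitter()
kmf.fit(flipped, event_observed=detected)

# Survival function of the flipped data approximates the cumulative
# distribution of the original data.
ecdf = kmf.survival_function_.iloc[::-1]
ecdf.index = M - ecdf.index            # map back to the concentration scale
print(ecdf.head())
```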
NASA Astrophysics Data System (ADS)
Crosby, N.; Georgoulis, M.; Vilmer, N.
1999-10-01
Solar burst observations in the deka-keV energy range originating from the WATCH experiment aboard the GRANAT spacecraft were used to perform frequency distributions built on measured X-ray flare parameters (Crosby et al., 1998). The results of the study show that: 1- the overall distribution functions are robust power laws extending over a number of decades. The typical parameters of events (total counts, peak count rates, duration) are all correlated to each other. 2- the overall distribution functions are the convolution of significantly different distribution functions built on parts of the whole data set filtered by the event duration. These "partial" frequency distributions are still power law distributions over several decades, with a slope systematically decreasing with increasing duration. 3- No correlation is found between the elapsed time interval between successive bursts arising from the same active region and the peak intensity of the flare. In this paper, we attempt a tentative comparison between the statistical properties of the self-organized critical (SOC) cellular automaton statistical flare models (see e.g. Lu and Hamilton (1991), Georgoulis and Vlahos (1996, 1998)) and the respective properties of the WATCH flare data. Despite the inherent weaknesses of the SOC models to simulate a number of physical processes in the active region, it is found that most of the observed statistical properties can be reproduced using the SOC models, including the various frequency distributions and scatter plots. We finally conclude that, even if SOC models must be refined to improve the physical links to MHD approaches, they nevertheless represent a good approach to describe the properties of rapid energy dissipation and magnetic field annihilation in complex and magnetized plasmas. Crosby N., Vilmer N., Lund N. and Sunyaev R., A&A; 334; 299-313; 1998 Crosby N., Lund N., Vilmer N. and Sunyaev R.; A&A Supplement Series; 130, 233, 1998 Georgoulis M. and Vlahos L., 1996, Astrophy. J. Letters, 469, L135 Georgoulis M. and Vlahos L., 1998, in preparation Lu E.T. and Hamilton R.J., 1991, Astroph. J., 380, L89
Modelling the hydraulic conductivity of porous media using physical-statistical model
NASA Astrophysics Data System (ADS)
Usowicz, B.; Usowicz, L. B.; Lipiec, J.
2009-04-01
Soils and other porous media can be represented by a pattern (net) of more or less cylindrically interconnected channels. The capillary radius r can represent an elementary capillary formed between soil particles in one case, and in another case it can represent a mean hydrodynamic radius. When we view a porous medium as a net of interconnected capillaries, we can apply a statistical approach for the description of the liquid or gas flow. A soil phase is included in the porous medium, and its configuration is decisive for the pore distribution in this medium and hence conditions the course of its water retention curve. In this work, a method of estimating the hydraulic conductivity of porous media based on a physical-statistical model proposed by B. Usowicz is presented. The physical-statistical model considers the pore space as the capillary net. The net of capillary connections is represented by parallel and serial connections of hydraulic resistors in the layer and between the layers, respectively. The polynomial distribution was used in this model to determine the probability of the occurrence of a given capillary configuration. The model was calibrated using the measured water retention curve and two values of hydraulic conductivity (saturated and unsaturated), and the model parameters were determined. The model was used for predicting hydraulic conductivity as a function of soil water content K(θ). The model was validated by comparing the measured and predicted K data for various soils and other porous media (e.g. sandstone). Agreement between measured and predicted data was good, as indicated by R2 values (>0.9). It was also confirmed that the random variables used for the calculations and the model parameters were chosen correctly. The study was funded in part by the Polish Ministry of Science and Higher Education by Grant No. N305 046 31/1707.
NASA Astrophysics Data System (ADS)
Eltom, Hassan A.; Abdullatif, Osman M.; Makkawi, Mohammed H.; Eltoum, Isam-Eldin A.
2017-03-01
The interpretation of depositional environments provides important information to understand facies distribution and geometry. The classical approach to interpret depositional environments principally relies on the analysis of lithofacies, biofacies and stratigraphic data, among others. An alternative method, based on geochemical data (chemical element data), is advantageous because it can simply, reproducibly and efficiently interpret and refine the interpretation of the depositional environment of carbonate strata. Here we geochemically analyze and statistically model carbonate samples (n = 156) from seven sections of the Arab-D reservoir outcrop analog of central Saudi Arabia, to determine whether the elemental signatures (major, trace and rare earth elements [REEs]) can be effectively used to predict depositional environments. We find that lithofacies associations of the studied outcrop (peritidal to open marine depositional environments) possess altered REE signatures, and that this trend increases stratigraphically from bottom-to-top, which corresponds to an upward shallowing of depositional environments. The relationship between REEs and major, minor and trace elements indicates that contamination by detrital materials is the principal source of REEs, whereas redox condition, marine and diagenetic processes have minimal impact on the relative distribution of REEs in the lithofacies. In a statistical model (factor analysis and logistic regression), REEs, major and trace elements cluster together and serve as markers to differentiate between peritidal and open marine facies and to differentiate between intertidal and subtidal lithofacies within the peritidal facies. The results indicate that statistical modelling of the elemental composition of carbonate strata can be used as a quantitative method to predict depositional environments and regional paleogeography. The significance of this study lies in offering new assessments of the relationships between lithofacies and geochemical elements by using advanced statistical analysis, a method that could be used elsewhere to interpret depositional environment and refine facies models.
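A compact sketch of the factor-analysis-plus-logistic-regression workflow (synthetic data; the element columns and class labels are hypothetical stand-ins for the measured concentrations and lithofacies):

```python
# Illustrative sketch: reduce elemental concentrations to a few factors and use
# them in a logistic regression that separates peritidal from open-marine
# samples, mirroring the factor-analysis + logistic-regression workflow above.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 156
X = rng.normal(size=(n, 10))            # columns e.g. Al, Fe, Ti, Th, REE sums ... (hypothetical)
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(0, 1, n) > 0).astype(int)  # 1 = peritidal, 0 = open marine

model = make_pipeline(StandardScaler(),
                      FactorAnalysis(n_components=3, random_state=0),
                      LogisticRegression())
print("cross-validated accuracy:", cross_val_score(model, X, y, cv=5).mean())
```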
Scale relativity and hierarchical structuring of planetary systems
NASA Astrophysics Data System (ADS)
Galopeau, P. H. M.; Nottale, L.; da Rocha, D.; Tran Minh, N.
2003-04-01
The theory of scale relativity, applied to macroscopic gravitational systems like planetary systems, allows one to predict quantization laws of several key parameters characterizing those systems (distance between planets and central star, obliquity, eccentricity...) which are organized in a hierarchical way. In the framework of the scale relativity approach, one demonstrates that the motion (at relatively large time-scales) of the bodies in planetary systems, described in terms of fractal geodesic trajectories, is governed by a Schrödinger-like equation. Preferential orbits are predicted in terms of probability density peaks with semi-major axis given by: a_n = GMn^2/w^2 (M is the mass of the central star and w is a velocity close to 144 km s⁻¹ in the case of our inner solar system and of the presently observed exoplanets). The velocity of the planet orbiting at this distance satisfies the relation v_n = w/n. Moreover, the mass distribution of the planets in our solar system can be accounted for in this model. These predictions are in good agreement with the observed values of the actual orbital parameters. Furthermore, the exoplanets which have been recently discovered around nearby stars also follow the same law in terms of the same constant in a highly significant statistical way. The theory of scale relativity also predicts structures for the obliquities and inclinations of the planets and satellites: the probability density of their distribution between 0 and π is expected to display peaks at particular angles θ_k = kπ/n. A statistical agreement is obtained for our solar system with n=7. Another prediction concerns the distribution of the planets' eccentricities e. The theory foresees a quantization law e = k/n where k is an integer and n is the quantum number that characterizes semi-major axes. The presently known exoplanet eccentricities are compatible with this theoretical prediction. Finally, although all these planetary systems may look very different from our solar system, they actually present universal structures comparable to ours, so that a high probability to discover exoplanets having orbital characteristics very similar to the Earth's ones can be expected.
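The quoted law is easy to check numerically. The sketch below evaluates a_n = GMn²/w² for the Sun with w ≈ 144 km/s; the planet associations in the closing comment are approximate and follow the text above.

```python
# Quick numerical check of the quoted quantization law a_n = G*M*n^2 / w^2
# for the Sun with w ≈ 144 km/s (planet labels are approximate illustrations).
G = 6.674e-11          # m^3 kg^-1 s^-2
M_sun = 1.989e30       # kg
w = 144.0e3            # m/s
AU = 1.496e11          # m

for n in range(1, 7):
    a_n = G * M_sun * n**2 / w**2
    print(f"n = {n}: a_n ≈ {a_n / AU:.3f} AU")
# n = 3, 4, 5, 6 fall near Mercury (0.39 AU), Venus (0.72), Earth (1.0), Mars (1.52).
```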
Finding equilibrium in the spatiotemporal chaos of the complex Ginzburg-Landau equation
NASA Astrophysics Data System (ADS)
Ballard, Christopher C.; Esty, C. Clark; Egolf, David A.
2016-11-01
Equilibrium statistical mechanics allows the prediction of collective behaviors of large numbers of interacting objects from just a few system-wide properties; however, a similar theory does not exist for far-from-equilibrium systems exhibiting complex spatial and temporal behavior. We propose a method for predicting behaviors in a broad class of such systems and apply these ideas to an archetypal example, the spatiotemporal chaotic 1D complex Ginzburg-Landau equation in the defect chaos regime. Building on the ideas of Ruelle and of Cross and Hohenberg that a spatiotemporal chaotic system can be considered a collection of weakly interacting dynamical units of a characteristic size, the chaotic length scale, we identify underlying, mesoscale, chaotic units and effective interaction potentials between them. We find that the resulting equilibrium Takahashi model accurately predicts distributions of particle numbers. These results suggest the intriguing possibility that a class of far-from-equilibrium systems may be well described at coarse-grained scales by the well-established theory of equilibrium statistical mechanics.
NASA Astrophysics Data System (ADS)
Chattopadhyay, Goutami; Chattopadhyay, Surajit; Chakraborthy, Parthasarathi
2012-07-01
The present study deals with daily total ozone concentration time series over four metro cities of India namely Kolkata, Mumbai, Chennai, and New Delhi in the multivariate environment. Using the Kaiser-Meyer-Olkin measure, it is established that the data set under consideration are suitable for principal component analysis. Subsequently, by introducing rotated component matrix for the principal components, the predictors suitable for generating artificial neural network (ANN) for daily total ozone prediction are identified. The multicollinearity is removed in this way. Models of ANN in the form of multilayer perceptron trained through backpropagation learning are generated for all of the study zones, and the model outcomes are assessed statistically. Measuring various statistics like Pearson correlation coefficients, Willmott's indices, percentage errors of prediction, and mean absolute errors, it is observed that for Mumbai and Kolkata the proposed ANN model generates very good predictions. The results are supported by the linearly distributed coordinates in the scatterplots.
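The predictor-reduction and neural-network chain can be sketched as below (synthetic data; the varimax rotation of the component matrix used in the study is omitted, and the architecture is an arbitrary small MLP rather than the authors' configuration):

```python
# Sketch of the chain described above: principal components remove
# multicollinearity among the raw predictors, and a multilayer perceptron
# (backpropagation) maps them to daily total ozone. Data are synthetic.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(1460, 8))                      # hypothetical daily meteorological predictors
y = 280 + 15 * X[:, 0] - 10 * X[:, 2] + rng.normal(0, 5, 1460)   # total ozone (DU), synthetic

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = make_pipeline(StandardScaler(),
                      PCA(n_components=4),
                      MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0))
model.fit(X_tr, y_tr)
print("R^2 on held-out days:", round(model.score(X_te, y_te), 3))
```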
Magarey, Roger; Newton, Leslie; Hong, Seung C.; Takeuchi, Yu; Christie, Dave; Jarnevich, Catherine S.; Kohl, Lisa; Damus, Martin; Higgins, Steven I.; Miller, Leah; Castro, Karen; West, Amanda; Hastings, John; Cook, Gericke; Kartesz, John; Koop, Anthony
2018-01-01
This study compares four models for predicting the potential distribution of non-indigenous weed species in the conterminous U.S. The comparison focused on evaluating modeling tools and protocols as currently used for weed risk assessment or for predicting the potential distribution of invasive weeds. We used six weed species (three highly invasive and three less invasive non-indigenous species) that have been established in the U.S. for more than 75 years. The experiment involved providing non-U. S. location data to users familiar with one of the four evaluated techniques, who then developed predictive models that were applied to the United States without knowing the identity of the species or its U.S. distribution. We compared a simple GIS climate matching technique known as Proto3, a simple climate matching tool CLIMEX Match Climates, the correlative model MaxEnt, and a process model known as the Thornley Transport Resistance (TTR) model. Two experienced users ran each modeling tool except TTR, which had one user. Models were trained with global species distribution data excluding any U.S. data, and then were evaluated using the current known U.S. distribution. The influence of weed species identity and modeling tool on prevalence and sensitivity effects was compared using a generalized linear mixed model. Each modeling tool itself had a low statistical significance, while weed species alone accounted for 69.1 and 48.5% of the variance for prevalence and sensitivity, respectively. These results suggest that simple modeling tools might perform as well as complex ones in the case of predicting potential distribution for a weed not yet present in the United States. Considerations of model accuracy should also be balanced with those of reproducibility and ease of use. More important than the choice of modeling tool is the construction of robust protocols and testing both new and experienced users under blind test conditions that approximate operational conditions.
Universal Quake Statistics: From Compressed Nanocrystals to Earthquakes
Uhl, Jonathan T.; Pathak, Shivesh; Schorlemmer, Danijel; ...
2015-11-17
Slowly-compressed single crystals, bulk metallic glasses (BMGs), rocks, granular materials, and the earth all deform via intermittent slips or “quakes”. We find that although these systems span 12 decades in length scale, they all show the same scaling behavior for their slip size distributions and other statistical properties. Remarkably, the size distributions follow the same power law multiplied with the same exponential cutoff. The cutoff grows with applied force for materials spanning length scales from nanometers to kilometers. The tuneability of the cutoff with stress reflects “tuned critical” behavior, rather than self-organized criticality (SOC), which would imply stress-independence. A simple mean field model for avalanches of slipping weak spots explains the agreement across scales. It predicts the observed slip-size distributions and the observed stress-dependent cutoff function. In conclusion, the results enable extrapolations from one scale to another, and from one force to another, across different materials and structures, from nanocrystals to earthquakes.
A study of finite mixture model: Bayesian approach on financial time series data
NASA Astrophysics Data System (ADS)
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-07-01
Recently, statisticians have emphasized fitting finite mixture models using Bayesian methods. A finite mixture model represents a statistical distribution as a mixture of component distributions, while the Bayesian method is the statistical approach used to fit the mixture model. The Bayesian method is widely used because its asymptotic properties provide remarkable results; in addition, it shows a consistency characteristic, meaning that the parameter estimates are close to the predictive distributions. In the present paper, the number of components for the mixture model is selected using the Bayesian Information Criterion. Identifying the number of components is important because a wrong choice may lead to invalid results. The Bayesian method is then used to fit the k-component mixture model in order to explore the relationship between rubber prices and stock market prices for Malaysia, Thailand, the Philippines, and Indonesia. The results show a negative effect between rubber prices and stock market prices for all selected countries.
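As a simple stand-in for the component-selection step (not the Bayesian fit used in the paper), the sketch below fits Gaussian mixtures with k = 1..5 by EM on synthetic two-dimensional "returns" and keeps the k with the lowest Bayesian Information Criterion:

```python
# Sketch of choosing the number of mixture components with BIC. sklearn's
# GaussianMixture (EM) is used here as a simple stand-in; the data are
# synthetic placeholders for paired rubber-price / stock-index returns.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
returns = np.vstack([rng.normal([0.0, 0.0], 0.5, size=(300, 2)),
                     rng.normal([2.0, -1.0], 0.7, size=(200, 2))])

bic = {}
for k in range(1, 6):
    gm = GaussianMixture(n_components=k, random_state=0).fit(returns)
    bic[k] = gm.bic(returns)
best_k = min(bic, key=bic.get)
print("BIC by k:", {k: round(v, 1) for k, v in bic.items()})
print("selected number of components:", best_k)
```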
Northward shifts of the distributions of Spanish reptiles in association with climate change.
Moreno-Rueda, Gregorio; Pleguezuelos, Juan M; Pizarro, Manuel; Montori, Albert
2012-04-01
It is predicted that climate change will drive extinctions of some reptiles and that the number of these extinctions will depend on whether reptiles are able to change their distribution. Whether the latitudinal distribution of reptiles may change in response to increases in temperature is unknown. We used data on reptile distributions collected during the 20th century to analyze whether changes in the distributions of reptiles in Spain are associated with increases in temperature. We controlled for biases in sampling effort and found a mean, statistically significant, northward shift of the northern extent of reptile distributions of about 15.2 km from 1940-1975 to 1991-2005. The southern extent of the distributions did not change significantly. Thus, our results suggest that the latitudinal distributions of reptiles may be changing in response to climate change. ©2011 Society for Conservation Biology.
Real-Time Aircraft Engine-Life Monitoring
NASA Technical Reports Server (NTRS)
Klein, Richard
2014-01-01
This project developed an in-service life-monitoring system capable of predicting the remaining component and system life of aircraft engines. The embedded system provides real-time, in-flight monitoring of the engine's thrust, exhaust gas temperature, efficiency, and the speed and time of operation. Based upon this data, the life-estimation algorithm calculates the remaining life of the engine components and uses this data to predict the remaining life of the engine. The calculations are based on the statistical life distribution of the engine components and their relationship to load, speed, temperature, and time.
NASA Astrophysics Data System (ADS)
Kim, H.-K.; Woo, J.-H.; Park, R. S.; Song, C. H.; Kim, J.-H.; Ban, S.-J.; Park, J.-H.
2013-09-01
Plant functional type (PFT) distributions affect the results of biogenic emission modeling as well as O3 and PM simulations using chemistry-transport models (CTMs). This paper analyzes the variations of both surface biogenic VOC emissions and O3 concentrations due to changes in the PFT distributions in the Seoul Metropolitan Areas, Korea. Also, this paper attempts to provide important implications for biogenic emissions modeling studies for CTM simulations. MM5-MEGAN-SMOKE-CMAQ model simulations were implemented over the Seoul Metropolitan Areas in Korea to predict surface O3 concentrations for the period of 1 May to 31 June 2008. Starting from MEGAN biogenic emissions analysis with three different sources of PFT input data, US EPA CMAQ O3 simulation results were evaluated by surface O3 monitoring datasets and further considered on the basis of geospatial and statistical analyses. The three PFT datasets considered were "(1)KORPFT", developed with a region specific vegetation database; (2) CDP, adopted from US NCAR; and (3) MODIS, reclassified from the NASA Terra and Aqua combined land cover products. Comparisons of MEGAN biogenic emission results with the three different PFT data showed that broadleaf trees (BT) are the most significant contributor, followed by needleleaf trees (NT), shrub (SB), and herbaceous plants (HB) to the total biogenic volatile organic compounds (BVOCs). In addition, isoprene from BT and terpene from NT were recognized as significant primary and secondary BVOC species in terms of BVOC emissions distributions and O3-forming potentials in the study domain. Multiple regression analyses with the different PFT data (δO3 vs. δPFTs) suggest that KORPFT can provide reasonable information to the framework of MEGAN biogenic emissions modeling and CTM O3 predictions. Analyses of the CMAQ performance statistics suggest that deviations of BT areas can significantly affect CMAQ isoprene and O3 predictions. From further evaluations of the isoprene and O3 prediction results, we explored the PFT area-loss artifact that occurs due to geographical disparity between the PFT and leaf area index distributions, and can cause increased bias in CMAQ O3. Thus, the PFT-loss artifact must be a source of limitation in the MEGAN biogenic emission modeling and the CTM O3 simulation results. Time changes of CMAQ O3 distributions with the different PFT scenarios suggest that hourly and local impacts from the different PFT distributions on occasional inter-deviations of O3 are quite noticeable, reaching up to 10 ppb. Exponentially diverging hourly BVOC emissions and O3 concentrations with increasing ambient temperature suggest that the use of representative PFT distributions becomes more critical for O3 air quality modeling (or forecasting) in support of air quality decision-making and human health studies.
NASA Astrophysics Data System (ADS)
Berthet, Lionel; Marty, Renaud; Bourgin, François; Viatgé, Julie; Piotte, Olivier; Perrin, Charles
2017-04-01
An increasing number of operational flood forecasting centres assess the predictive uncertainty associated with their forecasts and communicate it to the end users. This information can match the end users' needs (i.e. prove useful for efficient crisis management) only if it is reliable: reliability is therefore a key quality for operational flood forecasts. In 2015, the French flood forecasting national and regional services (Vigicrues network; www.vigicrues.gouv.fr) implemented a framework to compute quantitative discharge and water level forecasts and to assess the predictive uncertainty. Among the possible technical options to achieve this goal, a statistical analysis of past forecasting errors of deterministic models has been selected (QUOIQUE method, Bourgin, 2014). It is a data-based and non-parametric approach based on as few assumptions as possible about the mathematical structure of the forecasting errors. In particular, a very simple assumption is made regarding the predictive uncertainty distributions for large events outside the range of the calibration data: the multiplicative error distribution is assumed to be constant, whatever the magnitude of the flood. Indeed, the predictive distributions may not be reliable in extrapolation. However, estimating the predictive uncertainty for these rare events is crucial when major floods are of concern. In order to improve forecast reliability for major floods, an attempt is made to combine the operational strength of the empirical statistical analysis with simple error modelling. Since the heteroscedasticity of forecast errors can considerably weaken the predictive reliability for large floods, this error modelling is based on the log-sinh transformation, which proved to significantly reduce the heteroscedasticity of the transformed error in a simulation context, even for flood peaks (Wang et al., 2012). Exploratory tests on some operational forecasts issued during the recent floods experienced in France (major spring floods in June 2016 on the Loire river tributaries and flash floods in fall 2016) will be shown and discussed. References: Bourgin, F. (2014). How to assess the predictive uncertainty in hydrological modelling? An exploratory work on a large sample of watersheds, AgroParisTech. Wang, Q. J., Shrestha, D. L., Robertson, D. E. and Pokhrel, P. (2012). A log-sinh transformation for data normalization and variance stabilization. Water Resources Research, W05514, doi:10.1029/2011WR010973
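For reference, the log-sinh transformation of Wang et al. (2012) cited above is z = (1/b)·ln(sinh(a + b·y)); the sketch below applies it and its inverse to a few discharge values with illustrative, uncalibrated parameters a and b.

```python
# Minimal sketch of the log-sinh transformation: z = (1/b) * log(sinh(a + b*y)).
# Applied to discharge before the error analysis, it damps the growth of error
# variance with flow magnitude. Parameters a and b here are illustrative only.
import numpy as np

def log_sinh(y, a, b):
    return np.log(np.sinh(a + b * y)) / b

def log_sinh_inverse(z, a, b):
    return (np.arcsinh(np.exp(b * z)) - a) / b

a, b = 0.01, 0.002                                    # hypothetical transformation parameters
discharge = np.array([50.0, 200.0, 800.0, 2500.0])    # m^3/s
z = log_sinh(discharge, a, b)
print(z)
print(log_sinh_inverse(z, a, b))                      # recovers the original discharges
```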
Lieske, David J; Lloyd, Vett K
2018-03-01
Ixodes scapularis, a known vector of Borrelia burgdorferi sensu stricto (Bbss), is undergoing range expansion in many parts of Canada. The province of New Brunswick, which borders jurisdictions with established populations of I. scapularis, constitutes a range expansion zone for this species. To better understand the current and potential future distribution of this tick under climate change projections, this study applied occupancy modelling to distributional records of adult ticks that successfully overwintered, obtained through passive surveillance. This study indicates that I. scapularis occurs throughout the southern-most portion of the province, in close proximity to coastlines and major waterways. Milder winter conditions, as indicated by the number of degree days <0 °C, were determined to be a strong predictor of tick occurrence, as were, to a lesser degree, rising levels of annual precipitation, leading to a final model with a predictive accuracy of 0.845 (range: 0.828-0.893). Both RCP 4.5 and RCP 8.5 climate projections predict that a significant proportion of the province (roughly a quarter to a third) will be highly suitable for I. scapularis by the 2080s. Comparison with cases of canine infection shows good spatial agreement with baseline model predictions, but the presence of canine Borrelia infections beyond the climate envelope, defined by the highest probabilities of tick occurrence, suggests the presence of Bbss-carrying ticks distributed by long-range dispersal events. This research demonstrates that predictive statistical modelling of multi-year surveillance information is an efficient way to identify areas where I. scapularis is most likely to occur, and can be used to guide subsequent active sampling efforts in order to better understand fine-scale species distributional patterns. Copyright © 2018 The Authors. Published by Elsevier GmbH. All rights reserved.
The Unquiet State of Violent Relaxation
NASA Astrophysics Data System (ADS)
Henriksen, Richard
2005-08-01
In 1967 Lynden-Bell presented a statistical mechanical theory for the relaxation of collisionless systems. Since then this theory has been studied numerically and theoretically by many authors. Nakamura in 2000 gave an alternate theory that differed from that of Lynden-Bell by predicting a Gaussian equilibrium distribution function rather than Fermi-Dirac. More recently Henriksen in 2004 has used a coarse-graining technique on cosmological infall systems that also predicts a Gaussian equilibrium distribution function. These relaxed states are thought to occur from the centre of the system outwards. Simulations of cosmological cold dark-matter halos however persist in finding central density cusps (the NFW profile), which are inconsistent with the predicted distribution functions and perhaps with the observations of some galaxies. Some numerical studies (e.g. Merrall & Henriksen 2003) that attempt to measure the distribution function of dark matter do find Gaussian functions, provided that the initial asymmetry is not too great. Moreover, recent work at Queen's, reported here by MacMillan, suggests that it is the growth of asymmetry during the infall that produces the cusped behaviour. Put briefly, the essential physics of dark-matter relaxation remains "obscure", as does the validity of the theoretical predictions. "Violent virialization" occurs rapidly, well before subscale relaxation, but the scale at which the relaxation stops (and why) remains unclear. I will present some results that argue for wave-particle relaxation (Landau damping, as frequently suggested by Kandrup) and in addition I will suggest that the evolution of isolated systems is very different from that of systems constantly disturbed by infall. Isolated systems may become trapped in an unrelaxed state by the development or existence of multipolar internal structure. Nevertheless a suitable coarse graining of the system may restore the predicted distribution functions.
Simulating statistics of lightning-induced and man-made fires
NASA Astrophysics Data System (ADS)
Krenn, R.; Hergarten, S.
2009-04-01
The frequency-area distributions of forest fires show power-law behavior with scaling exponents α in a quite narrow range, relating wildfire research to the theoretical framework of self-organized criticality. Examples of self-organized critical behavior can be found in computer simulations of simple cellular automata. The established self-organized critical Drossel-Schwabl forest fire model (DS-FFM) is one of the most widespread models in this context. Despite its qualitative agreement with event-size statistics from nature, its applicability is still questioned. Apart from general concerns that the DS-FFM apparently oversimplifies the complex nature of forest dynamics, it significantly overestimates the frequency of large fires. We present a straightforward modification of the model rules that increases the scaling exponent α by approximately 1/3 and brings the simulated event-size statistics close to those observed in nature. In addition, combined simulations of both the original and the modified model predict a dependence of the overall distribution on the ratio of lightning-induced and man-made fires as well as a difference between their respective event-size statistics. The increase of the scaling exponent with decreasing lightning probability as well as the splitting of the partial distributions are confirmed by the analysis of the Canadian Large Fire Database. As a consequence, lightning-induced and man-made forest fires cannot be treated separately in wildfire modeling, hazard assessment and forest management.
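As an illustration of the modelling framework referenced above, the following Python sketch implements the standard Drossel-Schwabl forest fire cellular automaton (not the modified rule set proposed in the abstract) and records fire sizes to inspect the frequency-area statistics; the grid size, growth probability p and lightning probability f are illustrative choices.

    # Minimal sketch of the standard Drossel-Schwabl forest-fire model (DS-FFM):
    # at each step trees grow on empty sites with probability p, then with
    # probability f lightning strikes a random site; if it holds a tree, the
    # whole connected cluster burns instantly. Fire sizes are collected to
    # inspect the power-law-like frequency-area distribution.
    import numpy as np
    from collections import deque, Counter

    def burn_cluster(forest, i, j):
        """Burn (set to 0) the connected tree cluster containing (i, j); return its size."""
        n = forest.shape[0]
        queue, size = deque([(i, j)]), 0
        forest[i, j] = 0
        while queue:
            x, y = queue.popleft()
            size += 1
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                u, v = (x + dx) % n, (y + dy) % n   # periodic boundaries
                if forest[u, v] == 1:
                    forest[u, v] = 0
                    queue.append((u, v))
        return size

    def dsffm(n=128, p=0.05, f=0.0002, steps=20000, seed=None):
        rng = np.random.default_rng(seed)
        forest = np.zeros((n, n), dtype=np.int8)
        sizes = []
        for _ in range(steps):
            forest[(rng.random((n, n)) < p) & (forest == 0)] = 1   # tree growth
            i, j = rng.integers(0, n, size=2)                      # candidate lightning site
            if rng.random() < f and forest[i, j] == 1:
                sizes.append(burn_cluster(forest, i, j))
        return Counter(sizes)

    print(dsffm(steps=5000).most_common(5))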
NASA Technical Reports Server (NTRS)
Herskovits, E. H.; Megalooikonomou, V.; Davatzikos, C.; Chen, A.; Bryan, R. N.; Gerring, J. P.
1999-01-01
PURPOSE: To determine whether there is an association between the spatial distribution of lesions detected at magnetic resonance (MR) imaging of the brain in children after closed-head injury and the development of secondary attention-deficit/hyperactivity disorder (ADHD). MATERIALS AND METHODS: Data obtained from 76 children without prior history of ADHD were analyzed. MR images were obtained 3 months after closed-head injury. After manual delineation of lesions, images were registered to the Talairach coordinate system. For each subject, registered images and secondary ADHD status were integrated into a brain-image database, which contains depiction (visualization) and statistical analysis software. Using this database, we assessed visually the spatial distributions of lesions and performed statistical analysis of image and clinical variables. RESULTS: Of the 76 children, 15 developed secondary ADHD. Depiction of the data suggested that children who developed secondary ADHD had more lesions in the right putamen than children who did not develop secondary ADHD; this impression was confirmed statistically. After Bonferroni correction, we could not demonstrate significant differences between secondary ADHD status and lesion burdens for the right caudate nucleus or the right globus pallidus. CONCLUSION: Closed-head injury-induced lesions in the right putamen in children are associated with subsequent development of secondary ADHD. Depiction software is useful in guiding statistical analysis of image data.
Comparison of in silico models for prediction of mutagenicity.
Bakhtyari, Nazanin G; Raitano, Giuseppa; Benfenati, Emilio; Martin, Todd; Young, Douglas
2013-01-01
Using a dataset with more than 6000 compounds, the performance of eight quantitative structure-activity relationship (QSAR) models was evaluated: ACD/Tox Suite, Absorption, Distribution, Metabolism, Elimination, and Toxicity of chemical substances (ADMET) predictor, Derek, Toxicity Estimation Software Tool (T.E.S.T.), TOxicity Prediction by Komputer Assisted Technology (TOPKAT), Toxtree, CAESAR, and SARpy (SAR in python). In general, the results showed a high level of performance. To have a realistic estimate of the predictive ability, the results for chemicals inside and outside the training set for each model were considered. The effect of applicability domain tools (when available) on the prediction accuracy was also evaluated. The predictive tools included QSAR models, knowledge-based systems, and a combination of both methods. Models based on statistical QSAR methods gave better results.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sobottka, Marcelo, E-mail: sobottka@mtm.ufsc.br; Hart, Andrew G., E-mail: ahart@dim.uchile.cl
Highlights: We propose a simple stochastic model to construct primitive DNA sequences. The model provides an explanation for Chargaff's second parity rule in primitive DNA sequences. The model is also used to predict a novel type of strand symmetry in primitive DNA sequences. We extend the results to bacterial DNA sequences and compare distributional properties intrinsic to the model to statistical estimates from 1049 bacterial genomes. We find statistical evidence that the novel type of strand symmetry holds for bacterial DNA sequences. -- Abstract: Chargaff's second parity rule for short oligonucleotides states that the frequency of any short nucleotide sequence on a strand is approximately equal to the frequency of its reverse complement on the same strand. Recent studies have shown that, with the exception of organellar DNA, this parity rule generally holds for double-stranded DNA genomes and fails to hold for single-stranded genomes. While Chargaff's first parity rule is fully explained by the Watson-Crick pairing in the DNA double helix, a definitive explanation for the second parity rule has not yet been determined. In this work, we propose a model based on a hidden Markov process for approximating the distributional structure of primitive DNA sequences. Then, we use the model to provide another possible theoretical explanation for Chargaff's second parity rule, and to predict novel distributional aspects of bacterial DNA sequences.
"Submesoscale Soup" Vorticity and Tracer Statistics During the Lateral Mixing Experiment
NASA Astrophysics Data System (ADS)
Shcherbina, A.; D'Asaro, E. A.; Lee, C. M.; Molemaker, J.; McWilliams, J. C.
2012-12-01
A detailed view of upper-ocean velocity, vorticity, and tracer statistics was obtained by a unique synchronized two-vessel survey in the North Atlantic in winter 2012. In winter, the North Atlantic Mode Water region south of the Gulf Stream is filled with an energetic, homogeneous, and well-developed submesoscale turbulence field - the "submesoscale soup". Turbulence in the soup is produced by frontogenesis and the surface layer instability of mesoscale eddy flows in the vicinity of the Gulf Stream. This region is a convenient representation of the inertial range of the geophysical turbulence forward cascade spanning scales of O(1-100 km). During the Lateral Mixing Experiment in February-March 2012, R/Vs Atlantis and Knorr were run on parallel tracks 1 km apart for 500 km in the submesoscale soup region. Synchronous ADCP sampling provided the first in-situ estimates of full 3-D vorticity and divergence without the usual mix of spatial and temporal aliasing. Tracer distributions were also simultaneously sampled by both vessels using the underway and towed instrumentation. The observed vorticity distribution in the mixed layer was markedly asymmetric, with sparse strands of strong anticyclonic vorticity embedded in a weak, predominantly cyclonic background. While the mean vorticity was close to zero, the distribution skewness exceeded 2. These observations confirm theoretical and numerical model predictions for an active submesoscale turbulence field. Submesoscale vorticity spectra also agreed well with the model prediction.
IsoMAP (Isoscape Modeling, Analysis, and Prediction)
NASA Astrophysics Data System (ADS)
Miller, C. C.; Bowen, G. J.; Zhang, T.; Zhao, L.; West, J. B.; Liu, Z.; Rapolu, N.
2009-12-01
IsoMAP is a TeraGrid-based web portal aimed at building the infrastructure that brings together distributed multi-scale and multi-format geospatial datasets to enable statistical analysis and modeling of environmental isotopes. A typical workflow enabled by the portal includes (1) data source exploration and selection; (2) statistical analysis and model development; (3) predictive simulation of isotope distributions using models developed in (1) and (2); (4) analysis and interpretation of simulated spatial isotope distributions (e.g., comparison with independent observations, pattern analysis). The gridded models and data products created by one user can be shared and reused among users within the portal, enabling collaboration and knowledge transfer. This infrastructure and the research it fosters can lead to fundamental changes in our knowledge of the water cycle and ecological and biogeochemical processes through analysis of network-based isotope data, but it will be important A) that those with whom the data and models are shared can be sure of the origin, quality, inputs, and processing history of these products, and B) that the system is agile and intuitive enough to facilitate this sharing (rather than just ‘allow’ it). IsoMAP researchers are therefore building into the portal’s architecture several components meant to increase the amount of metadata about users’ products and to repurpose those metadata to make sharing and discovery more intuitive and robust for both expected, professional users and unforeseeable populations from other sectors.
Learning from Massive Distributed Data Sets (Invited)
NASA Astrophysics Data System (ADS)
Kang, E. L.; Braverman, A. J.
2013-12-01
Technologies for remote sensing and ever-expanding computer experiments in climate science are generating massive data sets. Meanwhile, it has been common in all areas of large-scale science to have these 'big data' distributed over multiple different physical locations, and moving large amounts of data can be impractical. In this talk, we will discuss efficient ways to summarize and learn from distributed data. We formulate a graphical model to mimic the main characteristics of a distributed-data network, including the size of the data sets and the speed of moving data. With this nominal model, we investigate the trade-off between prediction accuracy and the cost of data movement, theoretically and through simulation experiments. We will also discuss new implementations of spatial and spatio-temporal statistical methods optimized for distributed data.
Non-gaussianity versus nonlinearity of cosmological perturbations.
Verde, L
2001-06-01
Following the discovery of the cosmic microwave background, the hot big-bang model has become the standard cosmological model. In this theory, small primordial fluctuations are subsequently amplified by gravity to form the large-scale structure seen today. Different theories for unified models of particle physics lead to different predictions for the statistical properties of the primordial fluctuations, which can be divided into two classes: gaussian and non-gaussian. Convincing evidence for or against gaussian initial conditions would rule out many scenarios and point us toward a physical theory for the origin of structures. The statistical distribution of cosmological perturbations, as we observe them, can deviate from the gaussian distribution in several different ways. Even if perturbations start off gaussian, nonlinear gravitational evolution can introduce non-gaussian features. Additionally, our knowledge of the Universe comes principally from the study of luminous material such as galaxies, but galaxies might not be faithful tracers of the underlying mass distribution. The relationship between fluctuations in the mass and in the galaxy distribution (bias) is often assumed to be local, but could well be nonlinear. Moreover, galaxy catalogues use the redshift as the third spatial coordinate: the resulting redshift-space map of the galaxy distribution is nonlinearly distorted by peculiar velocities. Nonlinear gravitational evolution, biasing, and redshift-space distortion introduce non-gaussianity, even in an initially gaussian fluctuation field. I investigate the statistical tools that allow us, in principle, to disentangle the above different effects, and the observational datasets we require to do so in practice.
Chapter two: Phenomenology of tsunamis II: scaling, event statistics, and inter-event triggering
Geist, Eric L.
2012-01-01
Observations related to tsunami catalogs are reviewed and described in a phenomenological framework. An examination of scaling relationships between earthquake size (as expressed by scalar seismic moment and mean slip) and tsunami size (as expressed by mean and maximum local run-up and maximum far-field amplitude) indicates that scaling is significant at the 95% confidence level, although there is uncertainty in how well earthquake size can predict tsunami size (R2 ~ 0.4-0.6). In examining tsunami event statistics, current methods used to estimate the size distribution of earthquakes and landslides and the inter-event time distribution of earthquakes are first reviewed. These methods are adapted to estimate the size and inter-event distribution of tsunamis at a particular recording station. Using a modified Pareto size distribution, the best-fit power-law exponents of tsunamis recorded at nine Pacific tide-gauge stations exhibit marked variation, in contrast to the approximately constant power-law exponent for inter-plate thrust earthquakes. With regard to the inter-event time distribution, significant temporal clustering of tsunami sources is demonstrated. For tsunami sources occurring in close proximity to other sources in both space and time, a physical triggering mechanism, such as static stress transfer, is a likely cause for the anomalous clustering. Mechanisms of earthquake-to-earthquake and earthquake-to-landslide triggering are reviewed. Finally, a modification of statistical branching models developed for earthquake triggering is introduced to describe triggering among tsunami sources.
Statistics of natural binaural sounds.
Młynarski, Wiktor; Jost, Jürgen
2014-01-01
Binaural sound localization is usually considered a discrimination task, where interaural phase (IPD) and level (ILD) disparities at narrowly tuned frequency channels are utilized to identify the position of a sound source. In natural conditions, however, binaural circuits are exposed to stimulation by sound waves originating from multiple, often moving and overlapping sources. Therefore the statistics of binaural cues depend on the acoustic properties and the spatial configuration of the environment. The distribution of cues encountered naturally and their dependence on the physical properties of an auditory scene have not been studied before. In the present work we analyzed the statistics of naturally encountered binaural sounds. We performed binaural recordings of three auditory scenes with varying spatial configuration and analyzed the empirical cue distributions from each scene. We found that certain properties such as the spread of IPD distributions as well as the overall shape of ILD distributions do not vary strongly between different auditory scenes. Moreover, we found that ILD distributions vary much less across frequency channels and that IPDs often attain much higher values than can be predicted from head filtering properties. In order to understand the complexity of the binaural hearing task in the natural environment, sound waveforms were analyzed by performing Independent Component Analysis (ICA). Properties of the learned basis functions indicate that in natural conditions sound waves in each ear are predominantly generated by independent sources. This implies that real-world sound localization must rely on mechanisms more complex than mere cue extraction.
Transfer Entropy as a Log-Likelihood Ratio
NASA Astrophysics Data System (ADS)
Barnett, Lionel; Bossomaier, Terry
2012-09-01
Transfer entropy, an information-theoretic measure of time-directed information transfer between joint processes, has steadily gained popularity in the analysis of complex stochastic dynamics in diverse fields, including the neurosciences, ecology, climatology, and econometrics. We show that for a broad class of predictive models, the log-likelihood ratio test statistic for the null hypothesis of zero transfer entropy is a consistent estimator for the transfer entropy itself. For finite Markov chains, furthermore, no explicit model is required. In the general case, an asymptotic χ2 distribution is established for the transfer entropy estimator. The result generalizes the equivalence in the Gaussian case of transfer entropy and Granger causality, a statistical notion of causal influence based on prediction via vector autoregression, and establishes a fundamental connection between directed information transfer and causality in the Wiener-Granger sense.
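For the Gaussian case mentioned above, where transfer entropy reduces to Granger causality, the estimator can be sketched in Python as half the log ratio of residual variances from restricted and full autoregressions, with 2N times the estimate compared against a chi-square distribution; the simulated data and single-lag model order are illustrative, not taken from the paper.

    # Sketch of the Gaussian / linear-autoregressive case: transfer entropy from
    # Y to X estimated as half the log ratio of residual sums of squares of the
    # restricted (past of X only) and full (past of X and Y) regressions. Under
    # the null of zero transfer entropy, 2*N*TE is asymptotically chi-square
    # with (number of Y lags) degrees of freedom.
    import numpy as np
    from scipy import stats

    def lag_matrix(series, lags, start):
        """Rows t = start..len-1; column k holds series[t - k] for k = 1..lags."""
        n = len(series)
        return np.column_stack([series[start - k:n - k] for k in range(1, lags + 1)])

    def transfer_entropy_gaussian(x, y, lags=1):
        x, y = np.asarray(x, float), np.asarray(y, float)
        n = len(x)
        target = x[lags:]
        X_r = np.column_stack([np.ones(n - lags), lag_matrix(x, lags, lags)])   # restricted
        X_f = np.column_stack([X_r, lag_matrix(y, lags, lags)])                 # full
        rss_r = np.sum((target - X_r @ np.linalg.lstsq(X_r, target, rcond=None)[0]) ** 2)
        rss_f = np.sum((target - X_f @ np.linalg.lstsq(X_f, target, rcond=None)[0]) ** 2)
        te = 0.5 * np.log(rss_r / rss_f)          # transfer entropy estimate, in nats
        lr = 2.0 * (n - lags) * te                # log-likelihood ratio statistic
        return te, lr, stats.chi2.sf(lr, df=lags)

    rng = np.random.default_rng(0)
    y = rng.standard_normal(2000)
    x = np.zeros(2000)
    for t in range(1, 2000):                      # X is driven by the past of Y
        x[t] = 0.4 * x[t - 1] + 0.5 * y[t - 1] + rng.standard_normal()
    print(transfer_entropy_gaussian(x, y, lags=1))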
Statistical Anomalies of Bitflips in SRAMs to Discriminate SBUs From MCUs
NASA Astrophysics Data System (ADS)
Clemente, Juan Antonio; Franco, Francisco J.; Villa, Francesca; Baylac, Maud; Rey, Solenne; Mecha, Hortensia; Agapito, Juan A.; Puchner, Helmut; Hubert, Guillaume; Velazco, Raoul
2016-08-01
Recently, the occurrence of multiple events in static tests has been investigated by checking the statistical distribution of the difference between the addresses of the words containing bitflips. That method has been successfully applied to Field Programmable Gate Arrays (FPGAs) and the original authors indicate that it is also valid for SRAMs. This paper presents a modified methodology that is based on checking the XORed addresses with bitflips, rather than on the difference. Irradiation tests on CMOS 130 & 90 nm SRAMs with 14-MeV neutrons have been performed to validate this methodology. Results in high-altitude environments are also presented and cross-checked with theoretical predictions. In addition, this methodology has also been used to detect modifications in the organization of said memories. Theoretical predictions have been validated with actual data provided by the manufacturer.
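A minimal Python sketch of the address-XOR idea described above: if all bitflips were independent single-bit upsets, the XOR of any two affected word addresses would be close to uniformly distributed, so XOR values that recur far more often than chance point to multi-cell upsets tied to the memory organization. The addresses and the repetition threshold below are hypothetical.

    # Illustrative sketch of the address-XOR check: count how often each XOR
    # value occurs among all pairs of affected word addresses and flag values
    # that repeat suspiciously often. Addresses and threshold are hypothetical.
    from itertools import combinations
    from collections import Counter

    def xor_signature(addresses, min_repeats=3):
        """Return XOR values of address pairs that repeat at least min_repeats times."""
        counts = Counter(a ^ b for a, b in combinations(sorted(set(addresses)), 2))
        return {x: c for x, c in counts.items() if c >= min_repeats}

    # Hypothetical word addresses reported with bitflips during a static test
    flipped = [0x0040, 0x0041, 0x1040, 0x1041, 0x2040, 0x2041, 0x7f3a]
    print({hex(x): c for x, c in xor_signature(flipped).items()})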
How much to trust the senses: Likelihood learning
Sato, Yoshiyuki; Kording, Konrad P.
2014-01-01
Our brain often needs to estimate unknown variables from imperfect information. Our knowledge about the statistical distributions of quantities in our environment (called priors) and currently available information from sensory inputs (called likelihood) are the basis of all Bayesian models of perception and action. While we know that priors are learned, most studies of prior-likelihood integration simply assume that subjects know about the likelihood. However, as the quality of sensory inputs changes over time, we also need to learn about new likelihoods. Here, we show that human subjects readily learn the distribution of visual cues (likelihood function) in a way that can be predicted by models of statistically optimal learning. Using a likelihood that depended on color context, we found that a learned likelihood generalized to new priors. Thus, we conclude that subjects learn about likelihood. PMID:25398975
Statistical interpretation of transient current power-law decay in colloidal quantum dot arrays
NASA Astrophysics Data System (ADS)
Sibatov, R. T.
2011-08-01
A new statistical model of the charge transport in colloidal quantum dot arrays is proposed. It takes into account Coulomb blockade forbidding multiple occupancy of nanocrystals and the influence of energetic disorder of interdot space. The model explains power-law current transients and the presence of the memory effect. The fractional differential analogue of Ohm's law is found phenomenologically for nanocrystal arrays. The model combines ideas that were considered as conflicting by other authors: the Scher-Montroll idea about the power-law distribution of waiting times in localized states for disordered semiconductors is applied taking into account Coulomb blockade; Novikov's condition about the asymptotic power-law distribution of time intervals between successful current pulses in conduction channels is fulfilled; and the carrier injection blocking predicted by Ginger and Greenham (2000 J. Appl. Phys. 87 1361) takes place.
Spatial modelling of disease using data- and knowledge-driven approaches.
Stevens, Kim B; Pfeiffer, Dirk U
2011-09-01
The purpose of spatial modelling in animal and public health is three-fold: describing existing spatial patterns of risk, attempting to understand the biological mechanisms that lead to disease occurrence and predicting what will happen in the medium to long-term future (temporal prediction) or in different geographical areas (spatial prediction). Traditional methods for temporal and spatial predictions include general and generalized linear models (GLM), generalized additive models (GAM) and Bayesian estimation methods. However, such models require both disease presence and absence data which are not always easy to obtain. Novel spatial modelling methods such as maximum entropy (MAXENT) and the genetic algorithm for rule set production (GARP) require only disease presence data and have been used extensively in the fields of ecology and conservation, to model species distribution and habitat suitability. Other methods, such as multicriteria decision analysis (MCDA), use knowledge of the causal factors of disease occurrence to identify areas potentially suitable for disease. In addition to their less restrictive data requirements, some of these novel methods have been shown to outperform traditional statistical methods in predictive ability (Elith et al., 2006). This review paper provides details of some of these novel methods for mapping disease distribution, highlights their advantages and limitations, and identifies studies which have used the methods to model various aspects of disease distribution. Copyright © 2011. Published by Elsevier Ltd.
Cosmological velocity correlations - Observations and model predictions
NASA Technical Reports Server (NTRS)
Gorski, Krzysztof M.; Davis, Marc; Strauss, Michael A.; White, Simon D. M.; Yahil, Amos
1989-01-01
By applying the present simple statistics for two-point cosmological peculiar velocity-correlation measurements to the actual data sets of the Local Supercluster spiral galaxy sample of Aaronson et al. (1982) and the elliptical galaxy sample of Burstein et al. (1987), as well as to the velocity field predicted by the distribution of IRAS galaxies, a coherence length of 1100-1600 km/sec is obtained. Coherence length is defined as that separation at which the correlations drop to half their zero-lag value. These results are compared with predictions from two models of large-scale structure formation: that of cold dark matter and that of baryon isocurvature proposed by Peebles (1980). N-body simulations of these models are performed to check the linear theory predictions and measure sampling fluctuations.
PGT: A Statistical Approach to Prediction and Mechanism Design
NASA Astrophysics Data System (ADS)
Wolpert, David H.; Bono, James W.
One of the biggest challenges facing behavioral economics is the lack of a single theoretical framework that is capable of directly utilizing all types of behavioral data. One of the biggest challenges of game theory is the lack of a framework for making predictions and designing markets in a manner that is consistent with the axioms of decision theory. An approach in which solution concepts are distribution-valued rather than set-valued (as in equilibrium theory) has both capabilities. We call this approach Predictive Game Theory (or PGT). This paper outlines a general Bayesian approach to PGT. It also presents one simple example to illustrate the way in which this approach differs from equilibrium approaches in both prediction and mechanism design settings.
A log-sinh transformation for data normalization and variance stabilization
NASA Astrophysics Data System (ADS)
Wang, Q. J.; Shrestha, D. L.; Robertson, D. E.; Pokhrel, P.
2012-05-01
When quantifying model prediction uncertainty, it is statistically convenient to represent model errors that are normally distributed with a constant variance. The Box-Cox transformation is the most widely used technique to normalize data and stabilize variance, but it is not without limitations. In this paper, a log-sinh transformation is derived based on a pattern of errors commonly seen in hydrological model predictions. It is suited to applications where prediction variables are positively skewed and the spread of errors is seen to first increase rapidly, then slowly, and eventually approach a constant as the prediction variable becomes greater. The log-sinh transformation is applied in two case studies, and the results are compared with one- and two-parameter Box-Cox transformations.
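A small Python sketch of the transformation in its commonly quoted form, z = (1/b) ln(sinh(a + b y)), together with its inverse; the parameter values are illustrative rather than fitted. For small a + b y the transform behaves like a logarithm (strong variance stabilization) and for large values it becomes nearly linear, matching the error pattern described in the abstract.

    # Sketch of the log-sinh transformation and its inverse; a and b are
    # illustrative, not calibrated. The round-trip check confirms consistency
    # of the forward and inverse mappings.
    import numpy as np

    def log_sinh(y, a, b):
        return np.log(np.sinh(a + b * y)) / b

    def log_sinh_inverse(z, a, b):
        return (np.arcsinh(np.exp(b * z)) - a) / b

    a, b = 0.01, 0.002
    flows = np.array([5.0, 50.0, 500.0, 5000.0])            # hypothetical streamflow values
    z = log_sinh(flows, a, b)
    print(z)
    print(np.allclose(log_sinh_inverse(z, a, b), flows))    # round-trip check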
Statistical thermodynamics of clustered populations.
Matsoukas, Themis
2014-08-01
We present a thermodynamic theory for a generic population of M individuals distributed into N groups (clusters). We construct the ensemble of all distributions with fixed M and N, introduce a selection functional that embodies the physics that governs the population, and obtain the distribution that emerges in the scaling limit as the most probable among all distributions consistent with the given physics. We develop the thermodynamics of the ensemble and establish a rigorous mapping to regular thermodynamics. We treat the emergence of a so-called giant component as a formal phase transition and show that the criteria for its emergence are entirely analogous to the equilibrium conditions in molecular systems. We demonstrate the theory by an analytic model and confirm the predictions by Monte Carlo simulation.
Spatial epidemiology of bovine tuberculosis in Mexico.
Martínez, Horacio Zendejas; Suazo, Feliciano Milián; Cuador Gil, José Quintín; Bello, Gustavo Cruz; Anaya Escalera, Ana María; Márquez, Gabriel Huitrón; Casanova, Leticia García
2007-01-01
The purpose of this study was to use geographic information systems (GIS) and geostatistical methods of ordinary kriging to predict the prevalence and distribution of bovine tuberculosis (TB) in Jalisco, Mexico. A random sample of 2 287 herds selected from a set of 48 766 was used for the analysis. Spatial location of herds was obtained by either a personal global positioning system (GPS), a database from the Instituto Nacional de Estadística, Geografía e Informática (INEGI) or Google Earth. Information on TB prevalence was provided by the Jalisco Commission for the Control and Eradication of Tuberculosis (COEETB). Prediction of TB was obtained using ordinary kriging in the geostatistical analyst module in ArcView 8. A predicted high-prevalence area of TB matching the distribution of dairy cattle was observed. This prediction was in agreement with the prevalence calculated from the total 48 766 herds. Validation was performed by taking estimated values of TB prevalence at each municipality, extracted from the kriging surface, and comparing them with the real prevalence values using a correlation test, giving a value of 0.78, indicating that GIS and kriging are reliable tools for the estimation of TB distribution based on a random sample. This resulted in a significant saving of resources.
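A minimal ordinary-kriging sketch in Python, in the spirit of the analysis described above: prevalence at an unsampled location is predicted from nearby observations by solving the kriging system for an assumed exponential variogram. The coordinates, prevalence values and variogram parameters are hypothetical and do not come from the Jalisco dataset.

    # Ordinary kriging with an exponential variogram, solved as a small linear
    # system (variogram matrix bordered by ones for the Lagrange multiplier).
    # All inputs are hypothetical.
    import numpy as np

    def exp_variogram(h, nugget=0.01, sill=0.1, rng=50.0):
        return nugget + sill * (1.0 - np.exp(-h / rng))

    def ordinary_kriging(coords, values, target, **vario_kw):
        coords = np.asarray(coords, float)
        n = len(values)
        d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
        A = np.ones((n + 1, n + 1))
        A[:n, :n] = exp_variogram(d, **vario_kw)
        np.fill_diagonal(A[:n, :n], 0.0)          # variogram is zero at zero separation
        A[n, n] = 0.0
        b = np.ones(n + 1)
        b[:n] = exp_variogram(np.linalg.norm(coords - np.asarray(target, float), axis=1),
                              **vario_kw)
        w = np.linalg.solve(A, b)                 # weights plus Lagrange multiplier
        estimate = float(w[:n] @ np.asarray(values, float))
        variance = float(w @ b)                   # kriging variance
        return estimate, variance

    coords = [(0, 0), (10, 0), (0, 10), (30, 25), (40, 5)]   # km, hypothetical herd locations
    prevalence = [0.02, 0.05, 0.03, 0.15, 0.12]
    print(ordinary_kriging(coords, prevalence, target=(20, 10)))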
Statistical ecology comes of age.
Gimenez, Olivier; Buckland, Stephen T; Morgan, Byron J T; Bez, Nicolas; Bertrand, Sophie; Choquet, Rémi; Dray, Stéphane; Etienne, Marie-Pierre; Fewster, Rachel; Gosselin, Frédéric; Mérigot, Bastien; Monestiez, Pascal; Morales, Juan M; Mortier, Frédéric; Munoz, François; Ovaskainen, Otso; Pavoine, Sandrine; Pradel, Roger; Schurr, Frank M; Thomas, Len; Thuiller, Wilfried; Trenkel, Verena; de Valpine, Perry; Rexstad, Eric
2014-12-01
The desire to predict the consequences of global environmental change has been the driver towards more realistic models embracing the variability and uncertainties inherent in ecology. Statistical ecology has gelled over the past decade as a discipline that moves away from describing patterns towards modelling the ecological processes that generate these patterns. Following the fourth International Statistical Ecology Conference (1-4 July 2014) in Montpellier, France, we analyse current trends in statistical ecology. Important advances in the analysis of individual movement, and in the modelling of population dynamics and species distributions, are made possible by the increasing use of hierarchical and hidden process models. Exciting research perspectives include the development of methods to interpret citizen science data and of efficient, flexible computational algorithms for model fitting. Statistical ecology has come of age: it now provides a general and mathematically rigorous framework linking ecological theory and empirical data.
Modality-Driven Classification and Visualization of Ensemble Variance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bensema, Kevin; Gosink, Luke; Obermaier, Harald
Advances in computational power now enable domain scientists to address conceptual and parametric uncertainty by running simulations multiple times in order to sufficiently sample the uncertain input space. While this approach helps address conceptual and parametric uncertainties, the ensemble datasets produced by this technique present a special challenge to visualization researchers as the ensemble dataset records a distribution of possible values for each location in the domain. Contemporary visualization approaches that rely solely on summary statistics (e.g., mean and variance) cannot convey the detailed information encoded in ensemble distributions that are paramount to ensemble analysis; summary statistics provide no information about modality classification and modality persistence. To address this problem, we propose a novel technique that classifies high-variance locations based on the modality of the distribution of ensemble predictions. Additionally, we develop a set of confidence metrics to inform the end-user of the quality of fit between the distribution at a given location and its assigned class. We apply a similar method to time-varying ensembles to illustrate the relationship between peak variance and bimodal or multimodal behavior. These classification schemes enable a deeper understanding of the behavior of the ensemble members by distinguishing between distributions that can be described by a single tendency and distributions which reflect divergent trends in the ensemble.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Xiaoying; Liu, Chongxuan; Hu, Bill X.
The additivity model assumes that field-scale reaction properties in a sediment, including surface area, reactive site concentration, and reaction rate, can be predicted from the field-scale grain-size distribution by linearly adding reaction properties estimated in the laboratory for individual grain-size fractions. This study evaluated the additivity model in scaling mass transfer-limited, multi-rate uranyl (U(VI)) surface complexation reactions in a contaminated sediment. Experimental data of rate-limited U(VI) desorption in a stirred flow-cell reactor were used to estimate the statistical properties of the rate constants for individual grain-size fractions, which were then used to predict rate-limited U(VI) desorption in the composite sediment. The result indicated that the additivity model with respect to the rate of U(VI) desorption provided a good prediction of U(VI) desorption in the composite sediment. However, the rate constants were not directly scalable using the additivity model. An approximate additivity model for directly scaling rate constants was subsequently proposed and evaluated. The result showed that the approximate model provided a good prediction of the experimental results within statistical uncertainty. This study also found that a gravel-size fraction (2 to 8 mm), which is often ignored in modeling U(VI) sorption and desorption, is statistically significant to the U(VI) desorption in the sediment.
Lord, Dominique; Guikema, Seth D; Geedipally, Srinivas Reddy
2008-05-01
This paper documents the application of the Conway-Maxwell-Poisson (COM-Poisson) generalized linear model (GLM) for modeling motor vehicle crashes. The COM-Poisson distribution, originally developed in 1962, has recently been re-introduced by statisticians for analyzing count data subjected to over- and under-dispersion. This innovative distribution is an extension of the Poisson distribution. The objectives of this study were to evaluate the application of the COM-Poisson GLM for analyzing motor vehicle crashes and compare the results with the traditional negative binomial (NB) model. The comparison analysis was carried out using the most common functional forms employed by transportation safety analysts, which link crashes to the entering flows at intersections or on segments. To accomplish the objectives of the study, several NB and COM-Poisson GLMs were developed and compared using two datasets. The first dataset contained crash data collected at signalized four-legged intersections in Toronto, Ont. The second dataset included data collected for rural four-lane divided and undivided highways in Texas. Several methods were used to assess the statistical fit and predictive performance of the models. The results of this study show that COM-Poisson GLMs perform as well as NB models in terms of GOF statistics and predictive performance. Given the fact the COM-Poisson distribution can also handle under-dispersed data (while the NB distribution cannot or has difficulties converging), which have sometimes been observed in crash databases, the COM-Poisson GLM offers a better alternative over the NB model for modeling motor vehicle crashes, especially given the important limitations recently documented in the safety literature about the latter type of model.
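The COM-Poisson probability mass function underlying these GLMs can be sketched in Python as P(Y = y) = λ^y / ((y!)^ν Z(λ, ν)), with the normalizing constant Z truncated at a large upper bound; ν < 1 yields over-dispersion, ν > 1 under-dispersion, and ν = 1 recovers the Poisson. The parameter values below are illustrative only.

    # Sketch of the Conway-Maxwell-Poisson pmf with a truncated normalizing
    # constant, evaluated in log space for numerical stability.
    import numpy as np
    from scipy.special import gammaln

    def com_poisson_pmf(y, lam, nu, y_max=500):
        ks = np.arange(y_max + 1)
        log_terms = ks * np.log(lam) - nu * gammaln(ks + 1)
        log_z = np.logaddexp.reduce(log_terms)            # log normalizing constant Z
        y = np.asarray(y)
        return np.exp(y * np.log(lam) - nu * gammaln(y + 1) - log_z)

    lam, nu = 3.0, 0.7                                     # illustrative over-dispersed case
    ys = np.arange(10)
    pmf = com_poisson_pmf(ys, lam, nu)
    mean = np.sum(np.arange(501) * com_poisson_pmf(np.arange(501), lam, nu))
    print(pmf.round(4), "mean ~=", round(float(mean), 3))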
ERIC Educational Resources Information Center
Geoghegan, William H.
This paper demonstrates, with reference to the Samal of Tagtabon Island, how the structure of local domestic groups and the statistical distribution of residence types can be derived from a detailed description of the decision rules used by Samal native actors themselves in making, evaluating and predicting residence choices. Considering the…
NASA Astrophysics Data System (ADS)
Schwartz, M. Christian
2017-08-01
This paper addresses two straightforward questions. First, how similar are the statistics of cirrus particle size distribution (PSD) datasets collected using the Two-Dimensional Stereo (2D-S) probe to cirrus PSD datasets collected using older Particle Measuring Systems (PMS) 2-D Cloud (2DC) and 2-D Precipitation (2DP) probes? Second, how similar are the datasets when shatter-correcting post-processing is applied to the 2DC datasets? To answer these questions, a database of measured and parameterized cirrus PSDs - constructed from measurements taken during the Small Particles in Cirrus (SPARTICUS); Mid-latitude Airborne Cirrus Properties Experiment (MACPEX); and Tropical Composition, Cloud, and Climate Coupling (TC4) flight campaigns - is used. Bulk cloud quantities are computed from the 2D-S database in three ways: first, directly from the 2D-S data; second, by applying the 2D-S data to ice PSD parameterizations developed using sets of cirrus measurements collected using the older PMS probes; and third, by applying the 2D-S data to a similar parameterization developed using the 2D-S data themselves. This is done so that measurements of the same cloud volumes by parameterized versions of the 2DC and 2D-S can be compared with one another. It is thereby seen - given the same cloud field and given the same assumptions concerning ice crystal cross-sectional area, density, and radar cross section - that the parameterized 2D-S and the parameterized 2DC predict similar distributions of inferred shortwave extinction coefficient, ice water content, and 94 GHz radar reflectivity. However, the parameterization of the 2DC based on uncorrected data predicts a statistically significantly higher number of total ice crystals and a larger ratio of small ice crystals to large ice crystals than does the parameterized 2D-S. The 2DC parameterization based on shatter-corrected data also predicts statistically different numbers of ice crystals than does the parameterized 2D-S, but the comparison between the two is nevertheless more favorable. It is concluded that the older datasets continue to be useful for scientific purposes, with certain caveats, and that continuing field investigations of cirrus with more modern probes is desirable.
Intermediate quantum maps for quantum computation
NASA Astrophysics Data System (ADS)
Giraud, O.; Georgeot, B.
2005-10-01
We study quantum maps displaying spectral statistics intermediate between Poisson and Wigner-Dyson. It is shown that they can be simulated on a quantum computer with a small number of gates, and efficiently yield information about fidelity decay or spectral statistics. We study their matrix elements and entanglement production and show that they converge with time to distributions which differ from random matrix predictions. A randomized version of these maps can be implemented even more economically and yields pseudorandom operators with original properties, enabling, for example, one to produce fractal random vectors. These algorithms are within reach of present-day quantum computers.
An application of synthetic seismicity in earthquake statistics - The Middle America Trench
NASA Technical Reports Server (NTRS)
Ward, Steven N.
1992-01-01
It is shown how seismicity calculations based on the concept of fault segmentation, which incorporate the physics of faulting through static dislocation theory, can improve earthquake recurrence statistics and hone the probabilities of hazard. For the Middle America Trench, the spread parameters of the best-fitting lognormal or Weibull distributions (about 0.75) are much larger than the 0.21 intrinsic spread proposed in the Nishenko and Buland (1987) hypothesis. Stress interaction between fault segments disrupts time or slip predictability and causes earthquake recurrence to be far more aperiodic than has been suggested.
NASA Astrophysics Data System (ADS)
Jia, Huizhen; Sun, Quansen; Ji, Zexuan; Wang, Tonghan; Chen, Qiang
2014-11-01
The goal of no-reference/blind image quality assessment (NR-IQA) is to devise a perceptual model that can accurately predict the quality of a distorted image in agreement with human opinions, in which feature extraction is an important issue. However, the features used in the state-of-the-art "general purpose" NR-IQA algorithms are usually natural scene statistics (NSS) based or are perceptually relevant; therefore, the performance of these models is limited. To further improve the performance of NR-IQA, we propose a general-purpose NR-IQA algorithm which combines NSS-based features with perceptually relevant features. The new method extracts features in both the spatial and gradient domains. In the spatial domain, we extract the point-wise statistics for single pixel values, which are characterized by a generalized Gaussian distribution model to form the underlying features. In the gradient domain, statistical features based on neighboring gradient magnitude similarity are extracted. Then a mapping is learned to predict quality scores using a support vector regression. The experimental results on the benchmark image databases demonstrate that the proposed algorithm correlates highly with human judgments of quality and leads to significant performance improvements over state-of-the-art methods.
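One of the NSS-type spatial features mentioned above, the generalized Gaussian distribution (GGD) fit to normalized pixel statistics, can be sketched in Python with the standard moment-matching estimator; the synthetic heavy-tailed sample stands in for real normalized luminance coefficients and is not data from the paper.

    # Fit a zero-mean generalized Gaussian distribution, f(x) ~ exp(-(|x|/alpha)**beta),
    # by matching the sample moment ratio (E|x|)^2 / E[x^2] against its analytic form.
    import numpy as np
    from scipy.special import gamma

    def fit_ggd(x, betas=np.arange(0.2, 10.0, 0.001)):
        x = np.asarray(x, float).ravel()
        r = np.mean(np.abs(x)) ** 2 / np.mean(x ** 2)              # sample moment ratio
        rho = gamma(2.0 / betas) ** 2 / (gamma(1.0 / betas) * gamma(3.0 / betas))
        beta = betas[np.argmin(np.abs(rho - r))]                   # best-matching shape
        alpha = np.sqrt(np.mean(x ** 2) * gamma(1.0 / beta) / gamma(3.0 / beta))
        return beta, alpha

    rng = np.random.default_rng(1)
    coeffs = rng.laplace(scale=1.0, size=100_000)   # heavy-tailed stand-in for normalized coefficients
    print(fit_ggd(coeffs))                          # beta should come out near 1 for Laplacian data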
Testing statistical self-similarity in the topology of river networks
Troutman, Brent M.; Mantilla, Ricardo; Gupta, Vijay K.
2010-01-01
Recent work has demonstrated that the topological properties of real river networks deviate significantly from predictions of Shreve's random model. At the same time the property of mean self-similarity postulated by Tokunaga's model is well supported by data. Recently, a new class of network model called random self-similar networks (RSN) that combines self-similarity and randomness has been introduced to replicate important topological features observed in real river networks. We investigate if the hypothesis of statistical self-similarity in the RSN model is supported by data on a set of 30 basins located across the continental United States that encompass a wide range of hydroclimatic variability. We demonstrate that the generators of the RSN model obey a geometric distribution, and self-similarity holds in a statistical sense in 26 of these 30 basins. The parameters describing the distribution of interior and exterior generators are tested to be statistically different and the difference is shown to produce the well-known Hack's law. The inter-basin variability of RSN parameters is found to be statistically significant. We also test generator dependence on two climatic indices, mean annual precipitation and radiative index of dryness. Some indication of climatic influence on the generators is detected, but this influence is not statistically significant with the sample size available. Finally, two key applications of the RSN model to hydrology and geomorphology are briefly discussed.
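The goodness-of-fit check described above can be sketched in Python by fitting a geometric distribution to a histogram of generator values by maximum likelihood and applying a chi-square test after pooling the sparse tail; the generator counts used below are hypothetical, not the 30-basin data.

    # Fit P(K = k) = p (1 - p)^k, k = 0, 1, 2, ..., by maximum likelihood and
    # run a chi-square goodness-of-fit test with the tail pooled into the last bin.
    import numpy as np
    from scipy import stats

    def geometric_gof(counts):
        """counts[k] = number of generators with value k (k = 0, 1, 2, ...)."""
        counts = np.asarray(counts, float)
        ks = np.arange(len(counts))
        n, mean = counts.sum(), (ks * counts).sum() / counts.sum()
        p = 1.0 / (1.0 + mean)                          # MLE of the geometric parameter
        probs = p * (1.0 - p) ** ks
        probs[-1] = 1.0 - probs[:-1].sum()              # pool the tail into the last bin
        expected = n * probs
        chi2 = ((counts - expected) ** 2 / expected).sum()
        dof = len(counts) - 2                           # bins - 1 - one fitted parameter
        return p, chi2, stats.chi2.sf(chi2, dof)

    observed = [412, 203, 96, 55, 23, 11]               # hypothetical generator histogram
    print(geometric_gof(observed))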
Linear and non-linear bias: predictions versus measurements
NASA Astrophysics Data System (ADS)
Hoffmann, K.; Bel, J.; Gaztañaga, E.
2017-02-01
We study the linear and non-linear bias parameters which determine the mapping between the distributions of galaxies and the full matter density fields, comparing different measurements and predictions. Associating galaxies with dark matter haloes in the Marenostrum Institut de Ciències de l'Espai (MICE) Grand Challenge N-body simulation, we directly measure the bias parameters by comparing the smoothed density fluctuations of haloes and matter in the same region at different positions as a function of smoothing scale. Alternatively, we measure the bias parameters by matching the probability distributions of halo and matter density fluctuations, which can be applied to observations. These direct bias measurements are compared to corresponding measurements from two-point and different third-order correlations, as well as predictions from the peak-background model, which we presented in previous papers using the same data. We find an overall variation of the linear bias measurements and predictions of ~5 per cent with respect to results from two-point correlations for different halo samples with masses between ~10^12 and 10^15 h^-1 M⊙ at the redshifts z = 0.0 and 0.5. Variations between the second- and third-order bias parameters from the different methods show larger variations, but with consistent trends in mass and redshift. The various bias measurements reveal a tight relation between the linear and the quadratic bias parameters, which is consistent with results from the literature based on simulations with different cosmologies. Such a universal relation might improve constraints on cosmological models, derived from second-order clustering statistics at small scales or higher order clustering statistics.
Fan, Jiajie; Mohamed, Moumouni Guero; Qian, Cheng; Fan, Xuejun; Zhang, Guoqi; Pecht, Michael
2017-07-18
With the expanding application of light-emitting diodes (LEDs), the color quality of white LEDs has attracted much attention in several color-sensitive application fields, such as museum lighting, healthcare lighting and displays. Reliability concerns for white LEDs are changing from the luminous efficiency to color quality. However, most of the current available research on the reliability of LEDs is still focused on luminous flux depreciation rather than color shift failure. The spectral power distribution (SPD), defined as the radiant power distribution emitted by a light source at a range of visible wavelength, contains the most fundamental luminescence mechanisms of a light source. SPD is used as the quantitative inference of an LED's optical characteristics, including color coordinates that are widely used to represent the color shift process. Thus, to model the color shift failure of white LEDs during aging, this paper first extracts the features of an SPD, representing the characteristics of blue LED chips and phosphors, by multi-peak curve-fitting and modeling them with statistical functions. Then, because the shift processes of extracted features in aged LEDs are always nonlinear, a nonlinear state-space model is then developed to predict the color shift failure time within a self-adaptive particle filter framework. The results show that: (1) the failure mechanisms of LEDs can be identified by analyzing the extracted features of SPD with statistical curve-fitting and (2) the developed method can dynamically and accurately predict the color coordinates, correlated color temperatures (CCTs), and color rendering indexes (CRIs) of phosphor-converted (pc)-white LEDs, and also can estimate the residual color life.
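The multi-peak curve-fitting step described above can be sketched in Python by modelling a white-LED SPD as the sum of a blue-chip peak and a phosphor peak (two Gaussians here for simplicity) and extracting their heights, centre wavelengths and widths with scipy.optimize.curve_fit; the spectrum and initial guesses are synthetic and illustrative.

    # Fit a synthetic white-LED spectral power distribution as the sum of two
    # Gaussian peaks and report the fitted peak features.
    import numpy as np
    from scipy.optimize import curve_fit

    def two_peak_spd(wl, a1, mu1, s1, a2, mu2, s2):
        return (a1 * np.exp(-0.5 * ((wl - mu1) / s1) ** 2) +
                a2 * np.exp(-0.5 * ((wl - mu2) / s2) ** 2))

    wl = np.linspace(380, 780, 401)                        # wavelength grid, nm
    true = two_peak_spd(wl, 1.0, 450, 10, 0.6, 560, 50)    # hypothetical "blue + phosphor" SPD
    rng = np.random.default_rng(2)
    measured = true + rng.normal(scale=0.01, size=wl.size)

    p0 = [0.8, 455, 15, 0.5, 555, 60]                      # rough initial guesses
    params, _ = curve_fit(two_peak_spd, wl, measured, p0=p0)
    print(np.round(params, 2))    # fitted features: heights, peak wavelengths, widths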
Wang, Dan; Silkie, Sarah S; Nelson, Kara L; Wuertz, Stefan
2010-09-01
Cultivation- and library-independent, quantitative PCR-based methods have become the method of choice in microbial source tracking. However, these qPCR assays are not 100% specific and sensitive for the target sequence in their respective hosts' genome. The factors that can lead to false positive and false negative information in qPCR results are well defined. It is highly desirable to have a way of removing such false information to estimate the true concentration of host-specific genetic markers and help guide the interpretation of environmental monitoring studies. Here we propose a statistical model based on the Law of Total Probability to predict the true concentration of these markers. The distributions of the probabilities of obtaining false information are estimated from representative fecal samples of known origin. Measurement error is derived from the sample precision error of replicated qPCR reactions. Then, the Monte Carlo method is applied to sample from these distributions of probabilities and measurement error. The set of equations given by the Law of Total Probability allows one to calculate the distribution of true concentrations, from which their expected value, confidence interval and other statistical characteristics can be easily evaluated. The output distributions of predicted true concentrations can then be used as input to watershed-wide total maximum daily load determinations, quantitative microbial risk assessment and other environmental models. This model was validated by both statistical simulations and real world samples. It was able to correct the intrinsic false information associated with qPCR assays and output the distribution of true concentrations of Bacteroidales for each animal host group. Model performance was strongly affected by the precision error. It could perform reliably and precisely when the standard deviation of the precision error was small (≤ 0.1). Further improvement on the precision of sample processing and qPCR reaction would greatly improve the performance of the model. This methodology, built upon Bacteroidales assays, is readily transferable to any other microbial source indicator where a universal assay for fecal sources of that indicator exists. Copyright © 2010 Elsevier Ltd. All rights reserved.
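A deliberately simplified Python sketch of the Monte Carlo idea described above: assumed distributions for assay sensitivity, a false-positive background and qPCR measurement error are propagated through a total-probability-style relation and inverted draw by draw to obtain a distribution of the true marker concentration. The relation, distributions and numbers are hypothetical stand-ins, not the model or values fitted in the paper.

    # Simplified relation assumed for this sketch:
    #     observed = (sensitivity * true + false_background) * measurement_error
    # inverted draw by draw for the true concentration.
    import numpy as np

    rng = np.random.default_rng(3)
    n = 100_000
    observed = 5_000.0                                      # gene copies / 100 mL (hypothetical)

    sensitivity = rng.beta(40, 4, n)                        # probability-like recovery factor
    false_bg = rng.gamma(shape=2.0, scale=50.0, size=n)     # copies from non-target sources
    meas_err = rng.lognormal(mean=0.0, sigma=0.1, size=n)   # replicate qPCR precision error

    true_conc = (observed / meas_err - false_bg) / sensitivity
    true_conc = true_conc[true_conc > 0]                    # discard physically impossible draws

    print("median:", float(np.median(true_conc)),
          "95% interval:", np.percentile(true_conc, [2.5, 97.5]).round(1))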
Identifying the location of fire refuges in wet forest ecosystems.
Berry, Laurence E; Driscoll, Don A; Stein, John A; Blanchard, Wade; Banks, Sam C; Bradstock, Ross A; Lindenmayer, David B
2015-12-01
The increasing frequency of large, high-severity fires threatens the survival of old-growth specialist fauna in fire-prone forests. Within topographically diverse montane forests, areas that experience less severe or fewer fires compared with those prevailing in the landscape may present unique resource opportunities enabling old-growth specialist fauna to survive. Statistical landscape models that identify the extent and distribution of potential fire refuges may assist land managers to incorporate these areas into relevant biodiversity conservation strategies. We used a case study in an Australian wet montane forest to establish how predictive fire simulation models can be interpreted as management tools to identify potential fire refuges. We examined the relationship between the probability of fire refuge occurrence as predicted by an existing fire refuge model and fire severity experienced during a large wildfire. We also examined the extent to which local fire severity was influenced by fire severity in the surrounding landscape. We used a combination of statistical approaches, including generalized linear modeling, variogram analysis, and receiver operating characteristics and area under the curve analysis (ROC AUC). We found that the amount of unburned habitat and the factors influencing the retention and location of fire refuges varied with fire conditions. Under extreme fire conditions, the distribution of fire refuges was limited to only extremely sheltered, fire-resistant regions of the landscape. During extreme fire conditions, fire severity patterns were largely determined by stochastic factors that could not be predicted by the model. When fire conditions were moderate, physical landscape properties appeared to mediate fire severity distribution. Our study demonstrates that land managers can employ predictive landscape fire models to identify the broader climatic and spatial domain within which fire refuges are likely to be present. It is essential that within these envelopes, forest is protected from logging, roads, and other developments so that the ecological processes related to the establishment and subsequent use of fire refuges are maintained.
Gaussian copula as a likelihood function for environmental models
NASA Astrophysics Data System (ADS)
Wani, O.; Espadas, G.; Cecinati, F.; Rieckermann, J.
2017-12-01
Parameter estimation of environmental models always comes with uncertainty. To formally quantify this parametric uncertainty, a likelihood function needs to be formulated, which is defined as the probability of observations given fixed values of the parameter set. A likelihood function allows us to infer parameter values from observations using Bayes' theorem. The challenge is to formulate a likelihood function that reliably describes the error-generating processes which lead to the observed monitoring data, such as rainfall and runoff. If the likelihood function is not representative of the error statistics, the parameter inference will give biased parameter values. Several uncertainty estimation methods that are currently in use employ Gaussian processes as a likelihood function because of their favourable analytical properties. A Box-Cox transformation is suggested to deal with non-symmetric and heteroscedastic errors, e.g. for flow data, which are typically more uncertain at high flows than in periods with low flows. The problem with transformations is that the results are conditional on hyper-parameters, for which it is difficult to formulate the analyst's belief a priori. In an attempt to address this problem, in this research work we suggest learning the nature of the error distribution from the errors made by the model in "past" forecasts. We use a Gaussian copula to generate semiparametric error distributions. 1) We show that this copula can then be used as a likelihood function to infer parameters, breaking away from the practice of using multivariate normal distributions. Based on the results from a didactical example of predicting rainfall runoff, 2) we demonstrate that the copula captures the predictive uncertainty of the model. 3) Finally, we find that the properties of autocorrelation and heteroscedasticity of errors are captured well by the copula, eliminating the need to use transforms. In summary, our findings suggest that copulas are an interesting departure from the usage of fully parametric distributions as likelihood functions - and they could help us to better capture the statistical properties of errors and make more reliable predictions.
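A minimal sketch of the general idea, not the authors' implementation: given a sample of past model errors, rank-transform them to uniform scores (a semiparametric, empirical marginal), map them to normal scores, estimate the lag-1 correlation, and evaluate the Gaussian-copula log-density of a new error pair. The past-error sample here is synthetic, and a full likelihood would also add the marginal log-densities, which are omitted for brevity.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical past residuals of a runoff model (illustration only).
past_errors = rng.gamma(2.0, 1.0, 2000) - 2.0   # skewed, non-Gaussian errors

def to_uniform(x, reference):
    # rank-based probability integral transform against the reference sample
    return (np.searchsorted(np.sort(reference), x, side="right") + 0.5) / (len(reference) + 1)

# Build lag-1 pairs to capture the autocorrelation of the errors.
u = to_uniform(past_errors, past_errors)
z = stats.norm.ppf(u)                      # normal scores
Z = np.column_stack([z[:-1], z[1:]])       # (e_t, e_{t+1}) pairs
R = np.corrcoef(Z, rowvar=False)           # Gaussian-copula correlation matrix

def gaussian_copula_logdensity(u_pair, R):
    """Log-density of a bivariate Gaussian copula at uniform scores u_pair."""
    z = stats.norm.ppf(u_pair)
    Rinv = np.linalg.inv(R)
    quad = z @ (Rinv - np.eye(2)) @ z
    return -0.5 * (np.log(np.linalg.det(R)) + quad)

# Evaluate the copula factor of the likelihood for a new pair of errors.
new_pair = np.array([0.3, 0.5])
u_new = to_uniform(new_pair, past_errors)
print("copula log-density:", gaussian_copula_logdensity(u_new, R))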
New statistical scission-point model to predict fission fragment observables
NASA Astrophysics Data System (ADS)
Lemaître, Jean-François; Panebianco, Stefano; Sida, Jean-Luc; Hilaire, Stéphane; Heinrich, Sophie
2015-09-01
The development of high performance computing facilities makes possible the massive production of nuclear data in a fully microscopic framework. Taking advantage of individual potential calculations for more than 7000 nuclei, a new statistical scission-point model, called SPY, has been developed. It gives access to the absolute available energy at the scission point, which allows the use of a parameter-free microcanonical statistical description to calculate the distributions and mean values of all fission observables. SPY exploits the richness of microscopic inputs within a rather simple theoretical framework, without any parameter except the scission-point definition, to draw clear answers based on complete knowledge of the ingredients involved in the model, at very limited computing cost.
NASA Astrophysics Data System (ADS)
Aaboud, M.; Aad, G.; Abbott, B.; Abdinov, O.; Abeloos, B.; Abidi, S. H.; AbouZeid, O. S.; Abraham, N. L.; Abramowicz, H.; Abreu, H.; Abreu, R.; Abulaiti, Y.; Acharya, B. S.; Adachi, S.; Adamczyk, L.; Adelman, J.; Adersberger, M.; Adye, T.; Affolder, A. A.; Afik, Y.; Agatonovic-Jovin, T.; Agheorghiesei, C.; Aguilar-Saavedra, J. A.; Ahlen, S. P.; Ahmadov, F.; Aielli, G.; Akatsuka, S.; Akerstedt, H.; Åkesson, T. P. A.; Akilli, E.; Akimov, A. V.; Alberghi, G. L.; Albert, J.; Albicocco, P.; Alconada Verzini, M. J.; Alderweireldt, S. C.; Aleksa, M.; Aleksandrov, I. N.; Alexa, C.; Alexander, G.; Alexopoulos, T.; Alhroob, M.; Ali, B.; Aliev, M.; Alimonti, G.; Alison, J.; Alkire, S. P.; Allbrooke, B. M. M.; Allen, B. W.; Allport, P. P.; Aloisio, A.; Alonso, A.; Alonso, F.; Alpigiani, C.; Alshehri, A. A.; Alstaty, M. I.; Alvarez Gonzalez, B.; Álvarez Piqueras, D.; Alviggi, M. G.; Amadio, B. T.; Amaral Coutinho, Y.; Amelung, C.; Amidei, D.; Amor Dos Santos, S. P.; Amoroso, S.; Amundsen, G.; Anastopoulos, C.; Ancu, L. S.; Andari, N.; Andeen, T.; Anders, C. F.; Anders, J. K.; Anderson, K. J.; Andreazza, A.; Andrei, V.; Angelidakis, S.; Angelozzi, I.; Angerami, A.; Anisenkov, A. V.; Anjos, N.; Annovi, A.; Antel, C.; Antonelli, M.; Antonov, A.; Antrim, D. J.; Anulli, F.; Aoki, M.; Aperio Bella, L.; Arabidze, G.; Arai, Y.; Araque, J. P.; Araujo Ferraz, V.; Arce, A. T. H.; Ardell, R. E.; Arduh, F. A.; Arguin, J.-F.; Argyropoulos, S.; Arik, M.; Armbruster, A. J.; Armitage, L. J.; Arnaez, O.; Arnold, H.; Arratia, M.; Arslan, O.; Artamonov, A.; Artoni, G.; Artz, S.; Asai, S.; Asbah, N.; Ashkenazi, A.; Asquith, L.; Assamagan, K.; Astalos, R.; Atkinson, M.; Atlay, N. B.; Augsten, K.; Avolio, G.; Axen, B.; Ayoub, M. K.; Azuelos, G.; Baas, A. E.; Baca, M. J.; Bachacou, H.; Bachas, K.; Backes, M.; Bagnaia, P.; Bahmani, M.; Bahrasemani, H.; Baines, J. T.; Bajic, M.; Baker, O. K.; Bakker, P. J.; Baldin, E. M.; Balek, P.; Balli, F.; Balunas, W. K.; Banas, E.; Bandyopadhyay, A.; Banerjee, Sw.; Bannoura, A. A. E.; Barak, L.; Barberio, E. L.; Barberis, D.; Barbero, M.; Barillari, T.; Barisits, M.-S.; Barkeloo, J. T.; Barklow, T.; Barlow, N.; Barnes, S. L.; Barnett, B. M.; Barnett, R. M.; Barnovska-Blenessy, Z.; Baroncelli, A.; Barone, G.; Barr, A. J.; Barranco Navarro, L.; Barreiro, F.; Barreiro Guimarães da Costa, J.; Bartoldus, R.; Barton, A. E.; Bartos, P.; Basalaev, A.; Bassalat, A.; Bates, R. L.; Batista, S. J.; Batley, J. R.; Battaglia, M.; Bauce, M.; Bauer, F.; Bawa, H. S.; Beacham, J. B.; Beattie, M. D.; Beau, T.; Beauchemin, P. H.; Bechtle, P.; Beck, H. P.; Beck, H. C.; Becker, K.; Becker, M.; Becot, C.; Beddall, A. J.; Beddall, A.; Bednyakov, V. A.; Bedognetti, M.; Bee, C. P.; Beermann, T. A.; Begalli, M.; Begel, M.; Behr, J. K.; Bell, A. S.; Bella, G.; Bellagamba, L.; Bellerive, A.; Bellomo, M.; Belotskiy, K.; Beltramello, O.; Belyaev, N. L.; Benary, O.; Benchekroun, D.; Bender, M.; Benekos, N.; Benhammou, Y.; Benhar Noccioli, E.; Benitez, J.; Benjamin, D. P.; Benoit, M.; Bensinger, J. R.; Bentvelsen, S.; Beresford, L.; Beretta, M.; Berge, D.; Bergeaas Kuutmann, E.; Berger, N.; Beringer, J.; Berlendis, S.; Bernard, N. R.; Bernardi, G.; Bernius, C.; Bernlochner, F. U.; Berry, T.; Berta, P.; Bertella, C.; Bertoli, G.; Bertram, I. A.; Bertsche, C.; Bertsche, D.; Besjes, G. J.; Bessidskaia Bylund, O.; Bessner, M.; Besson, N.; Bethani, A.; Bethke, S.; Betti, A.; Bevan, A. J.; Beyer, J.; Bianchi, R. M.; Biebel, O.; Biedermann, D.; Bielski, R.; Bierwagen, K.; Biesuz, N. V.; Biglietti, M.; Billoud, T. R. 
V.; Bilokon, H.; Bindi, M.; Bingul, A.; Bini, C.; Biondi, S.; Bisanz, T.; Bittrich, C.; Bjergaard, D. M.; Black, J. E.; Black, K. M.; Blair, R. E.; Blazek, T.; Bloch, I.; Blocker, C.; Blue, A.; Blumenschein, U.; Blunier, S.; Bobbink, G. J.; Bobrovnikov, V. S.; Bocchetta, S. S.; Bocci, A.; Bock, C.; Boehler, M.; Boerner, D.; Bogavac, D.; Bogdanchikov, A. G.; Bohm, C.; Boisvert, V.; Bokan, P.; Bold, T.; Boldyrev, A. S.; Bolz, A. E.; Bomben, M.; Bona, M.; Boonekamp, M.; Borisov, A.; Borissov, G.; Bortfeldt, J.; Bortoletto, D.; Bortolotto, V.; Boscherini, D.; Bosman, M.; Bossio Sola, J. D.; Boudreau, J.; Bouhova-Thacker, E. V.; Boumediene, D.; Bourdarios, C.; Boutle, S. K.; Boveia, A.; Boyd, J.; Boyko, I. R.; Bozson, A. J.; Bracinik, J.; Brandt, A.; Brandt, G.; Brandt, O.; Braren, F.; Bratzler, U.; Brau, B.; Brau, J. E.; Breaden Madden, W. D.; Brendlinger, K.; Brennan, A. J.; Brenner, L.; Brenner, R.; Bressler, S.; Briglin, D. L.; Bristow, T. M.; Britton, D.; Britzger, D.; Brochu, F. M.; Brock, I.; Brock, R.; Brooijmans, G.; Brooks, T.; Brooks, W. K.; Brosamer, J.; Brost, E.; Broughton, J. H.; Bruckman de Renstrom, P. A.; Bruncko, D.; Bruni, A.; Bruni, G.; Bruni, L. S.; Bruno, S.; Brunt, BH; Bruschi, M.; Bruscino, N.; Bryant, P.; Bryngemark, L.; Buanes, T.; Buat, Q.; Buchholz, P.; Buckley, A. G.; Budagov, I. A.; Buehrer, F.; Bugge, M. K.; Bulekov, O.; Bullock, D.; Burch, T. J.; Burdin, S.; Burgard, C. D.; Burger, A. M.; Burghgrave, B.; Burka, K.; Burke, S.; Burmeister, I.; Burr, J. T. P.; Büscher, D.; Büscher, V.; Bussey, P.; Butler, J. M.; Buttar, C. M.; Butterworth, J. M.; Butti, P.; Buttinger, W.; Buzatu, A.; Buzykaev, A. R.; Cabrera Urbán, S.; Caforio, D.; Cai, H.; Cairo, V. M.; Cakir, O.; Calace, N.; Calafiura, P.; Calandri, A.; Calderini, G.; Calfayan, P.; Callea, G.; Caloba, L. P.; Calvente Lopez, S.; Calvet, D.; Calvet, S.; Calvet, T. P.; Camacho Toro, R.; Camarda, S.; Camarri, P.; Cameron, D.; Caminal Armadans, R.; Camincher, C.; Campana, S.; Campanelli, M.; Camplani, A.; Campoverde, A.; Canale, V.; Cano Bret, M.; Cantero, J.; Cao, T.; Capeans Garrido, M. D. M.; Caprini, I.; Caprini, M.; Capua, M.; Carbone, R. M.; Cardarelli, R.; Cardillo, F.; Carli, I.; Carli, T.; Carlino, G.; Carlson, B. T.; Carminati, L.; Carney, R. M. D.; Caron, S.; Carquin, E.; Carrá, S.; Carrillo-Montoya, G. D.; Casadei, D.; Casado, M. P.; Casha, A. F.; Casolino, M.; Casper, D. W.; Castelijn, R.; Castillo Gimenez, V.; Castro, N. F.; Catinaccio, A.; Catmore, J. R.; Cattai, A.; Caudron, J.; Cavaliere, V.; Cavallaro, E.; Cavalli, D.; Cavalli-Sforza, M.; Cavasinni, V.; Celebi, E.; Ceradini, F.; Cerda Alberich, L.; Cerqueira, A. S.; Cerri, A.; Cerrito, L.; Cerutti, F.; Cervelli, A.; Cetin, S. A.; Chafaq, A.; Chakraborty, D.; Chan, S. K.; Chan, W. S.; Chan, Y. L.; Chang, P.; Chapman, J. D.; Charlton, D. G.; Chau, C. C.; Chavez Barajas, C. A.; Che, S.; Cheatham, S.; Chegwidden, A.; Chekanov, S.; Chekulaev, S. V.; Chelkov, G. A.; Chelstowska, M. A.; Chen, C.; Chen, C.; Chen, H.; Chen, J.; Chen, S.; Chen, S.; Chen, X.; Chen, Y.; Cheng, H. C.; Cheng, H. J.; Cheplakov, A.; Cheremushkina, E.; Cherkaoui El Moursli, R.; Cheu, E.; Cheung, K.; Chevalier, L.; Chiarella, V.; Chiarelli, G.; Chiodini, G.; Chisholm, A. S.; Chitan, A.; Chiu, Y. H.; Chizhov, M. V.; Choi, K.; Chomont, A. R.; Chouridou, S.; Chow, Y. S.; Christodoulou, V.; Chu, M. C.; Chudoba, J.; Chuinard, A. J.; Chwastowski, J. J.; Chytka, L.; Ciftci, A. K.; Cinca, D.; Cindro, V.; Cioara, I. A.; Ciocio, A.; Cirotto, F.; Citron, Z. 
H.; Citterio, M.; Ciubancan, M.; Clark, A.; Clark, B. L.; Clark, M. R.; Clark, P. J.; Clarke, R. N.; Clement, C.; Coadou, Y.; Cobal, M.; Coccaro, A.; Cochran, J.; Colasurdo, L.; Cole, B.; Colijn, A. P.; Collot, J.; Colombo, T.; Conde Muiño, P.; Coniavitis, E.; Connell, S. H.; Connelly, I. A.; Constantinescu, S.; Conti, G.; Conventi, F.; Cooke, M.; Cooper-Sarkar, A. M.; Cormier, F.; Cormier, K. J. R.; Corradi, M.; Corriveau, F.; Cortes-Gonzalez, A.; Costa, G.; Costa, M. J.; Costanzo, D.; Cottin, G.; Cowan, G.; Cox, B. E.; Cranmer, K.; Crawley, S. J.; Creager, R. A.; Cree, G.; Crépé-Renaudin, S.; Crescioli, F.; Cribbs, W. A.; Cristinziani, M.; Croft, V.; Crosetti, G.; Cueto, A.; Cuhadar Donszelmann, T.; Cukierman, A. R.; Cummings, J.; Curatolo, M.; Cúth, J.; Czekierda, S.; Czodrowski, P.; D'amen, G.; D'Auria, S.; D'eramo, L.; D'Onofrio, M.; Da Cunha Sargedas De Sousa, M. J.; Da Via, C.; Dabrowski, W.; Dado, T.; Dai, T.; Dale, O.; Dallaire, F.; Dallapiccola, C.; Dam, M.; Dandoy, J. R.; Daneri, M. F.; Dang, N. P.; Daniells, A. C.; Dann, N. S.; Danninger, M.; Dano Hoffmann, M.; Dao, V.; Darbo, G.; Darmora, S.; Dassoulas, J.; Dattagupta, A.; Daubney, T.; Davey, W.; David, C.; Davidek, T.; Davis, D. R.; Davison, P.; Dawe, E.; Dawson, I.; De, K.; de Asmundis, R.; De Benedetti, A.; De Castro, S.; De Cecco, S.; De Groot, N.; de Jong, P.; De la Torre, H.; De Lorenzi, F.; De Maria, A.; De Pedis, D.; De Salvo, A.; De Sanctis, U.; De Santo, A.; De Vasconcelos Corga, K.; De Vivie De Regie, J. B.; Debbe, R.; Debenedetti, C.; Dedovich, D. V.; Dehghanian, N.; Deigaard, I.; Del Gaudio, M.; Del Peso, J.; Delgove, D.; Deliot, F.; Delitzsch, C. M.; Dell'Acqua, A.; Dell'Asta, L.; Dell'Orso, M.; Della Pietra, M.; della Volpe, D.; Delmastro, M.; Delporte, C.; Delsart, P. A.; DeMarco, D. A.; Demers, S.; Demichev, M.; Demilly, A.; Denisov, S. P.; Denysiuk, D.; Derendarz, D.; Derkaoui, J. E.; Derue, F.; Dervan, P.; Desch, K.; Deterre, C.; Dette, K.; Devesa, M. R.; Deviveiros, P. O.; Dewhurst, A.; Dhaliwal, S.; Di Bello, F. A.; Di Ciaccio, A.; Di Ciaccio, L.; Di Clemente, W. K.; Di Donato, C.; Di Girolamo, A.; Di Girolamo, B.; Di Micco, B.; Di Nardo, R.; Di Petrillo, K. F.; Di Simone, A.; Di Sipio, R.; Di Valentino, D.; Diaconu, C.; Diamond, M.; Dias, F. A.; Diaz, M. A.; Diehl, E. B.; Dietrich, J.; Díez Cornell, S.; Dimitrievska, A.; Dingfelder, J.; Dita, P.; Dita, S.; Dittus, F.; Djama, F.; Djobava, T.; Djuvsland, J. I.; do Vale, M. A. B.; Dobos, D.; Dobre, M.; Dodsworth, D.; Doglioni, C.; Dolejsi, J.; Dolezal, Z.; Donadelli, M.; Donati, S.; Dondero, P.; Donini, J.; Dopke, J.; Doria, A.; Dova, M. T.; Doyle, A. T.; Drechsler, E.; Dris, M.; Du, Y.; Duarte-Campderros, J.; Dubinin, F.; Dubreuil, A.; Duchovni, E.; Duckeck, G.; Ducourthial, A.; Ducu, O. A.; Duda, D.; Dudarev, A.; Dudder, A. Chr.; Duffield, E. M.; Duflot, L.; Dührssen, M.; Dulsen, C.; Dumancic, M.; Dumitriu, A. E.; Duncan, A. K.; Dunford, M.; Duperrin, A.; Duran Yildiz, H.; Düren, M.; Durglishvili, A.; Duschinger, D.; Dutta, B.; Duvnjak, D.; Dyndal, M.; Dziedzic, B. S.; Eckardt, C.; Ecker, K. M.; Edgar, R. C.; Eifert, T.; Eigen, G.; Einsweiler, K.; Ekelof, T.; El Kacimi, M.; El Kosseifi, R.; Ellajosyula, V.; Ellert, M.; Elles, S.; Ellinghaus, F.; Elliot, A. A.; Ellis, N.; Elmsheuser, J.; Elsing, M.; Emeliyanov, D.; Enari, Y.; Ennis, J. S.; Epland, M. B.; Erdmann, J.; Ereditato, A.; Ernst, M.; Errede, S.; Escalier, M.; Escobar, C.; Esposito, B.; Estrada Pastor, O.; Etienvre, A. 
I.; Etzion, E.; Evans, H.; Ezhilov, A.; Ezzi, M.; Fabbri, F.; Fabbri, L.; Fabiani, V.; Facini, G.; Fakhrutdinov, R. M.; Falciano, S.; Falla, R. J.; Faltova, J.; Fang, Y.; Fanti, M.; Farbin, A.; Farilla, A.; Farina, C.; Farina, E. M.; Farooque, T.; Farrell, S.; Farrington, S. M.; Farthouat, P.; Fassi, F.; Fassnacht, P.; Fassouliotis, D.; Faucci Giannelli, M.; Favareto, A.; Fawcett, W. J.; Fayard, L.; Fedin, O. L.; Fedorko, W.; Feigl, S.; Feligioni, L.; Feng, C.; Feng, E. J.; Fenton, M. J.; Fenyuk, A. B.; Feremenga, L.; Fernandez Martinez, P.; Ferrando, J.; Ferrari, A.; Ferrari, P.; Ferrari, R.; Ferreira de Lima, D. E.; Ferrer, A.; Ferrere, D.; Ferretti, C.; Fiedler, F.; Filipčič, A.; Filipuzzi, M.; Filthaut, F.; Fincke-Keeler, M.; Finelli, K. D.; Fiolhais, M. C. N.; Fiorini, L.; Fischer, A.; Fischer, C.; Fischer, J.; Fisher, W. C.; Flaschel, N.; Fleck, I.; Fleischmann, P.; Fletcher, R. R. M.; Flick, T.; Flierl, B. M.; Flores Castillo, L. R.; Flowerdew, M. J.; Forcolin, G. T.; Formica, A.; Förster, F. A.; Forti, A.; Foster, A. G.; Fournier, D.; Fox, H.; Fracchia, S.; Francavilla, P.; Franchini, M.; Franchino, S.; Francis, D.; Franconi, L.; Franklin, M.; Frate, M.; Fraternali, M.; Freeborn, D.; Fressard-Batraneanu, S. M.; Freund, B.; Froidevaux, D.; Frost, J. A.; Fukunaga, C.; Fusayasu, T.; Fuster, J.; Gabizon, O.; Gabrielli, A.; Gabrielli, A.; Gach, G. P.; Gadatsch, S.; Gadomski, S.; Gagliardi, G.; Gagnon, L. G.; Galea, C.; Galhardo, B.; Gallas, E. J.; Gallop, B. J.; Gallus, P.; Galster, G.; Gan, K. K.; Ganguly, S.; Gao, Y.; Gao, Y. S.; Garay Walls, F. M.; García, C.; García Navarro, J. E.; García Pascual, J. A.; Garcia-Sciveres, M.; Gardner, R. W.; Garelli, N.; Garonne, V.; Gascon Bravo, A.; Gasnikova, K.; Gatti, C.; Gaudiello, A.; Gaudio, G.; Gavrilenko, I. L.; Gay, C.; Gaycken, G.; Gazis, E. N.; Gee, C. N. P.; Geisen, J.; Geisen, M.; Geisler, M. P.; Gellerstedt, K.; Gemme, C.; Genest, M. H.; Geng, C.; Gentile, S.; Gentsos, C.; George, S.; Gerbaudo, D.; Geßner, G.; Ghasemi, S.; Ghneimat, M.; Giacobbe, B.; Giagu, S.; Giangiacomi, N.; Giannetti, P.; Gibson, S. M.; Gignac, M.; Gilchriese, M.; Gillberg, D.; Gilles, G.; Gingrich, D. M.; Giordani, M. P.; Giorgi, F. M.; Giraud, P. F.; Giromini, P.; Giugliarelli, G.; Giugni, D.; Giuli, F.; Giuliani, C.; Giulini, M.; Gjelsten, B. K.; Gkaitatzis, S.; Gkialas, I.; Gkougkousis, E. L.; Gkountoumis, P.; Gladilin, L. K.; Glasman, C.; Glatzer, J.; Glaysher, P. C. F.; Glazov, A.; Goblirsch-Kolb, M.; Godlewski, J.; Goldfarb, S.; Golling, T.; Golubkov, D.; Gomes, A.; Gonçalo, R.; Goncalves Gama, R.; Goncalves Pinto Firmino Da Costa, J.; Gonella, G.; Gonella, L.; Gongadze, A.; Gonski, J. L.; González de la Hoz, S.; Gonzalez-Sevilla, S.; Goossens, L.; Gorbounov, P. A.; Gordon, H. A.; Gorelov, I.; Gorini, B.; Gorini, E.; Gorišek, A.; Goshaw, A. T.; Gössling, C.; Gostkin, M. I.; Gottardo, C. A.; Goudet, C. R.; Goujdami, D.; Goussiou, A. G.; Govender, N.; Gozani, E.; Grabowska-Bold, I.; Gradin, P. O. J.; Gramling, J.; Gramstad, E.; Grancagnolo, S.; Gratchev, V.; Gravila, P. M.; Gray, C.; Gray, H. M.; Greenwood, Z. D.; Grefe, C.; Gregersen, K.; Gregor, I. M.; Grenier, P.; Grevtsov, K.; Griffiths, J.; Grillo, A. A.; Grimm, K.; Grinstein, S.; Gris, Ph.; Grivaz, J.-F.; Groh, S.; Gross, E.; Grosse-Knetter, J.; Grossi, G. C.; Grout, Z. J.; Grummer, A.; Guan, L.; Guan, W.; Guenther, J.; Guescini, F.; Guest, D.; Gueta, O.; Gui, B.; Guido, E.; Guillemin, T.; Guindon, S.; Gul, U.; Gumpert, C.; Guo, J.; Guo, W.; Guo, Y.; Gupta, R.; Gurbuz, S.; Gustavino, G.; Gutelman, B. 
J.; Gutierrez, P.; Gutierrez Ortiz, N. G.; Gutschow, C.; Guyot, C.; Guzik, M. P.; Gwenlan, C.; Gwilliam, C. B.; Haas, A.; Haber, C.; Hadavand, H. K.; Haddad, N.; Hadef, A.; Hageböck, S.; Hagihara, M.; Hakobyan, H.; Haleem, M.; Haley, J.; Halladjian, G.; Hallewell, G. D.; Hamacher, K.; Hamal, P.; Hamano, K.; Hamilton, A.; Hamity, G. N.; Hamnett, P. G.; Han, L.; Han, S.; Hanagaki, K.; Hanawa, K.; Hance, M.; Handl, D. M.; Haney, B.; Hanke, P.; Hansen, J. B.; Hansen, J. D.; Hansen, M. C.; Hansen, P. H.; Hara, K.; Hard, A. S.; Harenberg, T.; Hariri, F.; Harkusha, S.; Harrison, P. F.; Hartmann, N. M.; Hasegawa, Y.; Hasib, A.; Hassani, S.; Haug, S.; Hauser, R.; Hauswald, L.; Havener, L. B.; Havranek, M.; Hawkes, C. M.; Hawkings, R. J.; Hayakawa, D.; Hayden, D.; Hays, C. P.; Hays, J. M.; Hayward, H. S.; Haywood, S. J.; Head, S. J.; Heck, T.; Hedberg, V.; Heelan, L.; Heer, S.; Heidegger, K. K.; Heim, S.; Heim, T.; Heinemann, B.; Heinrich, J. J.; Heinrich, L.; Heinz, C.; Hejbal, J.; Helary, L.; Held, A.; Hellman, S.; Helsens, C.; Henderson, R. C. W.; Heng, Y.; Henkelmann, S.; Henriques Correia, A. M.; Henrot-Versille, S.; Herbert, G. H.; Herde, H.; Herget, V.; Hernández Jiménez, Y.; Herr, H.; Herten, G.; Hertenberger, R.; Hervas, L.; Herwig, T. C.; Hesketh, G. G.; Hessey, N. P.; Hetherly, J. W.; Higashino, S.; Higón-Rodriguez, E.; Hildebrand, K.; Hill, E.; Hill, J. C.; Hiller, K. H.; Hillier, S. J.; Hils, M.; Hinchliffe, I.; Hirose, M.; Hirschbuehl, D.; Hiti, B.; Hladik, O.; Hlaluku, D. R.; Hoad, X.; Hobbs, J.; Hod, N.; Hodgkinson, M. C.; Hodgson, P.; Hoecker, A.; Hoeferkamp, M. R.; Hoenig, F.; Hohn, D.; Holmes, T. R.; Homann, M.; Honda, S.; Honda, T.; Hong, T. M.; Hooberman, B. H.; Hopkins, W. H.; Horii, Y.; Horton, A. J.; Hostachy, J.-Y.; Hostiuc, A.; Hou, S.; Hoummada, A.; Howarth, J.; Hoya, J.; Hrabovsky, M.; Hrdinka, J.; Hristova, I.; Hrivnac, J.; Hryn'ova, T.; Hrynevich, A.; Hsu, P. J.; Hsu, S.-C.; Hu, Q.; Hu, S.; Huang, Y.; Hubacek, Z.; Hubaut, F.; Huegging, F.; Huffman, T. B.; Hughes, E. W.; Huhtinen, M.; Hunter, R. F. H.; Huo, P.; Huseynov, N.; Huston, J.; Huth, J.; Hyneman, R.; Iacobucci, G.; Iakovidis, G.; Ibragimov, I.; Iconomidou-Fayard, L.; Idrissi, Z.; Iengo, P.; Igonkina, O.; Iizawa, T.; Ikegami, Y.; Ikeno, M.; Ilchenko, Y.; Iliadis, D.; Ilic, N.; Iltzsche, F.; Introzzi, G.; Ioannou, P.; Iodice, M.; Iordanidou, K.; Ippolito, V.; Isacson, M. F.; Ishijima, N.; Ishino, M.; Ishitsuka, M.; Issever, C.; Istin, S.; Ito, F.; Iturbe Ponce, J. M.; Iuppa, R.; Iwasaki, H.; Izen, J. M.; Izzo, V.; Jabbar, S.; Jackson, P.; Jacobs, R. M.; Jain, V.; Jakobi, K. B.; Jakobs, K.; Jakobsen, S.; Jakoubek, T.; Jamin, D. O.; Jana, D. K.; Jansky, R.; Janssen, J.; Janus, M.; Janus, P. A.; Jarlskog, G.; Javadov, N.; Javůrek, T.; Javurkova, M.; Jeanneau, F.; Jeanty, L.; Jejelava, J.; Jelinskas, A.; Jenni, P.; Jeske, C.; Jézéquel, S.; Ji, H.; Jia, J.; Jiang, H.; Jiang, Y.; Jiang, Z.; Jiggins, S.; Jimenez Pena, J.; Jin, S.; Jinaru, A.; Jinnouchi, O.; Jivan, H.; Johansson, P.; Johns, K. A.; Johnson, C. A.; Johnson, W. J.; Jon-And, K.; Jones, R. W. L.; Jones, S. D.; Jones, S.; Jones, T. J.; Jongmanns, J.; Jorge, P. M.; Jovicevic, J.; Ju, X.; Juste Rozas, A.; Köhler, M. K.; Kaczmarska, A.; Kado, M.; Kagan, H.; Kagan, M.; Kahn, S. J.; Kaji, T.; Kajomovitz, E.; Kalderon, C. W.; Kaluza, A.; Kama, S.; Kamenshchikov, A.; Kanaya, N.; Kanjir, L.; Kantserov, V. A.; Kanzaki, J.; Kaplan, B.; Kaplan, L. S.; Kar, D.; Karakostas, K.; Karastathis, N.; Kareem, M. J.; Karentzos, E.; Karpov, S. N.; Karpova, Z. 
M.; Karthik, K.; Kartvelishvili, V.; Karyukhin, A. N.; Kasahara, K.; Kashif, L.; Kass, R. D.; Kastanas, A.; Kataoka, Y.; Kato, C.; Katre, A.; Katzy, J.; Kawade, K.; Kawagoe, K.; Kawamoto, T.; Kawamura, G.; Kay, E. F.; Kazanin, V. F.; Keeler, R.; Kehoe, R.; Keller, J. S.; Kellermann, E.; Kempster, J. J.; Kendrick, J.; Keoshkerian, H.; Kepka, O.; Kerševan, B. P.; Kersten, S.; Keyes, R. A.; Khader, M.; Khalil-zada, F.; Khanov, A.; Kharlamov, A. G.; Kharlamova, T.; Khodinov, A.; Khoo, T. J.; Khovanskiy, V.; Khramov, E.; Khubua, J.; Kido, S.; Kilby, C. R.; Kim, H. Y.; Kim, S. H.; Kim, Y. K.; Kimura, N.; Kind, O. M.; King, B. T.; Kirchmeier, D.; Kirk, J.; Kiryunin, A. E.; Kishimoto, T.; Kisielewska, D.; Kitali, V.; Kivernyk, O.; Kladiva, E.; Klapdor-Kleingrothaus, T.; Klein, M. H.; Klein, M.; Klein, U.; Kleinknecht, K.; Klimek, P.; Klimentov, A.; Klingenberg, R.; Klingl, T.; Klioutchnikova, T.; Klitzner, F. F.; Kluge, E.-E.; Kluit, P.; Kluth, S.; Kneringer, E.; Knoops, E. B. F. G.; Knue, A.; Kobayashi, A.; Kobayashi, D.; Kobayashi, T.; Kobel, M.; Kocian, M.; Kodys, P.; Koffas, T.; Koffeman, E.; Köhler, N. M.; Koi, T.; Kolb, M.; Koletsou, I.; Komar, A. A.; Kondo, T.; Kondrashova, N.; Köneke, K.; König, A. C.; Kono, T.; Konoplich, R.; Konstantinidis, N.; Konya, B.; Kopeliansky, R.; Koperny, S.; Kopp, A. K.; Korcyl, K.; Kordas, K.; Korn, A.; Korol, A. A.; Korolkov, I.; Korolkova, E. V.; Kortner, O.; Kortner, S.; Kosek, T.; Kostyukhin, V. V.; Kotwal, A.; Koulouris, A.; Kourkoumeli-Charalampidi, A.; Kourkoumelis, C.; Kourlitis, E.; Kouskoura, V.; Kowalewska, A. B.; Kowalewski, R.; Kowalski, T. Z.; Kozakai, C.; Kozanecki, W.; Kozhin, A. S.; Kramarenko, V. A.; Kramberger, G.; Krasnopevtsev, D.; Krasny, M. W.; Krasznahorkay, A.; Krauss, D.; Kremer, J. A.; Kretzschmar, J.; Kreutzfeldt, K.; Krieger, P.; Krizka, K.; Kroeninger, K.; Kroha, H.; Kroll, J.; Kroll, J.; Kroseberg, J.; Krstic, J.; Kruchonak, U.; Krüger, H.; Krumnack, N.; Kruse, M. C.; Kubota, T.; Kucuk, H.; Kuday, S.; Kuechler, J. T.; Kuehn, S.; Kugel, A.; Kuger, F.; Kuhl, T.; Kukhtin, V.; Kukla, R.; Kulchitsky, Y.; Kuleshov, S.; Kulinich, Y. P.; Kuna, M.; Kunigo, T.; Kupco, A.; Kupfer, T.; Kuprash, O.; Kurashige, H.; Kurchaninov, L. L.; Kurochkin, Y. A.; Kurth, M. G.; Kuwertz, E. S.; Kuze, M.; Kvita, J.; Kwan, T.; Kyriazopoulos, D.; La Rosa, A.; La Rosa Navarro, J. L.; La Rotonda, L.; La Ruffa, F.; Lacasta, C.; Lacava, F.; Lacey, J.; Lack, D. P. J.; Lacker, H.; Lacour, D.; Ladygin, E.; Lafaye, R.; Laforge, B.; Lagouri, T.; Lai, S.; Lammers, S.; Lampl, W.; Lançon, E.; Landgraf, U.; Landon, M. P. J.; Lanfermann, M. C.; Lang, V. S.; Lange, J. C.; Langenberg, R. J.; Lankford, A. J.; Lanni, F.; Lantzsch, K.; Lanza, A.; Lapertosa, A.; Laplace, S.; Laporte, J. F.; Lari, T.; Lasagni Manghi, F.; Lassnig, M.; Lau, T. S.; Laurelli, P.; Lavrijsen, W.; Law, A. T.; Laycock, P.; Lazovich, T.; Lazzaroni, M.; Le, B.; Le Dortz, O.; Le Guirriec, E.; Le Quilleuc, E. P.; LeBlanc, M.; LeCompte, T.; Ledroit-Guillon, F.; Lee, C. A.; Lee, G. R.; Lee, S. C.; Lee, L.; Lefebvre, B.; Lefebvre, G.; Lefebvre, M.; Legger, F.; Leggett, C.; Lehmann Miotto, G.; Lei, X.; Leight, W. A.; Leite, M. A. L.; Leitner, R.; Lellouch, D.; Lemmer, B.; Leney, K. J. C.; Lenz, T.; Lenzi, B.; Leone, R.; Leone, S.; Leonidopoulos, C.; Lerner, G.; Leroy, C.; Les, R.; Lesage, A. A. J.; Lester, C. G.; Levchenko, M.; Levêque, J.; Levin, D.; Levinson, L. 
J.; Levy, M.; Lewis, D.; Li, B.; Li, C.; Li, H.; Li, L.; Li, Q.; Li, Q.; Li, S.; Li, X.; Li, Y.; Liang, Z.; Liberti, B.; Liblong, A.; Lie, K.; Liebal, J.; Liebig, W.; Limosani, A.; Lin, C. Y.; Lin, K.; Lin, S. C.; Lin, T. H.; Linck, R. A.; Lindquist, B. E.; Lionti, A. E.; Lipeles, E.; Lipniacka, A.; Lisovyi, M.; Liss, T. M.; Lister, A.; Litke, A. M.; Liu, B.; Liu, H.; Liu, H.; Liu, J. K. K.; Liu, J.; Liu, J. B.; Liu, K.; Liu, L.; Liu, M.; Liu, Y. L.; Liu, Y.; Livan, M.; Lleres, A.; Llorente Merino, J.; Lloyd, S. L.; Lo, C. Y.; Sterzo, F. Lo; Lobodzinska, E. M.; Loch, P.; Loebinger, F. K.; Loesle, A.; Loew, K. M.; Lohse, T.; Lohwasser, K.; Lokajicek, M.; Long, B. A.; Long, J. D.; Long, R. E.; Longo, L.; Looper, K. A.; Lopez, J. A.; Lopez Paz, I.; Lopez Solis, A.; Lorenz, J.; Lorenzo Martinez, N.; Losada, M.; Lösel, P. J.; Lou, X.; Lounis, A.; Love, J.; Love, P. A.; Lu, H.; Lu, N.; Lu, Y. J.; Lubatti, H. J.; Luci, C.; Lucotte, A.; Luedtke, C.; Luehring, F.; Lukas, W.; Luminari, L.; Lundberg, O.; Lund-Jensen, B.; Lutz, M. S.; Luzi, P. M.; Lynn, D.; Lysak, R.; Lytken, E.; Lyu, F.; Lyubushkin, V.; Ma, H.; Ma, L. L.; Ma, Y.; Maccarrone, G.; Macchiolo, A.; Macdonald, C. M.; Maček, B.; Machado Miguens, J.; Madaffari, D.; Madar, R.; Mader, W. F.; Madsen, A.; Madysa, N.; Maeda, J.; Maeland, S.; Maeno, T.; Maevskiy, A. S.; Magerl, V.; Maiani, C.; Maidantchik, C.; Maier, T.; Maio, A.; Majersky, O.; Majewski, S.; Makida, Y.; Makovec, N.; Malaescu, B.; Malecki, Pa.; Maleev, V. P.; Malek, F.; Mallik, U.; Malon, D.; Malone, C.; Maltezos, S.; Malyukov, S.; Mamuzic, J.; Mancini, G.; Mandić, I.; Maneira, J.; Manhaes de Andrade Filho, L.; Manjarres Ramos, J.; Mankinen, K. H.; Mann, A.; Manousos, A.; Mansoulie, B.; Mansour, J. D.; Mantifel, R.; Mantoani, M.; Manzoni, S.; Mapelli, L.; Marceca, G.; March, L.; Marchese, L.; Marchiori, G.; Marcisovsky, M.; Marin Tobon, C. A.; Marjanovic, M.; Marley, D. E.; Marroquim, F.; Marsden, S. P.; Marshall, Z.; Martensson, M. U. F.; Marti-Garcia, S.; Martin, C. B.; Martin, T. A.; Martin, V. J.; Martin dit Latour, B.; Martinez, M.; Martinez Outschoorn, V. I.; Martin-Haugh, S.; Martoiu, V. S.; Martyniuk, A. C.; Marzin, A.; Masetti, L.; Mashimo, T.; Mashinistov, R.; Masik, J.; Maslennikov, A. L.; Mason, L. H.; Massa, L.; Mastrandrea, P.; Mastroberardino, A.; Masubuchi, T.; Mättig, P.; Maurer, J.; Maxfield, S. J.; Maximov, D. A.; Mazini, R.; Maznas, I.; Mazza, S. M.; Mc Fadden, N. C.; Mc Goldrick, G.; Mc Kee, S. P.; McCarn, A.; McCarthy, R. L.; McCarthy, T. G.; McClymont, L. I.; McDonald, E. F.; Mcfayden, J. A.; Mchedlidze, G.; McMahon, S. J.; McNamara, P. C.; McNicol, C. J.; McPherson, R. A.; Meehan, S.; Megy, T. J.; Mehlhase, S.; Mehta, A.; Meideck, T.; Meier, K.; Meirose, B.; Melini, D.; Mellado Garcia, B. R.; Mellenthin, J. D.; Melo, M.; Meloni, F.; Melzer, A.; Menary, S. B.; Meng, L.; Meng, X. T.; Mengarelli, A.; Menke, S.; Meoni, E.; Mergelmeyer, S.; Merlassino, C.; Mermod, P.; Merola, L.; Meroni, C.; Merritt, F. S.; Messina, A.; Metcalfe, J.; Mete, A. S.; Meyer, C.; Meyer, J.-P.; Meyer, J.; Meyer Zu Theenhausen, H.; Miano, F.; Middleton, R. P.; Miglioranzi, S.; Mijović, L.; Mikenberg, G.; Mikestikova, M.; Mikuž, M.; Milesi, M.; Milic, A.; Millar, D. A.; Miller, D. W.; Mills, C.; Milov, A.; Milstead, D. A.; Minaenko, A. A.; Minami, Y.; Minashvili, I. A.; Mincer, A. I.; Mindur, B.; Mineev, M.; Minegishi, Y.; Ming, Y.; Mir, L. M.; Mirto, A.; Mistry, K. P.; Mitani, T.; Mitrevski, J.; Mitsou, V. A.; Miucci, A.; Miyagawa, P. S.; Mizukami, A.; Mjörnmark, J. 
U.; Mkrtchyan, T.; Mlynarikova, M.; Moa, T.; Mochizuki, K.; Mogg, P.; Mohapatra, S.; Molander, S.; Moles-Valls, R.; Mondragon, M. C.; Mönig, K.; Monk, J.; Monnier, E.; Montalbano, A.; Montejo Berlingen, J.; Monticelli, F.; Monzani, S.; Moore, R. W.; Morange, N.; Moreno, D.; Moreno Llácer, M.; Morettini, P.; Morgenstern, S.; Mori, D.; Mori, T.; Morii, M.; Morinaga, M.; Morisbak, V.; Morley, A. K.; Mornacchi, G.; Morris, J. D.; Morvaj, L.; Moschovakos, P.; Mosidze, M.; Moss, H. J.; Moss, J.; Motohashi, K.; Mount, R.; Mountricha, E.; Moyse, E. J. W.; Muanza, S.; Mueller, F.; Mueller, J.; Mueller, R. S. P.; Muenstermann, D.; Mullen, P.; Mullier, G. A.; Munoz Sanchez, F. J.; Murray, W. J.; Musheghyan, H.; Muškinja, M.; Myagkov, A. G.; Myska, M.; Nachman, B. P.; Nackenhorst, O.; Nagai, K.; Nagai, R.; Nagano, K.; Nagasaka, Y.; Nagata, K.; Nagel, M.; Nagy, E.; Nairz, A. M.; Nakahama, Y.; Nakamura, K.; Nakamura, T.; Nakano, I.; Naranjo Garcia, R. F.; Narayan, R.; Narrias Villar, D. I.; Naryshkin, I.; Naumann, T.; Navarro, G.; Nayyar, R.; Neal, H. A.; Nechaeva, P. Yu.; Neep, T. J.; Negri, A.; Negrini, M.; Nektarijevic, S.; Nellist, C.; Nelson, A.; Nelson, M. E.; Nemecek, S.; Nemethy, P.; Nessi, M.; Neubauer, M. S.; Neumann, M.; Newman, P. R.; Ng, T. Y.; Ng, Y. S.; Nguyen Manh, T.; Nickerson, R. B.; Nicolaidou, R.; Nielsen, J.; Nikiforou, N.; Nikolaenko, V.; Nikolic-Audit, I.; Nikolopoulos, K.; Nilsen, J. K.; Nilsson, P.; Ninomiya, Y.; Nisati, A.; Nishu, N.; Nisius, R.; Nitsche, I.; Nitta, T.; Nobe, T.; Noguchi, Y.; Nomachi, M.; Nomidis, I.; Nomura, M. A.; Nooney, T.; Nordberg, M.; Norjoharuddeen, N.; Novgorodova, O.; Nozaki, M.; Nozka, L.; Ntekas, K.; Nurse, E.; Nuti, F.; O'connor, K.; O'Neil, D. C.; O'Rourke, A. A.; O'Shea, V.; Oakham, F. G.; Oberlack, H.; Obermann, T.; Ocariz, J.; Ochi, A.; Ochoa, I.; Ochoa-Ricoux, J. P.; Oda, S.; Odaka, S.; Oh, A.; Oh, S. H.; Ohm, C. C.; Ohman, H.; Oide, H.; Okawa, H.; Okumura, Y.; Okuyama, T.; Olariu, A.; Oleiro Seabra, L. F.; Olivares Pino, S. A.; Oliveira Damazio, D.; Olszewski, A.; Olszowska, J.; Onofre, A.; Onogi, K.; Onyisi, P. U. E.; Oppen, H.; Oreglia, M. J.; Oren, Y.; Orestano, D.; Orlando, N.; Orr, R. S.; Osculati, B.; Ospanov, R.; Otero y Garzon, G.; Otono, H.; Ouchrif, M.; Ould-Saada, F.; Ouraou, A.; Oussoren, K. P.; Ouyang, Q.; Owen, M.; Owen, R. E.; Ozcan, V. E.; Ozturk, N.; Pachal, K.; Pacheco Pages, A.; Pacheco Rodriguez, L.; Padilla Aranda, C.; Pagan Griso, S.; Paganini, M.; Paige, F.; Palacino, G.; Palazzo, S.; Palestini, S.; Palka, M.; Pallin, D.; Panagiotopoulou, E. St.; Panagoulias, I.; Pandini, C. E.; Panduro Vazquez, J. G.; Pani, P.; Panitkin, S.; Pantea, D.; Paolozzi, L.; Papadopoulou, Th. D.; Papageorgiou, K.; Paramonov, A.; Paredes Hernandez, D.; Parker, A. J.; Parker, M. A.; Parker, K. A.; Parodi, F.; Parsons, J. A.; Parzefall, U.; Pascuzzi, V. R.; Pasner, J. M.; Pasqualucci, E.; Passaggio, S.; Pastore, Fr.; Pataraia, S.; Pater, J. R.; Pauly, T.; Pearson, B.; Pedraza Lopez, S.; Pedro, R.; Peleganchuk, S. V.; Penc, O.; Peng, C.; Peng, H.; Penwell, J.; Peralva, B. S.; Perego, M. M.; Perepelitsa, D. V.; Peri, F.; Perini, L.; Pernegger, H.; Perrella, S.; Peschke, R.; Peshekhonov, V. D.; Peters, K.; Peters, R. F. Y.; Petersen, B. A.; Petersen, T. C.; Petit, E.; Petridis, A.; Petridou, C.; Petroff, P.; Petrolo, E.; Petrov, M.; Petrucci, F.; Pettersson, N. E.; Peyaud, A.; Pezoa, R.; Phillips, F. H.; Phillips, P. W.; Piacquadio, G.; Pianori, E.; Picazio, A.; Pickering, M. A.; Piegaia, R.; Pilcher, J. E.; Pilkington, A. 
D.; Pinamonti, M.; Pinfold, J. L.; Pirumov, H.; Pitt, M.; Plazak, L.; Pleier, M.-A.; Pleskot, V.; Plotnikova, E.; Pluth, D.; Podberezko, P.; Poettgen, R.; Poggi, R.; Poggioli, L.; Pogrebnyak, I.; Pohl, D.; Pokharel, I.; Polesello, G.; Poley, A.; Policicchio, A.; Polifka, R.; Polini, A.; Pollard, C. S.; Polychronakos, V.; Pommès, K.; Ponomarenko, D.; Pontecorvo, L.; Popeneciu, G. A.; Portillo Quintero, D. M.; Pospisil, S.; Potamianos, K.; Potrap, I. N.; Potter, C. J.; Potti, H.; Poulsen, T.; Poveda, J.; Pozo Astigarraga, M. E.; Pralavorio, P.; Pranko, A.; Prell, S.; Price, D.; Primavera, M.; Prince, S.; Proklova, N.; Prokofiev, K.; Prokoshin, F.; Protopopescu, S.; Proudfoot, J.; Przybycien, M.; Puri, A.; Puzo, P.; Qian, J.; Qin, G.; Qin, Y.; Quadt, A.; Queitsch-Maitland, M.; Quilty, D.; Raddum, S.; Radeka, V.; Radescu, V.; Radhakrishnan, S. K.; Radloff, P.; Rados, P.; Ragusa, F.; Rahal, G.; Raine, J. A.; Rajagopalan, S.; Rangel-Smith, C.; Rashid, T.; Raspopov, S.; Ratti, M. G.; Rauch, D. M.; Rauscher, F.; Rave, S.; Ravinovich, I.; Rawling, J. H.; Raymond, M.; Read, A. L.; Readioff, N. P.; Reale, M.; Rebuzzi, D. M.; Redelbach, A.; Redlinger, G.; Reece, R.; Reed, R. G.; Reeves, K.; Rehnisch, L.; Reichert, J.; Reiss, A.; Rembser, C.; Ren, H.; Rescigno, M.; Resconi, S.; Resseguie, E. D.; Rettie, S.; Reynolds, E.; Rezanova, O. L.; Reznicek, P.; Rezvani, R.; Richter, R.; Richter, S.; Richter-Was, E.; Ricken, O.; Ridel, M.; Rieck, P.; Riegel, C. J.; Rieger, J.; Rifki, O.; Rijssenbeek, M.; Rimoldi, A.; Rimoldi, M.; Rinaldi, L.; Ripellino, G.; Ristić, B.; Ritsch, E.; Riu, I.; Rizatdinova, F.; Rizvi, E.; Rizzi, C.; Roberts, R. T.; Robertson, S. H.; Robichaud-Veronneau, A.; Robinson, D.; Robinson, J. E. M.; Robson, A.; Rocco, E.; Roda, C.; Rodina, Y.; Rodriguez Bosca, S.; Rodriguez Perez, A.; Rodriguez Rodriguez, D.; Roe, S.; Rogan, C. S.; Røhne, O.; Roloff, J.; Romaniouk, A.; Romano, M.; Romano Saez, S. M.; Romero Adam, E.; Rompotis, N.; Ronzani, M.; Roos, L.; Rosati, S.; Rosbach, K.; Rose, P.; Rosien, N.-A.; Rossi, E.; Rossi, L. P.; Rosten, J. H. N.; Rosten, R.; Rotaru, M.; Rothberg, J.; Rousseau, D.; Rozanov, A.; Rozen, Y.; Ruan, X.; Rubbo, F.; Rühr, F.; Ruiz-Martinez, A.; Rurikova, Z.; Rusakovich, N. A.; Russell, H. L.; Rutherfoord, J. P.; Ruthmann, N.; Rüttinger, E. M.; Ryabov, Y. F.; Rybar, M.; Rybkin, G.; Ryu, S.; Ryzhov, A.; Rzehorz, G. F.; Saavedra, A. F.; Sabato, G.; Sacerdoti, S.; Sadrozinski, H. F.-W.; Sadykov, R.; Safai Tehrani, F.; Saha, P.; Sahinsoy, M.; Saimpert, M.; Saito, M.; Saito, T.; Sakamoto, H.; Sakurai, Y.; Salamanna, G.; Salazar Loyola, J. E.; Salek, D.; Sales De Bruin, P. H.; Salihagic, D.; Salnikov, A.; Salt, J.; Salvatore, D.; Salvatore, F.; Salvucci, A.; Salzburger, A.; Sammel, D.; Sampsonidis, D.; Sampsonidou, D.; Sánchez, J.; Sanchez Martinez, V.; Sanchez Pineda, A.; Sandaker, H.; Sandbach, R. L.; Sander, C. O.; Sandhoff, M.; Sandoval, C.; Sankey, D. P. C.; Sannino, M.; Sano, Y.; Sansoni, A.; Santoni, C.; Santos, H.; Santoyo Castillo, I.; Sapronov, A.; Saraiva, J. G.; Sarrazin, B.; Sasaki, O.; Sato, K.; Sauvan, E.; Savage, G.; Savard, P.; Savic, N.; Sawyer, C.; Sawyer, L.; Saxon, J.; Sbarra, C.; Sbrizzi, A.; Scanlon, T.; Scannicchio, D. A.; Schaarschmidt, J.; Schacht, P.; Schachtner, B. M.; Schaefer, D.; Schaefer, L.; Schaefer, R.; Schaeffer, J.; Schaepe, S.; Schaetzel, S.; Schäfer, U.; Schaffer, A. C.; Schaile, D.; Schamberger, R. D.; Schegelsky, V. A.; Scheirich, D.; Schernau, M.; Schiavi, C.; Schier, S.; Schildgen, L. 
K.; Schillo, C.; Schioppa, M.; Schlenker, S.; Schmidt-Sommerfeld, K. R.; Schmieden, K.; Schmitt, C.; Schmitt, S.; Schmitz, S.; Schnoor, U.; Schoeffel, L.; Schoening, A.; Schoenrock, B. D.; Schopf, E.; Schott, M.; Schouwenberg, J. F. P.; Schovancova, J.; Schramm, S.; Schuh, N.; Schulte, A.; Schultens, M. J.; Schultz-Coulon, H.-C.; Schulz, H.; Schumacher, M.; Schumm, B. A.; Schune, Ph.; Schwartzman, A.; Schwarz, T. A.; Schweiger, H.; Schwemling, Ph.; Schwienhorst, R.; Schwindling, J.; Sciandra, A.; Sciolla, G.; Scornajenghi, M.; Scuri, F.; Scutti, F.; Searcy, J.; Seema, P.; Seidel, S. C.; Seiden, A.; Seixas, J. M.; Sekhniaidze, G.; Sekhon, K.; Sekula, S. J.; Semprini-Cesari, N.; Senkin, S.; Serfon, C.; Serin, L.; Serkin, L.; Sessa, M.; Seuster, R.; Severini, H.; Sfiligoj, T.; Sforza, F.; Sfyrla, A.; Shabalina, E.; Shaikh, N. W.; Shan, L. Y.; Shang, R.; Shank, J. T.; Shapiro, M.; Shatalov, P. B.; Shaw, K.; Shaw, S. M.; Shcherbakova, A.; Shehu, C. Y.; Shen, Y.; Sherafati, N.; Sherman, A. D.; Sherwood, P.; Shi, L.; Shimizu, S.; Shimmin, C. O.; Shimojima, M.; Shipsey, I. P. J.; Shirabe, S.; Shiyakova, M.; Shlomi, J.; Shmeleva, A.; Shoaleh Saadi, D.; Shochet, M. J.; Shojaii, S.; Shope, D. R.; Shrestha, S.; Shulga, E.; Shupe, M. A.; Sicho, P.; Sickles, A. M.; Sidebo, P. E.; Sideras Haddad, E.; Sidiropoulou, O.; Sidoti, A.; Siegert, F.; Sijacki, Dj.; Silva, J.; Silverstein, S. B.; Simak, V.; Simic, L.; Simion, S.; Simioni, E.; Simmons, B.; Simon, M.; Sinervo, P.; Sinev, N. B.; Sioli, M.; Siragusa, G.; Siral, I.; Sivoklokov, S. Yu.; Sjölin, J.; Skinner, M. B.; Skubic, P.; Slater, M.; Slavicek, T.; Slawinska, M.; Sliwa, K.; Slovak, R.; Smakhtin, V.; Smart, B. H.; Smiesko, J.; Smirnov, N.; Smirnov, S. Yu.; Smirnov, Y.; Smirnova, L. N.; Smirnova, O.; Smith, J. W.; Smith, M. N. K.; Smith, R. W.; Smizanska, M.; Smolek, K.; Snesarev, A. A.; Snyder, I. M.; Snyder, S.; Sobie, R.; Socher, F.; Soffer, A.; Søgaard, A.; Soh, D. A.; Sokhrannyi, G.; Solans Sanchez, C. A.; Solar, M.; Soldatov, E. Yu.; Soldevila, U.; Solodkov, A. A.; Soloshenko, A.; Solovyanov, O. V.; Solovyev, V.; Sommer, P.; Son, H.; Sopczak, A.; Sosa, D.; Sotiropoulou, C. L.; Sottocornola, S.; Soualah, R.; Soukharev, A. M.; South, D.; Sowden, B. C.; Spagnolo, S.; Spalla, M.; Spangenberg, M.; Spanò, F.; Sperlich, D.; Spettel, F.; Spieker, T. M.; Spighi, R.; Spigo, G.; Spiller, L. A.; Spousta, M.; St. Denis, R. D.; Stabile, A.; Stamen, R.; Stamm, S.; Stanecka, E.; Stanek, R. W.; Stanescu, C.; Stanitzki, M. M.; Stapf, B. S.; Stapnes, S.; Starchenko, E. A.; Stark, G. H.; Stark, J.; Stark, S. H.; Staroba, P.; Starovoitov, P.; Stärz, S.; Staszewski, R.; Stegler, M.; Steinberg, P.; Stelzer, B.; Stelzer, H. J.; Stelzer-Chilton, O.; Stenzel, H.; Stevenson, T. J.; Stewart, G. A.; Stockton, M. C.; Stoebe, M.; Stoicea, G.; Stolte, P.; Stonjek, S.; Stradling, A. R.; Straessner, A.; Stramaglia, M. E.; Strandberg, J.; Strandberg, S.; Strauss, M.; Strizenec, P.; Ströhmer, R.; Strom, D. M.; Stroynowski, R.; Strubig, A.; Stucci, S. A.; Stugu, B.; Styles, N. A.; Su, D.; Su, J.; Suchek, S.; Sugaya, Y.; Suk, M.; Sulin, V. V.; Sultan, DMS; Sultansoy, S.; Sumida, T.; Sun, S.; Sun, X.; Suruliz, K.; Suster, C. J. E.; Sutton, M. R.; Suzuki, S.; Svatos, M.; Swiatlowski, M.; Swift, S. P.; Sykora, I.; Sykora, T.; Ta, D.; Tackmann, K.; Taenzer, J.; Taffard, A.; Tafirout, R.; Tahirovic, E.; Taiblum, N.; Takai, H.; Takashima, R.; Takasugi, E. H.; Takeda, K.; Takeshita, T.; Takubo, Y.; Talby, M.; Talyshev, A. 
A.; Tanaka, J.; Tanaka, M.; Tanaka, R.; Tanaka, S.; Tanioka, R.; Tannenwald, B. B.; Tapia Araya, S.; Tapprogge, S.; Tarem, S.; Tartarelli, G. F.; Tas, P.; Tasevsky, M.; Tashiro, T.; Tassi, E.; Tavares Delgado, A.; Tayalati, Y.; Taylor, A. C.; Taylor, A. J.; Taylor, G. N.; Taylor, P. T. E.; Taylor, W.; Teixeira-Dias, P.; Temple, D.; Ten Kate, H.; Teng, P. K.; Teoh, J. J.; Tepel, F.; Terada, S.; Terashi, K.; Terron, J.; Terzo, S.; Testa, M.; Teuscher, R. J.; Thais, S. J.; Theveneaux-Pelzer, T.; Thiele, F.; Thomas, J. P.; Thomas-Wilsker, J.; Thompson, P. D.; Thompson, A. S.; Thomsen, L. A.; Thomson, E.; Tian, Y.; Tibbetts, M. J.; Ticse Torres, R. E.; Tikhomirov, V. O.; Tikhonov, Yu. A.; Timoshenko, S.; Tipton, P.; Tisserant, S.; Todome, K.; Todorova-Nova, S.; Todt, S.; Tojo, J.; Tokár, S.; Tokushuku, K.; Tolley, E.; Tomlinson, L.; Tomoto, M.; Tompkins, L.; Toms, K.; Tong, B.; Tornambe, P.; Torrence, E.; Torres, H.; Torró Pastor, E.; Toth, J.; Touchard, F.; Tovey, D. R.; Treado, C. J.; Trefzger, T.; Tresoldi, F.; Tricoli, A.; Trigger, I. M.; Trincaz-Duvoid, S.; Tripiana, M. F.; Trischuk, W.; Trocmé, B.; Trofymov, A.; Troncon, C.; Trottier-McDonald, M.; Trovatelli, M.; Truong, L.; Trzebinski, M.; Trzupek, A.; Tsang, K. W.; Tseng, J. C.-L.; Tsiareshka, P. V.; Tsipolitis, G.; Tsirintanis, N.; Tsiskaridze, S.; Tsiskaridze, V.; Tskhadadze, E. G.; Tsukerman, I. I.; Tsulaia, V.; Tsuno, S.; Tsybychev, D.; Tu, Y.; Tudorache, A.; Tudorache, V.; Tulbure, T. T.; Tuna, A. N.; Turchikhin, S.; Turgeman, D.; Turk Cakir, I.; Turra, R.; Tuts, P. M.; Ucchielli, G.; Ueda, I.; Ughetto, M.; Ukegawa, F.; Unal, G.; Undrus, A.; Unel, G.; Ungaro, F. C.; Unno, Y.; Uno, K.; Unverdorben, C.; Urban, J.; Urquijo, P.; Urrejola, P.; Usai, G.; Usui, J.; Vacavant, L.; Vacek, V.; Vachon, B.; Vadla, K. O. H.; Vaidya, A.; Valderanis, C.; Valdes Santurio, E.; Valente, M.; Valentinetti, S.; Valero, A.; Valéry, L.; Valkar, S.; Vallier, A.; Valls Ferrer, J. A.; Van Den Wollenberg, W.; van der Graaf, H.; van Gemmeren, P.; Van Nieuwkoop, J.; van Vulpen, I.; van Woerden, M. C.; Vanadia, M.; Vandelli, W.; Vaniachine, A.; Vankov, P.; Vardanyan, G.; Vari, R.; Varnes, E. W.; Varni, C.; Varol, T.; Varouchas, D.; Vartapetian, A.; Varvell, K. E.; Vasquez, J. G.; Vasquez, G. A.; Vazeille, F.; Vazquez Furelos, D.; Vazquez Schroeder, T.; Veatch, J.; Veeraraghavan, V.; Veloce, L. M.; Veloso, F.; Veneziano, S.; Ventura, A.; Venturi, M.; Venturi, N.; Venturini, A.; Vercesi, V.; Verducci, M.; Verkerke, W.; Vermeulen, A. T.; Vermeulen, J. C.; Vetterli, M. C.; Viaux Maira, N.; Viazlo, O.; Vichou, I.; Vickey, T.; Vickey Boeriu, O. E.; Viehhauser, G. H. A.; Viel, S.; Vigani, L.; Villa, M.; Villaplana Perez, M.; Vilucchi, E.; Vincter, M. G.; Vinogradov, V. B.; Vishwakarma, A.; Vittori, C.; Vivarelli, I.; Vlachos, S.; Vogel, M.; Vokac, P.; Volpi, G.; von der Schmitt, H.; von Toerne, E.; Vorobel, V.; Vorobev, K.; Vos, M.; Voss, R.; Vossebeld, J. H.; Vranjes, N.; Vranjes Milosavljevic, M.; Vrba, V.; Vreeswijk, M.; Vuillermet, R.; Vukotic, I.; Wagner, P.; Wagner, W.; Wagner-Kuhr, J.; Wahlberg, H.; Wahrmund, S.; Walder, J.; Walker, R.; Walkowiak, W.; Wallangen, V.; Wang, C.; Wang, C.; Wang, F.; Wang, H.; Wang, H.; Wang, J.; Wang, J.; Wang, Q.; Wang, R.-J.; Wang, R.; Wang, S. M.; Wang, T.; Wang, W.; Wang, W.; Wang, Z.; Wanotayaroj, C.; Warburton, A.; Ward, C. P.; Wardrope, D. R.; Washbrook, A.; Watkins, P. M.; Watson, A. T.; Watson, M. F.; Watts, G.; Watts, S.; Waugh, B. M.; Webb, A. F.; Webb, S.; Weber, M. S.; Weber, S. M.; Weber, S. W.; Weber, S. 
A.; Webster, J. S.; Weidberg, A. R.; Weinert, B.; Weingarten, J.; Weirich, M.; Weiser, C.; Weits, H.; Wells, P. S.; Wenaus, T.; Wengler, T.; Wenig, S.; Wermes, N.; Werner, M. D.; Werner, P.; Wessels, M.; Weston, T. D.; Whalen, K.; Whallon, N. L.; Wharton, A. M.; White, A. S.; White, A.; White, M. J.; White, R.; Whiteson, D.; Whitmore, B. W.; Wickens, F. J.; Wiedenmann, W.; Wielers, M.; Wiglesworth, C.; Wiik-Fuchs, L. A. M.; Wildauer, A.; Wilk, F.; Wilkens, H. G.; Williams, H. H.; Williams, S.; Willis, C.; Willocq, S.; Wilson, J. A.; Wingerter-Seez, I.; Winkels, E.; Winklmeier, F.; Winston, O. J.; Winter, B. T.; Wittgen, M.; Wobisch, M.; Wolf, T. M. H.; Wolff, R.; Wolter, M. W.; Wolters, H.; Wong, V. W. S.; Woods, N. L.; Worm, S. D.; Wosiek, B. K.; Wotschack, J.; Wozniak, K. W.; Wu, M.; Wu, S. L.; Wu, X.; Wu, Y.; Wyatt, T. R.; Wynne, B. M.; Xella, S.; Xi, Z.; Xia, L.; Xu, D.; Xu, L.; Xu, T.; Xu, W.; Yabsley, B.; Yacoob, S.; Yamaguchi, D.; Yamaguchi, Y.; Yamamoto, A.; Yamamoto, S.; Yamanaka, T.; Yamane, F.; Yamatani, M.; Yamazaki, T.; Yamazaki, Y.; Yan, Z.; Yang, H.; Yang, H.; Yang, Y.; Yang, Z.; Yao, W.-M.; Yap, Y. C.; Yasu, Y.; Yatsenko, E.; Yau Wong, K. H.; Ye, J.; Ye, S.; Yeletskikh, I.; Yigitbasi, E.; Yildirim, E.; Yorita, K.; Yoshihara, K.; Young, C.; Young, C. J. S.; Yu, J.; Yu, J.; Yuen, S. P. Y.; Yusuff, I.; Zabinski, B.; Zacharis, G.; Zaidan, R.; Zaitsev, A. M.; Zakharchuk, N.; Zalieckas, J.; Zaman, A.; Zambito, S.; Zanzi, D.; Zeitnitz, C.; Zemaityte, G.; Zemla, A.; Zeng, J. C.; Zeng, Q.; Zenin, O.; Ženiš, T.; Zerwas, D.; Zhang, D.; Zhang, D.; Zhang, F.; Zhang, G.; Zhang, H.; Zhang, J.; Zhang, L.; Zhang, L.; Zhang, M.; Zhang, P.; Zhang, R.; Zhang, R.; Zhang, X.; Zhang, Y.; Zhang, Z.; Zhao, X.; Zhao, Y.; Zhao, Z.; Zhemchugov, A.; Zhou, B.; Zhou, C.; Zhou, L.; Zhou, M.; Zhou, M.; Zhou, N.; Zhou, Y.; Zhu, C. G.; Zhu, H.; Zhu, J.; Zhu, Y.; Zhuang, X.; Zhukov, K.; Zibell, A.; Zieminska, D.; Zimine, N. I.; Zimmermann, C.; Zimmermann, S.; Zinonos, Z.; Zinser, M.; Ziolkowski, M.; Živković, L.; Zobernig, G.; Zoccoli, A.; Zou, R.; zur Nedden, M.; Zwalinski, L.
2017-11-01
This paper presents single lepton and dilepton kinematic distributions measured in dileptonic t\bar{t} events produced in 20.2 fb^{-1} of √s = 8 TeV pp collisions recorded by the ATLAS experiment at the LHC. Both absolute and normalised differential cross-sections are measured, using events with an opposite-charge eμ pair and one or two b-tagged jets. The cross-sections are measured in a fiducial region corresponding to the detector acceptance for leptons, and are compared to the predictions from a variety of Monte Carlo event generators, as well as fixed-order QCD calculations, exploring the sensitivity of the cross-sections to the gluon parton distribution function. Some of the distributions are also sensitive to the top quark pole mass; a combined fit of NLO fixed-order predictions to all the measured distributions yields a top quark mass value of m_t^{pole} = 173.2 ± 0.9 ± 0.8 ± 1.2 GeV, where the three uncertainties arise from data statistics, experimental systematics, and theoretical sources.
Aaboud, M.; Aad, G.; Abbott, B.; ...
2017-11-25
We present single lepton and dilepton kinematic distributions measured in dileptonic t\bar{t} events produced in 20.2 fb^{-1} of √s = 8 TeV pp collisions recorded by the ATLAS experiment at the LHC. Both absolute and normalised differential cross-sections are measured, using events with an opposite-charge eμ pair and one or two b-tagged jets. Furthermore, the cross-sections are measured in a fiducial region corresponding to the detector acceptance for leptons, and are compared to the predictions from a variety of Monte Carlo event generators, as well as fixed-order QCD calculations, exploring the sensitivity of the cross-sections to the gluon parton distribution function. Some of the distributions are also sensitive to the top quark pole mass; a combined fit of NLO fixed-order predictions to all the measured distributions yields a top quark mass value of m_t^{pole} = 173.2 ± 0.9 ± 0.8 ± 1.2 GeV, where the three uncertainties arise from data statistics, experimental systematics, and theoretical sources.
Data Assimilation - Advances and Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Brian J.
2014-07-30
This presentation provides an overview of data assimilation (model calibration) for complex computer experiments. Calibration refers to the process of probabilistically constraining uncertain physics/engineering model inputs to be consistent with observed experimental data. An initial probability distribution for these parameters is updated using the experimental information. Utilization of surrogate models and empirical adjustment for model form error in code calibration form the basis for the statistical methodology considered. The role of probabilistic code calibration in supporting code validation is discussed. Incorporation of model form uncertainty in rigorous uncertainty quantification (UQ) analyses is also addressed. Design criteria used within a batch sequential design algorithm are introduced for efficiently achieving predictive maturity and improved code calibration. Predictive maturity refers to obtaining stable predictive inference with calibrated computer codes. These approaches allow for augmentation of initial experiment designs for collecting new physical data. A standard framework for data assimilation is presented and techniques for updating the posterior distribution of the state variables based on particle filtering and the ensemble Kalman filter are introduced.
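For concreteness, the following Python sketch shows one of the update techniques mentioned above, a stochastic (perturbed-observation) ensemble Kalman filter analysis step on a toy two-dimensional state. The state dimensions, observation operator and noise levels are invented for illustration and have no connection to the presentation's applications.

import numpy as np

rng = np.random.default_rng(2)

def enkf_update(X, y, H, R):
    """Stochastic (perturbed-observation) EnKF analysis step.
    X : (n_ens, n_state) forecast ensemble
    y : (n_obs,) observation vector
    H : (n_obs, n_state) linear observation operator
    R : (n_obs, n_obs) observation-error covariance
    """
    n_ens = X.shape[0]
    A = X - X.mean(axis=0)                        # ensemble anomalies
    P = A.T @ A / (n_ens - 1)                     # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # Kalman gain
    Y = y + rng.multivariate_normal(np.zeros(len(y)), R, n_ens)  # perturbed observations
    return X + (Y - X @ H.T) @ K.T                # analysis ensemble

# Tiny illustration: two-dimensional state, one observed component.
X = rng.normal([1.0, 0.0], 0.5, size=(50, 2))     # forecast ensemble
H = np.array([[1.0, 0.0]])                        # observe the first component only
R = np.array([[0.1 ** 2]])
X_a = enkf_update(X, np.array([1.3]), H, R)
print("analysis ensemble mean:", X_a.mean(axis=0))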
Estimation of Global Network Statistics from Incomplete Data
Bliss, Catherine A.; Danforth, Christopher M.; Dodds, Peter Sheridan
2014-01-01
Complex networks underlie an enormous variety of social, biological, physical, and virtual systems. A profound complication for the science of complex networks is that in most cases, observing all nodes and all network interactions is impossible. Previous work addressing the impacts of partial network data is surprisingly limited, focuses primarily on missing nodes, and suggests that network statistics derived from subsampled data are not suitable estimators for the same network statistics describing the overall network topology. We generate scaling methods to predict true network statistics, including the degree distribution, from only partial knowledge of nodes, links, or weights. Our methods are transparent and do not assume a known generating process for the network, thus enabling prediction of network statistics for a wide variety of applications. We validate analytical results on four simulated network classes and empirical data sets of various sizes. We perform subsampling experiments by varying proportions of sampled data and demonstrate that our scaling methods can provide very good estimates of true network statistics while acknowledging limits. Lastly, we apply our techniques to a set of rich and evolving large-scale social networks, Twitter reply networks. Based on 100 million tweets, we use our scaling techniques to propose a statistical characterization of the Twitter Interactome from September 2008 to November 2008. Our treatment allows us to find support for Dunbar's hypothesis in detecting an upper threshold for the number of active social contacts that individuals maintain over the course of one week. PMID:25338183
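The sketch below illustrates the simplest scaling correction of this kind, not necessarily the specific estimators of the paper: when a fraction of nodes is sampled uniformly and only edges among sampled nodes are observed, each neighbour of a sampled node survives with that same fraction, so the observed mean degree can be rescaled by the sampling proportion. The toy Erdős-Rényi network and sampling fraction are assumptions made purely for the demonstration.

import itertools
import numpy as np

rng = np.random.default_rng(3)

# Build a toy Erdos-Renyi network (illustration only).
n, p_edge = 1000, 0.01
edges = [(i, j) for i, j in itertools.combinations(range(n), 2) if rng.random() < p_edge]
true_mean_degree = 2 * len(edges) / n

# Observe only a fraction of the nodes and the edges among them.
frac = 0.3
sampled = set(rng.choice(n, int(frac * n), replace=False))
observed_edges = [(i, j) for i, j in edges if i in sampled and j in sampled]
observed_mean_degree = 2 * len(observed_edges) / len(sampled)

# Scaling correction: each neighbour of a sampled node is itself sampled
# with probability `frac`, so the observed mean degree is deflated by `frac`.
estimated_mean_degree = observed_mean_degree / frac
print(f"true {true_mean_degree:.2f}  observed {observed_mean_degree:.2f}  rescaled {estimated_mean_degree:.2f}")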
Probabilistic micromechanics for metal matrix composites
NASA Astrophysics Data System (ADS)
Engelstad, S. P.; Reddy, J. N.; Hopkins, Dale A.
A probabilistic micromechanics-based nonlinear analysis procedure is developed to predict and quantify the variability in the properties of high temperature metal matrix composites. Monte Carlo simulation is used to model the probabilistic distributions of the constituent-level properties, including fiber, matrix, and interphase properties, volume and void ratios, strengths, fiber misalignment, and nonlinear empirical parameters. The procedure predicts the resultant ply properties and quantifies their statistical scatter. Graphite/copper and silicon carbide/titanium aluminide (SCS-6/Ti-15) unidirectional plies are considered to demonstrate the predictive capabilities. The procedure is believed to have a high potential for use in material characterization and selection to precede and assist in experimental studies of new high temperature metal matrix composites.
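The following Python sketch shows the general flavour of such a Monte Carlo propagation using a simple rule-of-mixtures relation rather than the paper's micromechanics equations: constituent-level scatter is sampled and pushed through to a ply-level property whose statistical scatter is then summarized. The constituent distributions are hypothetical and do not represent SCS-6/Ti-15 data.

import numpy as np

rng = np.random.default_rng(4)
n = 100_000

# Hypothetical constituent-level scatter (illustration only):
E_fiber  = rng.normal(400e9, 20e9, n)    # fiber modulus [Pa]
E_matrix = rng.normal(110e9, 10e9, n)    # matrix modulus [Pa]
V_fiber  = rng.normal(0.35, 0.02, n)     # fiber volume fraction

# Simple rule-of-mixtures estimate of the longitudinal ply modulus.
E1 = V_fiber * E_fiber + (1 - V_fiber) * E_matrix

print("ply longitudinal modulus E1 [GPa]:")
print("  mean :", E1.mean() / 1e9)
print("  std  :", E1.std() / 1e9)
print("  5th-95th percentile:", np.percentile(E1, [5, 95]) / 1e9)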
Exact Extremal Statistics in the Classical 1D Coulomb Gas
NASA Astrophysics Data System (ADS)
Dhar, Abhishek; Kundu, Anupam; Majumdar, Satya N.; Sabhapandit, Sanjib; Schehr, Grégory
2017-08-01
We consider a one-dimensional classical Coulomb gas of N like charges in a harmonic potential, also known as the one-dimensional one-component plasma. We compute, analytically, the probability distribution of the position x_max of the rightmost charge in the limit of large N. We show that the typical fluctuations of x_max around its mean are described by a nontrivial scaling function with asymmetric tails. This distribution is different from the Tracy-Widom distribution of x_max for Dyson's log gas. We also compute the large deviation functions of x_max explicitly and show that the system exhibits a third-order phase transition, as in the log gas. Our theoretical predictions are verified numerically.
Stochastic Growth Theory of Spatially-Averaged Distributions of Langmuir Fields in Earth's Foreshock
NASA Technical Reports Server (NTRS)
Boshuizen, Christopher R.; Cairns, Iver H.; Robinson, P. A.
2001-01-01
Langmuir-like waves in the foreshock of Earth are characteristically bursty and irregular, and are the subject of a number of recent studies. Averaged over the foreshock, it is observed that the probability distribution \bar{P}(log E) of the wave field E is a power law, with the bar denoting averaging over position. In this paper it is shown that stochastic growth theory (SGT) can explain such a power-law spatially-averaged distribution \bar{P}(log E) when the observed power-law variations of the mean and standard deviation of log E with position are combined with the lognormal statistics predicted by SGT at each location.
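A small numerical sketch of the mechanism described above, with entirely hypothetical power-law profiles for the mean and standard deviation of log E versus position: at each position the field is lognormally distributed (SGT), and pooling over positions yields the spatially averaged distribution, whose extended straight segment on a log-log plot would indicate power-law behaviour.

import numpy as np

rng = np.random.default_rng(5)

# Hypothetical power-law profiles of the mean and scatter of log10 E with
# distance x from the foreshock boundary (illustration only).
x = rng.uniform(0.1, 10.0, 200_000)          # positions, arbitrary units
mu_logE = -1.0 - 1.5 * np.log10(x)           # mean of log10 E falls off as a power law
sd_logE = 0.3 + 0.2 * np.log10(1 + x)        # slowly growing scatter

# SGT: at each fixed position the field is lognormally distributed.
logE = rng.normal(mu_logE, sd_logE)

# Spatially averaged distribution: pooled histogram over all positions.
hist, edges = np.histogram(logE, bins=60, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
for c, h in zip(centers[::10], hist[::10]):
    print(f"log10 E = {c:6.2f}   P = {h:.3e}")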
Spatial analysis techniques applied to uranium prospecting in Chihuahua State, Mexico
NASA Astrophysics Data System (ADS)
Hinojosa de la Garza, Octavio R.; Montero Cabrera, María Elena; Sanín, Luz H.; Reyes Cortés, Manuel; Martínez Meyer, Enrique
2014-07-01
To estimate the distribution of uranium minerals in Chihuahua, the advanced statistical model "Maximum Entropy Method" (MaxEnt) was applied. A distinguishing feature of this method is that it can fit more complex models in the case of small datasets (x and y data), as is the case for the locations of uranium ores in the State of Chihuahua. For georeferencing the uranium ores, a database from the United States Geological Survey and a workgroup of experts in Mexico was used. The main contribution of this paper is the proposal of maximum entropy techniques to obtain the mineral's potential distribution. The model used 24 environmental layers, such as topography, gravimetry, climate (WorldClim), soil properties and others, to project the distribution of uranium across the study area. For validation of the places predicted by the model, comparisons were made with other research by the Mexican Geological Survey, with direct exploration of specific areas, and through talks with former exploration workers of the company "Uranio de Mexico". Results. New uranium areas predicted by the model were validated, and some relationship was found between the model predictions and geological faults. Conclusions. Modeling by spatial analysis provides additional information to the energy and mineral resources sectors.
Visualization of the significance of Receiver Operating Characteristics based on confidence ellipses
NASA Astrophysics Data System (ADS)
Sarlis, Nicholas V.; Christopoulos, Stavros-Richard G.
2014-03-01
The Receiver Operating Characteristics (ROC) is used for the evaluation of prediction methods in various disciplines like meteorology, geophysics, complex system physics, medicine etc. The estimation of the significance of a binary prediction method, however, remains a cumbersome task and is usually done by repeating the calculations by Monte Carlo. The FORTRAN code provided here simplifies this problem by evaluating the significance of binary predictions for a family of ellipses which are based on confidence ellipses and cover the whole ROC space.
Catalogue identifier: AERY_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AERY_v1_0.html
Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 11511
No. of bytes in distributed program, including test data, etc.: 72906
Distribution format: tar.gz
Programming language: FORTRAN.
Computer: Any computer supporting a GNU FORTRAN compiler.
Operating system: Linux, MacOS, Windows.
RAM: 1 Mbyte
Classification: 4.13, 9, 14.
Nature of problem: The Receiver Operating Characteristics (ROC) is used for the evaluation of prediction methods in various disciplines like meteorology, geophysics, complex system physics, medicine etc. The estimation of the significance of a binary prediction method, however, remains a cumbersome task and is usually done by repeating the calculations by Monte Carlo. The FORTRAN code provided here simplifies this problem by evaluating the significance of binary predictions for a family of ellipses which are based on confidence ellipses and cover the whole ROC space.
Solution method: Using the statistics of random binary predictions for a given value of the predictor threshold ε_t, one can construct the corresponding confidence ellipses. The envelope of these confidence ellipses is estimated as ε_t varies from 0 to 1. In this way a new family of ellipses is obtained, named k-ellipses, which covers the whole ROC plane and leads to a well defined Area Under the Curve (AUC). For the latter quantity, Mason and Graham [1] have shown that it follows the Mann-Whitney U-statistics [2], which can be applied [3] for the estimation of the statistical significance of each k-ellipse. As the transformation is invertible, any point on the ROC plane corresponds to a unique value of k, and thus to a unique p-value of obtaining this point by chance. The present FORTRAN code provides this p-value field on the ROC plane as well as the k-ellipses corresponding to the p = 10%, 5% and 1% significance levels, using as input the number of positive (P) and negative (Q) cases to be predicted.
Unusual features: On some machines, the compiler directive -O2 or -O3 should be used to avoid NaN's at some points of the p-field along the diagonal.
Running time: Depending on the application; e.g., 4 s for an Intel(R) Core(TM)2 CPU E7600 at 3.06 GHz with 2 GB RAM for the examples presented here.
References: [1] S.J. Mason, N.E. Graham, Quart. J. Roy. Meteor. Soc. 128 (2002) 2145. [2] H.B. Mann, D.R. Whitney, Ann. Math. Statist. 18 (1947) 50. [3] L.C. Dinneen, B.C. Blakesley, J. Roy. Stat. Soc. Ser. C Appl. Stat. 22 (1973) 269.
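The connection between the AUC and the Mann-Whitney U statistic cited above can be illustrated directly in Python; this is not the distributed FORTRAN code, only a sketch of the underlying statistics, and the predictor values below are synthetic.

import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Hypothetical predictor values for P positive and Q negative cases.
pos = rng.normal(1.0, 1.0, 30)     # cases that should be predicted as "yes"
neg = rng.normal(0.0, 1.0, 70)     # cases that should be predicted as "no"

# The Mann-Whitney U statistic is directly related to the area under the
# ROC curve:  AUC = U / (P * Q).
U, p_value = stats.mannwhitneyu(pos, neg, alternative="greater")
auc = U / (len(pos) * len(neg))

print(f"AUC = {auc:.3f}, one-sided p-value against chance = {p_value:.3g}")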
NASA Astrophysics Data System (ADS)
Moura, Ricardo; Sinha, Bimal; Coelho, Carlos A.
2017-06-01
The recent popularity of synthetic data as a Statistical Disclosure Control technique has enabled the development of several methods for generating and analyzing such data, but these almost always rely on asymptotic distributions and are consequently not adequate for small-sample datasets. Thus, a likelihood-based exact inference procedure is derived for the matrix of regression coefficients of the multivariate regression model, for multiply imputed synthetic data generated via Posterior Predictive Sampling. Since it is based on exact distributions, this procedure may be used even with small-sample datasets. Simulation studies compare the results obtained from the proposed exact inferential procedure with the results obtained from an adaptation of Reiter's combination rule to multiply imputed synthetic datasets, and an application to the 2000 Current Population Survey is discussed.
NASA Astrophysics Data System (ADS)
Arttini Dwi Prasetyowati, Sri; Susanto, Adhi; Widihastuti, Ida
2017-04-01
Every noise problem requires a different solution. In this research, the noise to be cancelled comes from the roadway. The Least Mean Square (LMS) adaptive algorithm is one of the algorithms that can be used to cancel such noise. Residual noise always appears and cannot be erased completely. This research aims to characterize the residual noise from vehicle noise and analyze it so that it no longer poses a problem. The LMS algorithm was used to predict the vehicle noise and minimize the error. The distribution of the residual noise was observed to determine its specific character. The statistics of the residual noise are close to a normal distribution with μ = 0.0435 and σ = 1.13, and the autocorrelation of the residual noise approximates an impulse. In conclusion, the residual noise is insignificant.
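As a hedged illustration of the LMS adaptive noise-cancellation idea described above (not the authors' implementation), the sketch below cancels a synthetic "roadway" noise from a signal using the standard LMS weight update; the filter length, step size and signals are all assumptions.

```python
# Minimal LMS adaptive noise canceller (illustrative only).
import numpy as np

rng = np.random.default_rng(1)
n = 5000
clean = np.sin(2 * np.pi * 0.01 * np.arange(n))            # signal of interest
ref = rng.normal(size=n)                                    # reference noise pickup
leaked = np.convolve(ref, [0.6, 0.3, 0.1])[:n]              # noise reaching the sensor
observed = clean + leaked

taps, mu = 8, 0.01                                          # assumed order / step size
w = np.zeros(taps)
residual = np.zeros(n)
for k in range(taps - 1, n):
    x = ref[k - taps + 1:k + 1][::-1]                       # current + past reference
    e = observed[k] - w @ x                                 # error = cleaned sample
    w += 2 * mu * e * x                                     # LMS weight update
    residual[k] = e - clean[k]                              # noise left after cancelling

print("residual std after convergence: %.4f" % residual[n // 2:].std())
```

After convergence the weights approximate the leakage path, so the residual is what remains once the predicted vehicle noise has been subtracted, mirroring the quantity analyzed in the abstract.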
Statistical variation in progressive scrambling
NASA Astrophysics Data System (ADS)
Clark, Robert D.; Fox, Peter C.
2004-07-01
The two methods most often used to evaluate the robustness and predictivity of partial least squares (PLS) models are cross-validation and response randomization. Both methods may be overly optimistic for data sets that contain redundant observations, however. The kinds of perturbation analysis widely used for evaluating model stability in the context of ordinary least squares regression are only applicable when the descriptors are independent of each other and errors are independent and normally distributed; neither assumption holds for QSAR in general and for PLS in particular. Progressive scrambling is a novel, non-parametric approach to perturbing models in the response space in a way that does not disturb the underlying covariance structure of the data. Here, we introduce adjustments for two of the characteristic values produced by a progressive scrambling analysis - the deprecated predictivity (Q_s*²) and standard error of prediction (SDEP_s*) - that correct for the effect of introduced perturbation. We also explore the statistical behavior of the adjusted values (Q_0*² and SDEP_0*) and the sensitivity to perturbation (dq²/dr_yy'²). It is shown that the three statistics are all robust for stable PLS models, in terms of the stochastic component of their determination and of their variation due to sampling effects involved in training set selection.
Uribe-Rivera, David E; Soto-Azat, Claudio; Valenzuela-Sánchez, Andrés; Bizama, Gustavo; Simonetti, Javier A; Pliscoff, Patricio
2017-07-01
Climate change is a major threat to biodiversity; the development of models that reliably predict its effects on species distributions is a priority for conservation biogeography. Two of the main issues for accurate temporal predictions from Species Distribution Models (SDM) are model extrapolation and unrealistic dispersal scenarios. We assessed the consequences of these issues on the accuracy of climate-driven SDM predictions for the dispersal-limited Darwin's frog Rhinoderma darwinii in South America. We calibrated models using historical data (1950-1975) and projected them across 40 yr to predict distribution under current climatic conditions, assessing predictive accuracy through the area under the ROC curve (AUC) and the True Skill Statistic (TSS), contrasting binary model predictions against a temporally independent validation data set (i.e., current presences/absences). To assess the effects of incorporating dispersal processes we compared the predictive accuracy of dispersal-constrained models with SDMs without dispersal limitation; and to assess the effects of model extrapolation on the predictive accuracy of SDMs, we compared this between extrapolated and non-extrapolated areas. The incorporation of dispersal processes enhanced predictive accuracy, mainly due to a decrease in the false presence rate of model predictions, which is consistent with discrimination of suitable but inaccessible habitat. This also had consequences for range size changes over time, which is the most commonly used proxy for extinction risk from climate change. The area of current climatic conditions that was absent in the baseline conditions (i.e., extrapolated areas) represents 39% of the study area, leading to a significant decrease in predictive accuracy of model predictions for those areas. Our results highlight that (1) incorporating dispersal processes can improve the predictive accuracy of temporal transference of SDMs and reduce uncertainties of extinction risk assessments from global change; and (2) as geographical areas subjected to novel climates are expected to arise, they must be reported, as they show less accurate predictions under future climate scenarios. Consequently, environmental extrapolation and dispersal processes should be explicitly incorporated to report and reduce uncertainties in temporal predictions of SDMs, respectively. Doing so, we expect to improve the reliability of the information we provide for conservation decision makers under future climate change scenarios. © 2017 by the Ecological Society of America.
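The assessment above scores binary SDM predictions against independent presence/absence data using AUC and the True Skill Statistic (TSS). As a hedged illustration (not the authors' code or data), the sketch below computes both from a confusion matrix built on made-up observations and suitability scores.

```python
# Hedged sketch: AUC (threshold-free) and TSS (threshold-based) for an SDM.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
obs = rng.integers(0, 2, size=500)                                   # fake presences/absences
score = np.clip(obs * 0.3 + rng.normal(0.4, 0.25, size=500), 0, 1)   # fake suitability
pred = (score >= 0.5).astype(int)                                    # binary prediction at a threshold

tp = np.sum((pred == 1) & (obs == 1))
fn = np.sum((pred == 0) & (obs == 1))
tn = np.sum((pred == 0) & (obs == 0))
fp = np.sum((pred == 1) & (obs == 0))

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
tss = sensitivity + specificity - 1        # True Skill Statistic
auc = roc_auc_score(obs, score)            # area under the ROC curve

print(f"AUC = {auc:.3f}, TSS = {tss:.3f}")
```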
NASA Astrophysics Data System (ADS)
Sembiring, J.; Jones, F.
2018-03-01
Red cell Distribution Width (RDW) to platelet ratio (RPR) can predict liver fibrosis and cirrhosis in chronic hepatitis B with relatively high accuracy. RPR was superior to other non-invasive methods for predicting liver fibrosis, such as the AST-to-ALT ratio, the AST-to-platelet ratio index and FIB-4. The aim of this study was to assess the diagnostic accuracy of the RDW-to-platelet ratio for liver fibrosis in chronic hepatitis B patients, compared with Fibroscan. This cross-sectional study was conducted at Adam Malik Hospital from January to June 2015. We examined 34 chronic hepatitis B patients, recording RDW, platelet count, and Fibroscan results. Data were statistically analyzed. The RPR with the ROC procedure has an accuracy of 72.3% (95% CI: 84.1% - 97%). In this study, the RPR had a moderate ability to predict fibrosis degree (p = 0.029 with AUC > 70%). The cutoff value of RPR was 0.0591; sensitivity and specificity were 71.4% and 60%, Positive Predictive Value (PPV) was 55.6% and Negative Predictive Value (NPV) was 75%, the positive likelihood ratio was 1.79 and the negative likelihood ratio was 0.48. RPR has the ability to predict the degree of liver fibrosis in chronic hepatitis B patients with moderate accuracy.
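As a hedged check of the kind of threshold-based diagnostic statistics reported above (not a re-analysis of the study data), the sketch below derives PPV, NPV and likelihood ratios from sensitivity, specificity and prevalence. The sensitivity and specificity echo the abstract, but the roughly 41% fibrosis prevalence is an illustrative assumption, not a figure reported there.

```python
# Hedged sketch: deriving PPV, NPV and likelihood ratios from
# sensitivity, specificity and an assumed prevalence.
def diagnostic_summary(sens: float, spec: float, prev: float) -> dict:
    ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    npv = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec
    return {"PPV": ppv, "NPV": npv, "LR+": lr_pos, "LR-": lr_neg}

# Sensitivity/specificity from the abstract; the ~41% prevalence is an
# assumption chosen for illustration, not a reported figure.
print(diagnostic_summary(sens=0.714, spec=0.60, prev=0.41))
# -> roughly PPV 0.55, NPV 0.75, LR+ 1.79, LR- 0.48
```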
From plant traits to plant communities: a statistical mechanistic approach to biodiversity.
Shipley, Bill; Vile, Denis; Garnier, Eric
2006-11-03
We developed a quantitative method, analogous to those used in statistical mechanics, to predict how biodiversity will vary across environments, which plant species from a species pool will be found in which relative abundances in a given environment, and which plant traits determine community assembly. This provides a scaling from plant traits to ecological communities while bypassing the complications of population dynamics. Our method treats community development as a sorting process involving species that are ecologically equivalent except with respect to particular functional traits, which leads to a constrained random assembly of species; the relative abundance of each species adheres to a general exponential distribution as a function of its traits. Using data for eight functional traits of 30 herbaceous species and community-aggregated values of these traits in 12 sites along a 42-year chronosequence of secondary succession, we predicted 94% of the variance in the relative abundances.
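The approach described above predicts relative abundances as an exponential function of species traits, chosen to maximize entropy subject to community-aggregated trait constraints. A hedged numerical sketch of that idea follows (synthetic traits, not the authors' code or data): it solves the convex dual problem for the Lagrange multipliers and recovers a distribution that matches the target trait means.

```python
# Hedged sketch: maximum-entropy prediction of relative abundances from
# traits (synthetic data; illustrative of the approach, not the study's code).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n_species, n_traits = 30, 8
traits = rng.normal(size=(n_species, n_traits))      # species-by-trait matrix
w_true = rng.dirichlet(np.ones(n_species))           # hidden "true" abundances
target = traits.T @ w_true                            # community-aggregated traits

def dual(lam):
    # Convex dual of the entropy maximization: log Z(lambda) - lambda . target
    z = traits @ lam
    return np.log(np.exp(z - z.max()).sum()) + z.max() - lam @ target

res = minimize(dual, np.zeros(n_traits), method="BFGS")
z = traits @ res.x
p = np.exp(z - z.max())
p /= p.sum()                                          # predicted relative abundances

print("constraint error:", np.abs(traits.T @ p - target).max())
```

The predicted abundances take the exponential form p_i proportional to exp(lambda . t_i), which is the general exponential distribution as a function of traits referred to in the abstract.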
NASA Astrophysics Data System (ADS)
Jawitz, J. W.; Basu, N.; Chen, X.
2007-05-01
Interwell application of coupled nonreactive and reactive tracers through aquifer contaminant source zones enables quantitative characterization of aquifer heterogeneity and contaminant architecture. Parameters obtained from tracer tests are presented here in a Lagrangian framework that can be used to predict the dissolution of nonaqueous phase liquid (NAPL) contaminants. Nonreactive tracers are commonly used to provide information about travel time distributions in hydrologic systems. Reactive tracers have more recently been introduced as a tool to quantify the amount of NAPL contaminant present within the tracer swept volume. Our group has extended reactive tracer techniques to also characterize NAPL spatial distribution heterogeneity. By conceptualizing the flow field through an aquifer as a collection of streamtubes, the aquifer hydrodynamic heterogeneities may be characterized by a nonreactive tracer travel time distribution, and NAPL spatial distribution heterogeneity may be similarly described using reactive travel time distributions. The combined statistics of these distributions are used to derive a simple analytical solution for contaminant dissolution. This analytical solution, and the tracer techniques used for its parameterization, were validated both numerically and experimentally. Illustrative applications are presented from numerical simulations using the multiphase flow and transport simulator UTCHEM, and laboratory experiments of surfactant-enhanced NAPL remediation in two-dimensional flow chambers.
Theory of the Sea Ice Thickness Distribution
NASA Astrophysics Data System (ADS)
Toppaladoddi, Srikanth; Wettlaufer, J. S.
2015-10-01
We use concepts from statistical physics to transform the original evolution equation for the sea ice thickness distribution g(h) from Thorndike et al. into a Fokker-Planck-like conservation law. The steady solution is g(h) = N(q) h^q e^(-h/H), where q and H are expressible in terms of moments over the transition probabilities between thickness categories. The solution exhibits the functional form used in observational fits and shows that for h ≪ 1, g(h) is controlled by both thermodynamics and mechanics, whereas for h ≫ 1 only mechanics controls g(h). Finally, we derive the underlying Langevin equation governing the dynamics of the ice thickness h, from which we predict the observed g(h). The genericity of our approach provides a framework for studying the geophysical-scale structure of the ice pack using methods of broad relevance in statistical mechanics.
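As a hedged numerical illustration of the steady solution quoted above (not the authors' calculation), the sketch below evaluates g(h) = N(q) h^q e^(-h/H), taking N(q) as the constant that normalizes g to unit area (an assumption here) and choosing q and H arbitrarily; under that normalization the curve coincides with a gamma density of shape q + 1 and scale H.

```python
# Hedged sketch: the steady-state thickness distribution g(h) = N(q) h^q e^(-h/H).
# With N(q) chosen so g integrates to one, this is a gamma distribution with
# shape q + 1 and scale H; the values q = 1.5 and H = 0.8 m are arbitrary.
import numpy as np
from scipy.stats import gamma
from scipy.special import gamma as gamma_fn

q, H = 1.5, 0.8
N = 1.0 / (H ** (q + 1) * gamma_fn(q + 1))              # normalization constant

h = np.linspace(0.01, 6.0, 500)
g_direct = N * h ** q * np.exp(-h / H)                   # direct evaluation
g_gamma = gamma.pdf(h, a=q + 1, scale=H)                 # same curve via scipy

print("max |difference|:", np.abs(g_direct - g_gamma).max())   # ~0
print("mean thickness  :", (q + 1) * H, "m")                    # gamma mean
```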
Li, Longbiao
2016-01-01
In this paper, the fatigue life of fiber-reinforced ceramic-matrix composites (CMCs) with different fiber preforms, i.e., unidirectional, cross-ply, 2D (two-dimensional), 2.5D and 3D CMCs, at room and elevated temperatures in air and oxidative environments, has been predicted using a micromechanics approach. An effective coefficient of the fiber volume fraction along the loading direction (ECFL) was introduced to describe the fiber architecture of the preforms. The statistical matrix multicracking model and the fracture mechanics interface debonding criterion were used to determine the matrix crack spacing and interface debonded length. Under cyclic fatigue loading, the fiber broken fraction was determined by combining the interface wear model and the fiber statistical failure model at room temperature, and the interface/fiber oxidation model, interface wear model and fiber statistical failure model at elevated temperatures, based on the assumption that the fiber strength follows a two-parameter Weibull distribution and that the load carried by broken and intact fibers satisfies the Global Load Sharing (GLS) criterion. When the broken fiber fraction approaches the critical value, the composite fractures under fatigue. PMID:28773332
Statistical analysis of the calibration procedure for personnel radiation measurement instruments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bush, W.J.; Bengston, S.J.; Kalbeitzer, F.L.
1980-11-01
Thermoluminescent analyzer (TLA) calibration procedures were used to estimate personnel radiation exposure levels at the Idaho National Engineering Laboratory (INEL). A statistical analysis is presented herein based on data collected over a six month period in 1979 on four TLA's located in the Department of Energy (DOE) Radiological and Environmental Sciences Laboratory at the INEL. The data were collected according to the day-to-day procedure in effect at that time. Both gamma and beta radiation models are developed. Observed TLA readings of thermoluminescent dosimeters are correlated with known radiation levels. This correlation is then used to predict unknown radiation doses from future analyzer readings of personnel thermoluminescent dosimeters. The statistical techniques applied in this analysis include weighted linear regression, estimation of systematic and random error variances, prediction interval estimation using Scheffe's theory of calibration, the estimation of the ratio of the means of two normal bivariate distributed random variables and their corresponding confidence limits according to Kendall and Stuart, tests of normality, experimental design, a comparison between instruments, and quality control.
Meta-analysis of diagnostic test data: a bivariate Bayesian modeling approach.
Verde, Pablo E
2010-12-30
In the last decades, the amount of published results on clinical diagnostic tests has expanded very rapidly. The counterpart to this development has been the formal evaluation and synthesis of diagnostic results. However, published results present substantial heterogeneity and can be regarded as so far removed from the classical domain of meta-analysis that they provide a rather severe test of classical statistical methods. Recently, bivariate random effects meta-analytic methods, which model the pairs of sensitivities and specificities, have been presented from the classical point of view. In this work a bivariate Bayesian modeling approach is presented. This approach substantially extends the scope of classical bivariate methods by allowing the structural distribution of the random effects to depend on multiple sources of variability. Meta-analysis is summarized by the predictive posterior distributions for sensitivity and specificity. This new approach also allows substantial model checking, model diagnostics and model selection to be performed. Statistical computations are implemented in the public domain statistical software (WinBUGS and R) and illustrated with real data examples. Copyright © 2010 John Wiley & Sons, Ltd.
Critical behavior in earthquake energy dissipation
NASA Astrophysics Data System (ADS)
Wanliss, James; Muñoz, Víctor; Pastén, Denisse; Toledo, Benjamín; Valdivia, Juan Alejandro
2017-09-01
We explore bursty multiscale energy dissipation from earthquakes flanked by latitudes 29° S and 35.5° S, and longitudes 69.501° W and 73.944° W (in the Chilean central zone). Our work compares the predictions of a theory of nonequilibrium phase transitions with nonstandard statistical signatures of earthquake complex scaling behaviors. For temporal scales less than 84 hours, time development of earthquake radiated energy activity follows an algebraic arrangement consistent with estimates from the theory of nonequilibrium phase transitions. There are no characteristic scales for probability distributions of sizes and lifetimes of the activity bursts in the scaling region. The power-law exponents describing the probability distributions suggest that the main energy dissipation takes place due to largest bursts of activity, such as major earthquakes, as opposed to smaller activations which contribute less significantly though they have greater relative occurrence. The results obtained provide statistical evidence that earthquake energy dissipation mechanisms are essentially "scale-free", displaying statistical and dynamical self-similarity. Our results provide some evidence that earthquake radiated energy and directed percolation belong to a similar universality class.
NASA Astrophysics Data System (ADS)
Sahoo, Sasmita; Jha, Madan K.
2013-12-01
The potential of multiple linear regression (MLR) and artificial neural network (ANN) techniques in predicting transient water levels over a groundwater basin was compared. MLR and ANN modeling was carried out at 17 sites in Japan, considering all significant inputs: rainfall, ambient temperature, river stage, 11 seasonal dummy variables, and influential lags of rainfall, ambient temperature, river stage and groundwater level. Seventeen site-specific ANN models were developed, using multi-layer feed-forward neural networks trained with Levenberg-Marquardt backpropagation algorithms. The performance of the models was evaluated using statistical and graphical indicators. Comparison of the goodness-of-fit statistics of the MLR models with those of the ANN models indicated that there is better agreement between the ANN-predicted groundwater levels and the observed groundwater levels at all the sites, compared to the MLR. This finding was supported by the graphical indicators and the residual analysis. Thus, it is concluded that the ANN technique is superior to the MLR technique in predicting the spatio-temporal distribution of groundwater levels in a basin. However, considering the practical advantages of the MLR technique, it is recommended as an alternative and cost-effective groundwater modeling tool.
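As a hedged illustration of the comparison described above (not the study's models or data), the sketch below fits a multiple linear regression and a small feed-forward neural network to synthetic groundwater-level data and compares goodness-of-fit statistics. The inputs, network size and training algorithm are assumptions; in particular, scikit-learn's default Adam optimizer is used here rather than the Levenberg-Marquardt training mentioned in the abstract.

```python
# Hedged sketch: MLR vs. ANN for predicting groundwater levels (synthetic data).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(4)
n = 600
rain = rng.gamma(2.0, 5.0, n)          # rainfall
temp = rng.normal(15, 8, n)            # ambient temperature
stage = rng.normal(2.0, 0.5, n)        # river stage
X = np.column_stack([rain, temp, stage])
# Nonlinear "true" response so the ANN has something to gain over MLR.
y = 0.05 * rain - 0.02 * temp + 1.5 * np.tanh(stage) + rng.normal(0, 0.2, n)

train, test = slice(0, 450), slice(450, None)
mlr = LinearRegression().fit(X[train], y[train])
ann = make_pipeline(StandardScaler(),
                    MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000,
                                 random_state=0)).fit(X[train], y[train])

for name, model in [("MLR", mlr), ("ANN", ann)]:
    pred = model.predict(X[test])
    print(f"{name}: R2 = {r2_score(y[test], pred):.3f}, "
          f"RMSE = {mean_squared_error(y[test], pred) ** 0.5:.3f}")
```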
NASA Technical Reports Server (NTRS)
Jeganathan, M.; Wilson, K. E.; Lesh, J. R.
1996-01-01
Uplink data from recent free-space optical communication experiments carried out between the Table Mountain Facility and the Japanese Engineering Test Satellite are used to study fluctuations caused by beam propagation through the atmosphere. The influence of atmospheric scintillation, beam wander and jitter, and multiple uplink beams on the statistics of power received by the satellite is analyzed and compared to experimental data. Preliminary analysis indicates the received signal obeys an approximate lognormal distribution, as predicted by the weak-turbulence model, but further characterization of other sources of fluctuations is necessary for accurate link predictions.
Drifter-based estimate of the 5 year dispersal of Fukushima-derived radionuclides
NASA Astrophysics Data System (ADS)
Rypina, I. I.; Jayne, S. R.; Yoshida, S.; Macdonald, A. M.; Buesseler, K.
2014-11-01
Employing some 40 years of North Pacific drifter-track observations from the Global Drifter Program database, statistics defining the horizontal spread of radionuclides from Fukushima nuclear power plant into the Pacific Ocean are investigated over a time scale of 5 years. A novel two-iteration method is employed to make the best use of the available drifter data. Drifter-based predictions of the temporal progression of the leading edge of the radionuclide distribution are compared to observed radionuclide concentrations from research surveys occupied in 2012 and 2013. Good agreement between the drifter-based predictions and the observations is found.
Accurate low-cost methods for performance evaluation of cache memory systems
NASA Technical Reports Server (NTRS)
Laha, Subhasis; Patel, Janak H.; Iyer, Ravishankar K.
1988-01-01
Methods of simulation based on statistical techniques are proposed to decrease the need for large trace measurements and for predicting true program behavior. Sampling techniques are applied while the address trace is collected from a workload. This drastically reduces the space and time needed to collect the trace. Simulation techniques are developed to use the sampled data not only to predict the mean miss rate of the cache, but also to provide an empirical estimate of its actual distribution. Finally, a concept of primed cache is introduced to simulate large caches by the sampling-based method.
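As a hedged illustration of trace sampling for cache evaluation (not the paper's method), the sketch below simulates a direct-mapped cache on a full synthetic address trace and on a set of sampled trace intervals, then compares the full miss rate to the distribution of per-sample miss rates; the cache geometry, trace and sample lengths are all assumptions.

```python
# Hedged sketch: estimating cache miss-rate distribution from sampled traces.
import numpy as np

def miss_rate(trace, n_sets=1024, line_bytes=64):
    """Miss rate of a direct-mapped cache over an address trace."""
    tags = [None] * n_sets
    misses = 0
    for addr in trace:
        line = addr // line_bytes
        s, tag = line % n_sets, line // n_sets
        if tags[s] != tag:
            tags[s] = tag
            misses += 1
    return misses / len(trace)

rng = np.random.default_rng(5)
# Synthetic trace: accesses scattered over a set of 4 KB regions.
base = rng.integers(0, 2**20, size=50) * 4096
trace = rng.choice(base, size=200_000) + rng.integers(0, 4096, size=200_000)

full = miss_rate(trace)

# Simulate sampled contiguous intervals instead of the whole trace.
starts = rng.integers(0, len(trace) - 2000, size=30)
samples = [miss_rate(trace[s:s + 2000]) for s in starts]

print(f"full-trace miss rate: {full:.4f}")
print(f"sampled estimate    : {np.mean(samples):.4f} +/- {np.std(samples):.4f}")
```

Note that each sampled interval starts with an empty cache, so the per-sample estimates are biased upward by cold-start misses; the primed-cache concept mentioned in the abstract is aimed at exactly that problem.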
Bayesian transformation cure frailty models with multivariate failure time data.
Yin, Guosheng
2008-12-10
We propose a class of transformation cure frailty models to accommodate a survival fraction in multivariate failure time data. Established through a general power transformation, this family of cure frailty models includes the proportional hazards and the proportional odds modeling structures as two special cases. Within the Bayesian paradigm, we obtain the joint posterior distribution and the corresponding full conditional distributions of the model parameters for the implementation of Gibbs sampling. Model selection is based on the conditional predictive ordinate statistic and deviance information criterion. As an illustration, we apply the proposed method to a real data set from dentistry.
Experiment Design for Complex VTOL Aircraft with Distributed Propulsion and Tilt Wing
NASA Technical Reports Server (NTRS)
Murphy, Patrick C.; Landman, Drew
2015-01-01
Selected experimental results from a wind tunnel study of a subscale VTOL concept with distributed propulsion and tilt lifting surfaces are presented. The vehicle complexity and automated test facility were ideal for use with a randomized designed experiment. Design of Experiments and Response Surface Methods were invoked to produce run efficient, statistically rigorous regression models with minimized prediction error. Static tests were conducted at the NASA Langley 12-Foot Low-Speed Tunnel to model all six aerodynamic coefficients over a large flight envelope. This work supports investigations at NASA Langley in developing advanced configurations, simulations, and advanced control systems.
NASA Technical Reports Server (NTRS)
Parrish, R. S.; Carter, M. C.
1974-01-01
This analysis utilizes computer simulation and statistical estimation. Realizations of stationary Gaussian stochastic processes with selected autocorrelation functions are computer simulated. Analysis of the simulated data revealed that the mean and the variance of a process were functionally dependent upon the autocorrelation parameter and crossing level. Using predicted values for the mean and standard deviation, the distribution parameters were estimated by the method of moments. Thus, given the autocorrelation parameter, crossing level, mean, and standard deviation of a process, the probability of exceeding the crossing level for a particular length of time was calculated.
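As a hedged illustration of the simulation idea described above (not the original analysis), the sketch below generates realizations of a stationary Gaussian first-order autoregressive process and estimates the probability that an excursion above a crossing level lasts at least a given time; the autocorrelation parameter, crossing level and duration are arbitrary.

```python
# Hedged sketch: probability that an excursion above a crossing level
# lasts at least `min_len` steps, for a stationary Gaussian AR(1) process.
import numpy as np

def ar1(n, phi, rng):
    """Stationary Gaussian AR(1) with lag-one autocorrelation phi."""
    x = np.empty(n)
    x[0] = rng.normal(0, 1)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(0, np.sqrt(1 - phi ** 2))
    return x

def long_excursion_prob(x, level, min_len):
    """Fraction of excursions above `level` lasting >= min_len samples."""
    above = x > level
    change = np.flatnonzero(np.diff(above.astype(int)))
    bounds = np.concatenate(([0], change + 1, [len(x)]))
    runs = [(above[a], b - a) for a, b in zip(bounds[:-1], bounds[1:])]
    ups = [length for is_up, length in runs if is_up]
    return np.mean([l >= min_len for l in ups]) if ups else 0.0

rng = np.random.default_rng(6)
probs = [long_excursion_prob(ar1(20_000, phi=0.9, rng=rng), level=1.0, min_len=10)
         for _ in range(20)]
print(f"P(excursion above 1.0 lasts >= 10 steps) ~ {np.mean(probs):.3f}")
```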
Coronary artery calcium distributions in older persons in the AGES-Reykjavik study
Gudmundsson, Elias Freyr; Gudnason, Vilmundur; Sigurdsson, Sigurdur; Launer, Lenore J.; Harris, Tamara B.; Aspelund, Thor
2013-01-01
Coronary Artery Calcium (CAC) is a sign of advanced atherosclerosis and an independent risk factor for cardiac events. Here, we describe CAC-distributions in an unselected aged population and compare modelling methods to characterize CAC-distribution. CAC is difficult to model because it has a skewed and zero inflated distribution with over-dispersion. Data are from the AGES-Reykjavik sample, a large population based study [2002-2006] in Iceland of 5,764 persons aged 66-96 years. Linear regressions using logarithmic- and Box-Cox transformations on CAC+1, quantile regression and a Zero-Inflated Negative Binomial model (ZINB) were applied. Methods were compared visually and with the PRESS-statistic, R2 and number of detected associations with concurrently measured variables. There were pronounced differences in CAC according to sex, age, history of coronary events and presence of plaque in the carotid artery. Associations with conventional coronary artery disease (CAD) risk factors varied between the sexes. The ZINB model provided the best results with respect to the PRESS-statistic, R2, and predicted proportion of zero scores. The ZINB model detected similar numbers of associations as the linear regression on ln(CAC+1) and usually with the same risk factors. PMID:22990371
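As a hedged illustration of the zero-inflated negative binomial modelling highlighted above (not the AGES-Reykjavik analysis), the sketch below fits an intercept-only ZINB by maximum likelihood to synthetic zero-heavy, over-dispersed counts; the parameterization, starting values and data are assumptions.

```python
# Hedged sketch: maximum-likelihood fit of an intercept-only zero-inflated
# negative binomial (ZINB) to synthetic, over-dispersed, zero-heavy counts.
import numpy as np
from scipy.stats import nbinom
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(7)
n_obs = 4000
true_pi, true_mu, true_alpha = 0.4, 50.0, 1.5      # zero-inflation, mean, dispersion

def nb_params(mu, alpha):
    # Convert mean/dispersion to scipy's (n, p) parameterization.
    n = 1.0 / alpha
    return n, n / (n + mu)

n_nb, p_nb = nb_params(true_mu, true_alpha)
counts = np.where(rng.random(n_obs) < true_pi, 0, nbinom.rvs(n_nb, p_nb, size=n_obs))

def neg_loglik(theta):
    logit_pi, log_mu, log_alpha = theta
    pi, mu, alpha = expit(logit_pi), np.exp(log_mu), np.exp(log_alpha)
    n, p = nb_params(mu, alpha)
    ll = np.where(counts == 0,
                  np.log(pi + (1 - pi) * nbinom.pmf(0, n, p)),
                  np.log1p(-pi) + nbinom.logpmf(counts, n, p))
    return -ll.sum()

x0 = np.array([0.0, np.log(counts.mean() + 1.0), 0.0])
fit = minimize(neg_loglik, x0, method="Nelder-Mead")
pi_hat, mu_hat, alpha_hat = expit(fit.x[0]), np.exp(fit.x[1]), np.exp(fit.x[2])
print(f"pi ~ {pi_hat:.2f}, mu ~ {mu_hat:.1f}, alpha ~ {alpha_hat:.2f}")
```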
NASA Astrophysics Data System (ADS)
Seuront, Laurent; Duponchel, Anne-Charlotte; Chapperon, Coraline
2007-11-01
The two-dimensional motion behaviour of the common intertidal gastropod Littorina littorea is investigated as a function of the immersion time at three sampling sites on an exposed rocky shore. A total of 90 individuals were individually marked and tracked over 14 consecutive daylight low tides. Successive displacements show very intermittent behaviour, with a few localised large displacements among a wide range of small displacements. We show that successive displacements are described by heavy-tailed distributions of flight length l_d, with P(l_d) ∼ l_d^(-μ). The very low values of the exponent μ (μ ≈ 2.22, 2.43 and 2.67) indicate that L. littorea flights fall into the category of super-diffusive processes. These exponents were significantly higher than the special value μ ≈ 2 analytically and theoretically predicted to be the most advantageous in optimising long-term encounter statistics, especially for low-prey-density scenarios. As natural selection should favour flexible behaviour, leading to different optimum searching statistics under different conditions, our results support the idea that the differences in food concentration and distribution encountered at the different sites by L. littorea led to the different heavy-tailed distributions observed for the most extreme displacements.
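As a hedged illustration of estimating the flight-length exponent μ in P(l_d) ∼ l_d^(-μ) (not the authors' analysis), the sketch below draws synthetic flight lengths from a Pareto distribution and recovers the exponent with the standard maximum-likelihood (Hill-type) estimator; the sample size and lower cutoff are assumptions.

```python
# Hedged sketch: maximum-likelihood estimate of a power-law exponent mu
# for flight lengths with P(l) ~ l^(-mu), l >= l_min (synthetic data).
import numpy as np

rng = np.random.default_rng(8)
mu_true, l_min, n = 2.4, 1.0, 3000

# Inverse-CDF sampling from the density (mu - 1) * l_min^(mu - 1) * l^(-mu).
u = rng.random(n)
lengths = l_min * (1 - u) ** (-1.0 / (mu_true - 1.0))

# Standard MLE (Hill-type) estimator and its approximate standard error.
mu_hat = 1.0 + n / np.sum(np.log(lengths / l_min))
se = (mu_hat - 1.0) / np.sqrt(n)
print(f"mu_hat = {mu_hat:.3f} +/- {se:.3f} (true {mu_true})")
```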
Improved Rubin-Bodner Model for the Prediction of Soft Tissue Deformations
Zhang, Guangming; Xia, James J.; Liebschner, Michael; Zhang, Xiaoyan; Kim, Daeseung; Zhou, Xiaobo
2016-01-01
In craniomaxillofacial (CMF) surgery, a reliable way of simulating the soft tissue deformation resulting from skeletal reconstruction is vitally important for preventing the risk of facial distortion postoperatively. However, it is difficult to simulate the soft tissue behaviors affected by different types of CMF surgery. This study presents an integrated biomechanical and statistical learning model to improve the accuracy and reliability of predictions of soft facial tissue behavior. The Rubin-Bodner (RB) model is initially used to describe the biomechanical behavior of the soft facial tissue. Subsequently, a finite element model (FEM) computes the stress at each node in the soft facial tissue mesh resulting from bone displacement. Next, the Generalized Regression Neural Network (GRNN) method is implemented to obtain the relationship between the facial soft tissue deformation and the stress distribution corresponding to different CMF surgical types, and to improve the estimation of the elastic parameters included in the RB model. Therefore, the soft facial tissue deformation can be predicted by biomechanical properties and the statistical model. Leave-one-out cross-validation is used on eleven patients. As a result, the average prediction error of our model (0.7035 mm) is lower than those resulting from other approaches. It also demonstrates that the more accurate biomechanical information the model has, the better prediction performance it could achieve. PMID:27717593
Application of short-data methods on extreme surge levels
NASA Astrophysics Data System (ADS)
Feng, X.
2014-12-01
Tropical cyclone-induced storm surges are among the most destructive natural hazards that impact the United States. Unfortunately for academic research, the available time series for extreme surge analysis are very short. The limited data introduces uncertainty and affects the accuracy of statistical analyses of extreme surge levels. This study deals with techniques applicable to data sets less than 20 years, including simulation modelling and methods based on the parameters of the parent distribution. The verified water levels from water gauges spread along the Southwest and Southeast Florida Coast, as well as the Florida Keys, are used in this study. Methods to calculate extreme storm surges are described and reviewed, including 'classical' methods based on the generalized extreme value (GEV) distribution and the generalized Pareto distribution (GPD), and approaches designed specifically to deal with short data sets. Incorporating global-warming influence, the statistical analysis reveals enhanced extreme surge magnitudes and frequencies during warm years, while reduced levels of extreme surge activity are observed in the same study domain during cold years. Furthermore, a non-stationary GEV distribution is applied to predict the extreme surge levels with warming sea surface temperatures. The non-stationary GEV distribution indicates that with 1 Celsius degree warming in sea surface temperature from the baseline climate, the 100-year return surge level in Southwest and Southeast Florida will increase by up to 40 centimeters. The considered statistical approaches for extreme surge estimation based on short data sets will be valuable to coastal stakeholders, including urban planners, emergency managers, and the hurricane and storm surge forecasting and warning system.
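As a hedged illustration of the "classical" GEV approach mentioned above (not the study's analysis), the sketch below fits a GEV distribution to synthetic annual surge maxima with scipy and computes the 100-year return level; the data, record length and stationarity assumption are for illustration only.

```python
# Hedged sketch: GEV fit to synthetic annual maximum surge levels and the
# corresponding 100-year return level (stationary case only).
# Note: scipy's shape parameter c corresponds to -xi in the usual GEV convention.
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(9)
# Synthetic annual maxima (metres); 19 values to mimic a short record.
annual_max = genextreme.rvs(c=-0.1, loc=1.2, scale=0.3, size=19, random_state=rng)

shape, loc, scale = genextreme.fit(annual_max)
return_period = 100.0
level_100yr = genextreme.ppf(1.0 - 1.0 / return_period, shape, loc, scale)

print(f"fitted shape={shape:.2f}, loc={loc:.2f} m, scale={scale:.2f} m")
print(f"estimated 100-year surge level: {level_100yr:.2f} m")
```

A non-stationary version, as discussed in the abstract, would let loc or scale depend on a covariate such as sea surface temperature instead of being constants.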
NASA Astrophysics Data System (ADS)
Christ, John A.; Lemke, Lawrence D.; Abriola, Linda M.
2005-01-01
The influence of reduced dimensionality (two-dimensional (2-D) versus 3-D) on predictions of dense nonaqueous phase liquid (DNAPL) infiltration and entrapment in statistically homogeneous, nonuniform permeability fields was investigated using the University of Texas Chemical Compositional Simulator (UTCHEM), a 3-D numerical multiphase simulator. Hysteretic capillary pressure-saturation and relative permeability relationships implemented in UTCHEM were benchmarked against those of another lab-tested simulator, the Michigan-Vertical and Lateral Organic Redistribution (M-VALOR). Simulation of a tetrachloroethene spill in 16 field-scale aquifer realizations generated DNAPL saturation distributions with approximately equivalent distribution metrics in two and three dimensions, with 2-D simulations generally resulting in slightly higher maximum saturations and increased vertical spreading. Variability in 2-D and 3-D distribution metrics across the set of realizations was shown to be correlated at a significance level of 95-99%. Neither spill volume nor release rate appeared to affect these conclusions. Variability in the permeability field did affect spreading metrics by increasing the horizontal spreading in 3-D more than in 2-D in more heterogeneous media simulations. The assumption of isotropic horizontal spatial statistics resulted, on average, in symmetric 3-D saturation distribution metrics in the horizontal directions. The practical implication of this study is that for statistically homogeneous, nonuniform aquifers, 2-D simulations of saturation distributions are good approximations to those obtained in 3-D. However, additional work will be needed to explore the influence of dimensionality on simulated DNAPL dissolution.
scoringRules - A software package for probabilistic model evaluation
NASA Astrophysics Data System (ADS)
Lerch, Sebastian; Jordan, Alexander; Krüger, Fabian
2016-04-01
Models in the geosciences are generally surrounded by uncertainty, and being able to quantify this uncertainty is key to good decision making. Accordingly, probabilistic forecasts in the form of predictive distributions have become popular over the last decades. With the proliferation of probabilistic models arises the need for decision-theoretically principled tools to evaluate the appropriateness of models and forecasts in a generalized way. Various scoring rules have been developed over the past decades to address this demand. Proper scoring rules are functions S(F,y) which evaluate the accuracy of a forecast distribution F, given that an outcome y was observed. As such, they allow alternative models to be compared, a crucial ability given the variety of theories, data sources and statistical specifications that is available in many situations. This poster presents the software package scoringRules for the statistical programming language R, which contains functions to compute popular scoring rules such as the continuous ranked probability score for a variety of distributions F that come up in applied work. Two main classes are parametric distributions like normal, t, or gamma distributions, and distributions that are not known analytically, but are indirectly described through a sample of simulation draws. For example, Bayesian forecasts produced via Markov Chain Monte Carlo take this form. Thereby, the scoringRules package provides a framework for generalized model evaluation that includes both Bayesian and classical parametric models. The scoringRules package aims to be a convenient dictionary-like reference for computing scoring rules. We offer state-of-the-art implementations of several known (but not routinely applied) formulas, and implement closed-form expressions that were previously unavailable. Whenever more than one implementation variant exists, we offer statistically principled default choices.
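As a hedged illustration of the kind of closed-form expression such a package implements (written here in Python rather than R, and not taken from scoringRules itself), the sketch below evaluates the continuous ranked probability score for a Gaussian forecast and checks it against the sample-based definition of the CRPS.

```python
# Hedged sketch: closed-form CRPS for a Gaussian predictive distribution,
# checked against the sample-based definition E|X - y| - 0.5 * E|X - X'|.
import numpy as np
from scipy.stats import norm

def crps_gaussian(y, mu, sigma):
    z = (y - mu) / sigma
    return sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi))

mu, sigma, y = 2.0, 1.5, 3.2
rng = np.random.default_rng(10)
x = rng.normal(mu, sigma, size=200_000)
x2 = rng.normal(mu, sigma, size=200_000)
crps_mc = np.mean(np.abs(x - y)) - 0.5 * np.mean(np.abs(x - x2))

print(f"closed form : {crps_gaussian(y, mu, sigma):.4f}")
print(f"monte carlo : {crps_mc:.4f}")
```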
NASA Astrophysics Data System (ADS)
Emanuele Rizzo, Roberto; Healy, David; De Siena, Luca
2016-04-01
The success of any predictive model is largely dependent on the accuracy with which its parameters are known. When characterising fracture networks in fractured rock, one of the main issues is accurately scaling the parameters governing the distribution of fracture attributes. Optimal characterisation and analysis of fracture attributes (lengths, apertures, orientations and densities) is fundamental to the estimation of permeability and fluid flow, which are of primary importance in a number of contexts including: hydrocarbon production from fractured reservoirs; geothermal energy extraction; and deeper Earth systems, such as earthquakes and ocean floor hydrothermal venting. Our work links outcrop fracture data to modelled fracture networks in order to numerically predict bulk permeability. We collected outcrop data from a highly fractured upper Miocene biosiliceous mudstone formation, cropping out along the coastline north of Santa Cruz (California, USA). Using outcrop fracture networks as analogues for subsurface fracture systems has several advantages, because key fracture attributes such as spatial arrangements and lengths can be effectively measured only on outcrops [1]. However, a limitation when dealing with outcrop data is the relative sparseness of natural data due to the intrinsic finite size of the outcrops. We make use of a statistical approach for the overall workflow, starting from data collection with the Circular Windows Method [2]. Then we analyse the data statistically using Maximum Likelihood Estimators, which provide greater accuracy compared to the more commonly used Least Squares linear regression when investigating distribution of fracture attributes. Finally, we estimate the bulk permeability of the fractured rock mass using Oda's tensorial approach [3]. The higher quality of this statistical analysis is fundamental: better statistics of the fracture attributes means more accurate permeability estimation, since the fracture attributes feed directly into the permeability calculations. The application of Maximum Likelihood Estimators can have important consequences, especially when we aim to predict the tendency of fracture attributes towards smaller and larger scales than those observed, in order to build consistent, useable models from outcrop observations. The procedures presented here aim to understand whether the average permeability of a fracture network can be predicted, reducing its uncertainties; and if outcrop measurements of fracture attributes can be used directly to generate statistically identical fracture network models, which can then be easily up-scaled into larger areas or volumes. Gale et al. "Natural Fracture in shale: A review and new observations", AAPG Bulletin 98.11 (2014). Mauldon et al. "Circular scanlines and circular windows: new tools for characterizing the geometry of fracture traces", Journal of Structural Geology, 23 (2001). Oda "Permeability tensor for discontinuous rock masses", Geotechnique 35.4 (1985).
Inferred Lunar Boulder Distributions at Decimeter Scales
NASA Technical Reports Server (NTRS)
Baloga, S. M.; Glaze, L. S.; Spudis, P. D.
2012-01-01
Block size distributions of impact deposits on the Moon are diagnostic of the impact process and environmental effects, such as target lithology and weathering. Block size distributions are also important factors in trafficability, habitability, and possibly the identification of indigenous resources. Lunar block sizes have been investigated for many years for many purposes [e.g., 1-3]. An unresolved issue is the extent to which lunar block size distributions can be extrapolated to scales smaller than limits of resolution of direct measurement. This would seem to be a straightforward statistical application, but it is complicated by two issues. First, the cumulative size frequency distribution of observable boulders rolls over due to resolution limitations at the small end. Second, statistical regression provides the best fit only around the centroid of the data [4]. Confidence and prediction limits splay away from the best fit at the endpoints resulting in inferences in the boulder density at the CPR scale that can differ by many orders of magnitude [4]. These issues were originally investigated by Cintala and McBride [2] using Surveyor data. The objective of this study was to determine whether the measured block size distributions from Lunar Reconnaissance Orbiter Camera - Narrow Angle Camera (LROC-NAC) images (m-scale resolution) can be used to infer the block size distribution at length scales comparable to Mini-RF Circular Polarization Ratio (CPR) scales, nominally taken as 10 cm. This would set the stage for assessing correlations of inferred block size distributions with CPR returns [6].
Fixing the Big Bang Theory's Lithium Problem
NASA Astrophysics Data System (ADS)
Kohler, Susanna
2017-02-01
How did our universe come into being? The Big Bang theory is a widely accepted and highly successful cosmological model of the universe, but it does introduce one puzzle: the cosmological lithium problem. Have scientists now found a solution?
Too Much Lithium
In the Big Bang theory, the universe expanded rapidly from a very high-density and high-temperature state dominated by radiation. This theory has been validated again and again: the discovery of the cosmic microwave background radiation and observations of the large-scale structure of the universe both beautifully support the Big Bang theory, for instance. But one pesky trouble-spot remains: the abundance of lithium.
[Figure caption: The arrows show the primary reactions involved in Big Bang nucleosynthesis, and their flux ratios, as predicted by the authors' model, are given on the right. Synthesizing primordial elements is complicated! Hou et al. 2017]
According to Big Bang nucleosynthesis theory, primordial nucleosynthesis ran wild during the first half hour of the universe's existence. This produced most of the universe's helium and small amounts of other light nuclides, including deuterium and lithium. But while predictions match the observed primordial deuterium and helium abundances, Big Bang nucleosynthesis theory overpredicts the abundance of primordial lithium by about a factor of three. This inconsistency is known as the cosmological lithium problem, and attempts to resolve it using conventional astrophysics and nuclear physics over the past few decades have not been successful. In a recent publication led by Suqing Hou (Institute of Modern Physics, Chinese Academy of Sciences) and advisor Jianjun He (Institute of Modern Physics and National Astronomical Observatories, Chinese Academy of Sciences), however, a team of scientists has proposed an elegant solution to this problem.
[Figure caption: Time and temperature evolution of the abundances of primordial light elements during the beginning of the universe. The authors' model (dotted lines) successfully predicts a lower abundance of the beryllium isotope, which eventually decays into lithium, relative to the classical Maxwell-Boltzmann distribution (solid lines), without changing the predicted abundances of deuterium or helium. Hou et al. 2017]
Questioning Statistics
Hou and collaborators questioned a key assumption in Big Bang nucleosynthesis theory: that the nuclei involved in the process are all in thermodynamic equilibrium, and that their velocities, which determine the thermonuclear reaction rates, are described by the classical Maxwell-Boltzmann distribution. But do nuclei still obey this classical distribution in the extremely complex, fast-expanding Big Bang hot plasma? Hou and collaborators propose that the lithium nuclei don't, and that they must instead be described by a slightly modified version of the classical distribution, accounted for using what's known as non-extensive statistics. The authors show that, using the modified velocity distributions described by these statistics, they can successfully predict the observed primordial abundances of deuterium, helium, and lithium simultaneously. If this solution to the cosmological lithium problem is correct, the Big Bang theory is now one step closer to fully describing the formation of our universe.
Citation: S. Q. Hou et al 2017 ApJ 834 165. doi:10.3847/1538-4357/834/2/165
Quantifying the influence of sediment source area sampling on detrital thermochronometer data
NASA Astrophysics Data System (ADS)
Whipp, D. M., Jr.; Ehlers, T. A.; Coutand, I.; Bookhagen, B.
2014-12-01
Detrital thermochronology offers a unique advantage over traditional bedrock thermochronology because of its sensitivity to sediment production and transportation to sample sites. In mountainous regions, modern fluvial sediment is often collected and dated to determine the past (105 to >107 year) exhumation history of the upstream drainage area. Though potentially powerful, the interpretation of detrital thermochronometer data derived from modern fluvial sediment is challenging because of spatial and temporal variations in sediment production and transport, and target mineral concentrations. Thermochronometer age prediction models provide a quantitative basis for data interpretation, but it can be difficult to separate variations in catchment bedrock ages from the effects of variable basin denudation and sediment transport. We present two examples of quantitative data interpretation using detrital thermochronometer data from the Himalaya, focusing on the influence of spatial and temporal variations in basin denudation on predicted age distributions. We combine age predictions from the 3D thermokinematic numerical model Pecube with simple models for sediment sampling in the upstream drainage basin area to assess the influence of variations in sediment production by different geomorphic processes or scaled by topographic metrics. We first consider a small catchment from the central Himalaya where bedrock landsliding appears to have affected the observed muscovite 40Ar/39Ar age distributions. Using a simple model of random landsliding with a power-law landslide frequency-area relationship we find that the sediment residence time in the catchment has a major influence on predicted age distributions. In the second case, we compare observed detrital apatite fission-track age distributions from 16 catchments in the Bhutan Himalaya to ages predicted using Pecube and scaled by various topographic metrics. Preliminary results suggest that predicted age distributions scaled by the rock uplift rate in Pecube are statistically equivalent to the observed age distributions for ~75% of the catchments, but may improve when scaled by local relief or specific stream power weighted by satellite-derived precipitation. Ongoing work is exploring the effect of scaling by other topographic metrics.
A multiscale model for charge inversion in electric double layers
NASA Astrophysics Data System (ADS)
Mashayak, S. Y.; Aluru, N. R.
2018-06-01
Charge inversion is a widely observed phenomenon. It is a result of the rich statistical mechanics of the molecular interactions between ions, solvent, and charged surfaces near electric double layers (EDLs). Electrostatic correlations between ions and hydration interactions between ions and water molecules play a dominant role in determining the distribution of ions in EDLs. Due to highly polar nature of water, near a surface, an inhomogeneous and anisotropic arrangement of water molecules gives rise to pronounced variations in the electrostatic and hydration energies of ions. Classical continuum theories fail to accurately describe electrostatic correlations and molecular effects of water in EDLs. In this work, we present an empirical potential based quasi-continuum theory (EQT) to accurately predict the molecular-level properties of aqueous electrolytes. In EQT, we employ rigorous statistical mechanics tools to incorporate interatomic interactions, long-range electrostatics, correlations, and orientation polarization effects at a continuum-level. Explicit consideration of atomic interactions of water molecules is both theoretically and numerically challenging. We develop a systematic coarse-graining approach to coarse-grain interactions of water molecules and electrolyte ions from a high-resolution atomistic scale to the continuum scale. To demonstrate the ability of EQT to incorporate the water orientation polarization, ion hydration, and electrostatic correlations effects, we simulate confined KCl aqueous electrolyte and show that EQT can accurately predict the distribution of ions in a thin EDL and also predict the complex phenomenon of charge inversion.
Orbit and size distributions for asteroids temporarily captured by the Earth-Moon system
NASA Astrophysics Data System (ADS)
Fedorets, Grigori; Granvik, Mikael; Jedicke, Robert
2017-03-01
As a continuation of the work by Granvik et al. (2012), we expand the statistical treatment of Earth's temporarily-captured natural satellites from temporarily-captured orbiters (TCOs, i.e., objects which make at least one orbit around the Earth) to the newly redefined subpopulation of temporarily-captured flybys (TCFs). TCFs are objects that, while gravitationally bound, fail to make a complete orbit around the Earth during their geocentric passage, but nevertheless approach the Earth within its Hill radius. We follow the trajectories of massless test asteroids through the Earth-Moon system and record the orbital characteristics of those that are temporarily captured. We then carry out a steady-state analysis utilizing the novel NEO population model by Granvik et al. (2016). We also investigate how a quadratic distribution at very small values of e⊙ and i⊙ affects the predicted population statistics of Earth's temporarily-captured natural satellites. The steady-state population in both cases (constant and quadratic number distributions inside the e and i bins) is predicted to contain a slightly reduced number of meter-sized asteroids compared to the values of the previous paper. For the combined TCO/TCF population, we find the largest body constantly present on a geocentric orbit to be on the order of 80 cm in diameter. In the phase space where capture is possible, the capture efficiency of TCOs and TCFs is O(10^-6 to 10^-4). We also find that kilometer-scale asteroids are captured once every 10 Myr.
TH-A-9A-01: Active Optical Flow Model: Predicting Voxel-Level Dose Prediction in Spine SBRT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, J; Wu, Q.J.; Yin, F
2014-06-15
Purpose: To predict voxel-level dose distribution and enable effective evaluation of cord dose sparing in spine SBRT. Methods: We present an active optical flow model (AOFM) to statistically describe cord dose variations and train a predictive model to represent correlations between AOFM and PTV contours. Thirty clinically accepted spine SBRT plans are evenly divided into training and testing datasets. The development of the predictive model consists of 1) collecting a sequence of dose maps including PTV and OAR (spinal cord) as well as a set of associated PTV contours adjacent to OAR from the training dataset, 2) classifying data into five groups based on PTV's locations relative to OAR, two "Top"s, "Left", "Right", and "Bottom", 3) randomly selecting a dose map as the reference in each group and applying rigid registration and optical flow deformation to match all other maps to the reference, 4) building AOFM by importing optical flow vectors and dose values into the principal component analysis (PCA), 5) applying another PCA to features of PTV and OAR contours to generate an active shape model (ASM), and 6) computing a linear regression model of correlations between AOFM and ASM. When predicting the dose distribution of a new case in the testing dataset, the PTV is first assigned to a group based on its contour characteristics. Contour features are then transformed into ASM's principal coordinates of the selected group. Finally, voxel-level dose distribution is determined by mapping from the ASM space to the AOFM space using the predictive model. Results: The DVHs predicted by the AOFM-based model and those in clinical plans are comparable in training and testing datasets. At 2% volume the dose difference between predicted and clinical plans is 4.2±4.4% and 3.3±3.5% in the training and testing datasets, respectively. Conclusion: The AOFM is effective in predicting voxel-level dose distribution for spine SBRT. Partially supported by NIH/NCI under grant #R21CA161389 and a master research grant by Varian Medical System.
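As a hedged, heavily simplified illustration of the pipeline described above (PCA models of dose and of contour features linked by linear regression), the sketch below builds both PCA spaces on synthetic vectors and maps contour features to voxel doses. It is not the authors' AOFM implementation; the registration and optical-flow steps are omitted, and all shapes and dimensions are assumptions.

```python
# Hedged sketch: PCA model of dose maps + PCA model of contour features,
# linked by linear regression (a crude stand-in for the AOFM/ASM pipeline).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(11)
n_plans, n_voxels, n_shape_feats = 15, 4096, 40

# Synthetic training data: contour features drive dose through a hidden map.
shape_feats = rng.normal(size=(n_plans, n_shape_feats))
hidden = rng.normal(size=(n_shape_feats, n_voxels))
dose_maps = shape_feats @ hidden + 0.1 * rng.normal(size=(n_plans, n_voxels))

# "ASM": PCA on contour features; "AOFM": PCA on (already registered) dose maps.
asm = PCA(n_components=5).fit(shape_feats)
aofm = PCA(n_components=5).fit(dose_maps)

# Predictive model: linear regression from shape scores to dose scores.
reg = LinearRegression().fit(asm.transform(shape_feats), aofm.transform(dose_maps))

# Predict the voxel-level dose of a new case from its contour features alone.
new_shape = rng.normal(size=(1, n_shape_feats))
pred_dose = aofm.inverse_transform(reg.predict(asm.transform(new_shape)))
print("predicted dose map shape:", pred_dose.shape)   # (1, n_voxels)
```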
Jomah, N D; Ojo, J F; Odigie, E A; Olugasa, B O
2014-12-01
The post-civil war records of dog bite injuries (DBI) and rabies-like illness (RLI) among humans in Liberia are a vital epidemiological resource for developing a predictive model to guide the allocation of resources towards human rabies control. Although DBI and RLI are high, they are largely under-reported. The objective of this study was to develop a time model of the case-pattern and apply it to derive predictors of the time-trend point distribution of DBI-RLI cases. Retrospective data covering six years of DBI distribution among humans countrywide were converted to quarterly series using a transformation technique based on the Minimizing Squared First Difference statistic. The generated dataset was used to train a time-trend model of the DBI-RLI syndrome in Liberia. An additive deterministic time-trend model was selected due to its performance compared to a multiplicative model of trend and seasonal movement. Parameter predictors were estimated by the least squares method to predict DBI cases for a prospective four-year period, covering 2014-2017. The two-stage predictive model of DBI case-pattern between 2014 and 2017 was characterised by a uniform upward trend within Liberia's coastal and hinterland counties over the forecast period. This paper describes a translational application of the time-trend distribution pattern of DBI epidemics reported in Liberia during 2008-2013, on which a predictive model was developed. A computationally feasible two-stage time-trend permutation approach is proposed to estimate the time-trend parameters and conduct predictive inference on DBI-RLI in Liberia.
Soultan, Alaaeldin; Safi, Kamran
2017-01-01
Digitized species occurrence data provide an unprecedented source of information for ecologists and conservationists. Species distribution model (SDM) has become a popular method to utilise these data for understanding the spatial and temporal distribution of species, and for modelling biodiversity patterns. Our objective is to study the impact of noise in species occurrence data (namely sample size and positional accuracy) on the performance and reliability of SDM, considering the multiplicative impact of SDM algorithms, species specialisation, and grid resolution. We created a set of four 'virtual' species characterized by different specialisation levels. For each of these species, we built the suitable habitat models using five algorithms at two grid resolutions, with varying sample sizes and different levels of positional accuracy. We assessed the performance and reliability of the SDM according to classic model evaluation metrics (Area Under the Curve and True Skill Statistic) and model agreement metrics (Overall Concordance Correlation Coefficient and geographic niche overlap) respectively. Our study revealed that species specialisation had by far the most dominant impact on the SDM. In contrast to previous studies, we found that for widespread species, low sample size and low positional accuracy were acceptable, and useful distribution ranges could be predicted with as few as 10 species occurrences. Range predictions for narrow-ranged species, however, were sensitive to sample size and positional accuracy, such that useful distribution ranges required at least 20 species occurrences. Against expectations, the MAXENT algorithm poorly predicted the distribution of specialist species at low sample size.
Turner, Rebecca M; Jackson, Dan; Wei, Yinghui; Thompson, Simon G; Higgins, Julian P T
2015-01-01
Numerous meta-analyses in healthcare research combine results from only a small number of studies, for which the variance representing between-study heterogeneity is estimated imprecisely. A Bayesian approach to estimation allows external evidence on the expected magnitude of heterogeneity to be incorporated. The aim of this paper is to provide tools that improve the accessibility of Bayesian meta-analysis. We present two methods for implementing Bayesian meta-analysis, using numerical integration and importance sampling techniques. Based on 14 886 binary outcome meta-analyses in the Cochrane Database of Systematic Reviews, we derive a novel set of predictive distributions for the degree of heterogeneity expected in 80 settings depending on the outcomes assessed and comparisons made. These can be used as prior distributions for heterogeneity in future meta-analyses. The two methods are implemented in R, for which code is provided. Both methods produce equivalent results to standard but more complex Markov chain Monte Carlo approaches. The priors are derived as log-normal distributions for the between-study variance, applicable to meta-analyses of binary outcomes on the log odds-ratio scale. The methods are applied to two example meta-analyses, incorporating the relevant predictive distributions as prior distributions for between-study heterogeneity. We have provided resources to facilitate Bayesian meta-analysis, in a form accessible to applied researchers, which allow relevant prior information on the degree of heterogeneity to be incorporated. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:25475839
Beauregard, Frieda; de Blois, Sylvie
2014-01-01
Both climatic and edaphic conditions determine plant distribution; however, many species distribution models do not include edaphic variables, especially over large geographical extents. Using an exceptional database of vegetation plots (n = 4839) covering an extent of ∼55,000 km², we tested whether the inclusion of fine-scale edaphic variables would improve model predictions of plant distribution compared to models using only climate predictors. We also tested how well these edaphic variables could predict distribution on their own, to evaluate the assumption that at large extents, distribution is governed largely by climate. We also hypothesized that the relative contribution of edaphic and climatic data would vary among species depending on their growth forms and biogeographical attributes within the study area. We modelled 128 native plant species from diverse taxa using four statistical model types and three sets of abiotic predictors: climate, edaphic, and edaphic-climate. Model predictive accuracy and variable importance were compared among these models and for species' characteristics describing growth form, range boundaries within the study area, and prevalence. For many species both the climate-only and edaphic-only models performed well; however, the edaphic-climate models generally performed best. The three sets of predictors differed in the spatial information provided about habitat suitability, with climate models able to distinguish range edges, but edaphic models able to better distinguish within-range variation. Model predictive accuracy was generally lower for species without a range boundary within the study area and for common species, but these effects were buffered by including both edaphic and climatic predictors. The relative importance of edaphic and climatic variables varied with growth forms, with trees being more related to climate whereas lower growth forms were more related to edaphic conditions. Our study identifies the potential for non-climate aspects of the environment to pose a constraint to range expansion under climate change. PMID:24658097
NASA Astrophysics Data System (ADS)
Wellen, Christopher; Arhonditsis, George B.; Long, Tanya; Boyd, Duncan
2014-11-01
Spatially distributed nonpoint source watershed models are essential tools to estimate the magnitude and sources of diffuse pollution. However, little work has been undertaken to understand the sources and ramifications of the uncertainty involved in their use. In this study we conduct the first Bayesian uncertainty analysis of the water quality components of the SWAT model, one of the most commonly used distributed nonpoint source models. Working in Southern Ontario, we apply three Bayesian configurations for calibrating SWAT to Redhill Creek, an urban catchment, and Grindstone Creek, an agricultural one. We answer four interrelated questions: can SWAT determine suspended sediment sources with confidence when end-of-basin data are used for calibration? How does uncertainty propagate from the discharge submodel to the suspended sediment submodels? Do the estimated sediment sources vary when different calibration approaches are used? Can we combine the knowledge gained from different calibration approaches? We show that: (i) despite reasonable fit at the basin outlet, the simulated sediment sources are subject to uncertainty sufficient to undermine the typical approach of reliance on a single, best-fit simulation; (ii) more than a third of the uncertainty of sediment load predictions may stem from the discharge submodel; (iii) estimated sediment sources do vary significantly across the three statistical configurations of model calibration despite end-of-basin predictions being virtually identical; and (iv) Bayesian model averaging is an approach that can synthesize predictions when a number of adequate distributed models make divergent source apportionments. We conclude with recommendations for future research to reduce the uncertainty encountered when using distributed nonpoint source models for source apportionment.
A significant-loophole-free test of Bell's theorem with entangled photons
NASA Astrophysics Data System (ADS)
Giustina, Marissa; Versteegh, Marijn A. M.; Wengerowsky, Sören; Handsteiner, Johannes; Hochrainer, Armin; Phelan, Kevin; Steinlechner, Fabian; Kofler, Johannes; Larsson, Jan-Åke; Abellán, Carlos; Amaya, Waldimar; Mitchell, Morgan W.; Beyer, Jörn; Gerrits, Thomas; Lita, Adriana E.; Shalm, Lynden K.; Nam, Sae Woo; Scheidl, Thomas; Ursin, Rupert; Wittmann, Bernhard; Zeilinger, Anton
2017-10-01
John Bell's theorem of 1964 states that local elements of physical reality, existing independent of measurement, are inconsistent with the predictions of quantum mechanics (Bell, J. S., 1964, Physics (College Park, Md.)). Specifically, correlations between measurement results from distant entangled systems would be smaller than predicted by quantum physics. This is expressed in Bell's inequalities. Employing modifications of Bell's inequalities, many experiments have been performed that convincingly support the quantum predictions. Yet, all experiments rely on assumptions, which provide loopholes for a local realist explanation of the measurement. Here we report an experiment with polarization-entangled photons that simultaneously closes the most significant of these loopholes. We used a highly efficient source of entangled photons, distributed them over a distance of 58.5 meters, and implemented rapid random setting generation and high-efficiency detection to observe a violation of a Bell inequality with high statistical significance. The statistical probability of our results occurring under local realism is less than 3.74×10⁻³¹, corresponding to an 11.5-standard-deviation effect.
A cross-national analysis of how economic inequality predicts biodiversity loss.
Holland, Tim G; Peterson, Garry D; Gonzalez, Andrew
2009-10-01
We used socioeconomic models that included economic inequality to predict biodiversity loss, measured as the proportion of threatened plant and vertebrate species, across 50 countries. Our main goal was to evaluate whether economic inequality, measured as the Gini index of income distribution, improved the explanatory power of our statistical models. We compared four models that included the following: only population density, economic footprint (i.e., the size of the economy relative to the country area), economic footprint and income inequality (Gini index), and an index of environmental governance. We also tested the environmental Kuznets curve hypothesis, but it was not supported by the data. Statistical comparisons of the models revealed that the model including both economic footprint and inequality was the best predictor of threatened species. It significantly outperformed population density alone and the environmental governance model according to the Akaike information criterion. Inequality was a significant predictor of biodiversity loss and significantly improved the fit of our models. These results confirm that socioeconomic inequality is an important factor to consider when predicting rates of anthropogenic biodiversity loss.
Correlations between human mobility and social interaction reveal general activity patterns.
Mollgaard, Anders; Lehmann, Sune; Mathiesen, Joachim
2017-01-01
A day in the life of a person involves a broad range of activities which are common across many people. Going beyond diurnal cycles, a central question is: to what extent do individuals act according to patterns shared across an entire population? Here we investigate the interplay between different activity types, namely communication, motion, and physical proximity, by analyzing data collected from smartphones distributed among 638 individuals. We explore two central questions: Which underlying principles govern the formation of the activity patterns? Are the patterns specific to each individual or shared across the entire population? We find that statistics of the entire population allow us to successfully predict 71% of the activity and 85% of the inactivity involved in communication, mobility, and physical proximity. Surprisingly, individual-level statistics only result in marginally better predictions, indicating that a majority of activity patterns are shared across our sample population. Finally, we predict short-term activity patterns using a generalized linear model, which suggests that a simple linear description might be sufficient to explain a wide range of actions, whether they be of social or of physical character.
Renewal models and coseismic stress transfer in the Corinth Gulf, Greece, fault system
NASA Astrophysics Data System (ADS)
Console, Rodolfo; Falcone, Giuseppe; Karakostas, Vassilis; Murru, Maura; Papadimitriou, Eleftheria; Rhoades, David
2013-07-01
We model interevent times and Coulomb static stress transfer on the rupture segments along the Corinth Gulf extension zone, a region with a wealth of observations on strong-earthquake recurrence behavior. From the available information on past seismic activity, we have identified eight segments without significant overlap that are aligned along the southern boundary of the Corinth rift. We aim to test whether strong earthquakes on these segments are characterized by some kind of time-predictable behavior, rather than by complete randomness. The rationale for time-predictable behavior is based on the characteristic earthquake hypothesis, the necessary ingredients of which are a known faulting geometry and slip rate. The tectonic loading rate is characterized by slip of 6 mm/yr on the westernmost fault segment, diminishing to 4 mm/yr on the easternmost segment, based on the most reliable geodetic data. In this study, we employ statistical and physical modeling to account for stress transfer among these fault segments. The statistical modeling is based on the definition of a probability density distribution of the interevent times for each segment. Both the Brownian Passage-Time (BPT) and Weibull distributions are tested. The time-dependent hazard rate thus obtained is then modified by the inclusion of a permanent physical effect due to the Coulomb static stress change caused by failure of neighboring faults since the latest characteristic earthquake on the fault of interest. The validity of the renewal model is assessed retrospectively, using the data of the last 300 years, by comparison with a plain time-independent Poisson model, by means of statistical tools including the Relative Operating Characteristic diagram, the R-score, the probability gain and the log-likelihood ratio. We treat the uncertainties in the parameters of each examined fault source, such as linear dimensions, depth of the fault center, focal mechanism, recurrence time, coseismic slip, and aperiodicity of the statistical distribution, by a Monte Carlo technique. The Monte Carlo samples for all these parameters are drawn from a uniform distribution within their uncertainty limits. We find that the BPT and the Weibull renewal models yield comparable results, and both of them perform significantly better than the Poisson hypothesis. No clear performance enhancement is achieved by the introduction of the Coulomb static stress change into the renewal model.
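As an illustration of the renewal-model ingredient described above, the sketch below evaluates the time-dependent hazard of a BPT (inverse-Gaussian) recurrence model and compares it with the constant Poisson hazard; the recurrence interval and aperiodicity values are placeholders, not the Corinth Gulf estimates.

```python
import numpy as np

def bpt_pdf(t, mu, alpha):
    """Brownian passage-time (inverse Gaussian) density with mean recurrence
    interval mu and aperiodicity (coefficient of variation) alpha."""
    return np.sqrt(mu / (2 * np.pi * alpha**2 * t**3)) * \
        np.exp(-(t - mu)**2 / (2 * mu * alpha**2 * t))

def bpt_hazard(t, mu, alpha, dt=0.01):
    """Hazard rate h(t) = f(t) / (1 - F(t)); F is computed numerically on a
    fine grid (a sketch, adequate for illustration)."""
    grid = np.arange(dt, t.max() + dt, dt)
    cdf = np.cumsum(bpt_pdf(grid, mu, alpha)) * dt
    surv = np.interp(t, grid, 1.0 - cdf)
    return bpt_pdf(t, mu, alpha) / surv

mu, alpha = 150.0, 0.5            # placeholder values (years, dimensionless)
t = np.linspace(1.0, 400.0, 200)
h_bpt = bpt_hazard(t, mu, alpha)
h_poisson = np.full_like(t, 1.0 / mu)   # time-independent Poisson hazard
print("BPT hazard at t = mu:", h_bpt[np.argmin(abs(t - mu))])
print("Poisson hazard:", h_poisson[0])
```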
Image statistics underlying natural texture selectivity of neurons in macaque V4
Okazawa, Gouki; Tajima, Satohiro; Komatsu, Hidehiko
2015-01-01
Our daily visual experiences are inevitably linked to recognizing the rich variety of textures. However, how the brain encodes and differentiates a plethora of natural textures remains poorly understood. Here, we show that many neurons in macaque V4 selectively encode sparse combinations of higher-order image statistics to represent natural textures. We systematically explored neural selectivity in a high-dimensional texture space by combining texture synthesis and efficient-sampling techniques. This yielded parameterized models for individual texture-selective neurons. The models provided parsimonious but powerful predictors for each neuron’s preferred textures using a sparse combination of image statistics. As a whole population, the neuronal tuning was distributed in a way suitable for categorizing textures and quantitatively predicts human ability to discriminate textures. Together, we suggest that the collective representation of visual image statistics in V4 plays a key role in organizing the natural texture perception. PMID:25535362
Localization through surface folding in solid foams under compression.
Reis, P M; Corson, F; Boudaoud, A; Roman, B
2009-07-24
We report a combined experimental and theoretical study of the compression of a solid foam coated with a thin elastic film. Past a critical compression threshold, a pattern of localized folds emerges with a characteristic size that is imposed by an instability of the thin surface film. We perform optical surface measurements of the statistical properties of these localization zones and find that they are characterized by robust exponential tails in the strain distributions. Following a hybrid continuum and statistical approach, we develop a theory that accurately describes the nucleation and length scale of these structures and predicts the characteristic strains associated with the localized regions.
Statistical mechanics of Fermi-Pasta-Ulam chains with the canonical ensemble
NASA Astrophysics Data System (ADS)
Demirel, Melik C.; Sayar, Mehmet; Atılgan, Ali R.
1997-03-01
Low-energy vibrations of a Fermi-Pasta-Ulam-β (FPU-β) chain with 16 repeat units are analyzed with the aid of numerical experiments and the statistical mechanics equations of the canonical ensemble. Constant-temperature numerical integrations are performed by employing the cubic coupling scheme of Kusnezov et al. [Ann. Phys. 204, 155 (1990)]. Very good agreement is obtained between numerical results and theoretical predictions for the probability distributions of the generalized coordinates and momenta both of the chain and of the thermal bath. It is also shown that the average energy of the chain scales linearly with the bath temperature.
Kossobokov, V.G.; Romashkova, L.L.; Keilis-Borok, V. I.; Healy, J.H.
1999-01-01
Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first. Then, the areas of alarm are reduced by MSc at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8, and MSc correctly identified the locations of four of them. The space-time volume of the alarms is 36% and 18%, respectively, when estimated with a normalized product measure of empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5+, 10 out of 19 earthquakes were predicted by M8 in 40%, and five were predicted by M8-MSc in 13%, of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events doubled and all of them became exclusively normal or reverse faults. The predictions are fully reproducible; the algorithms M8 and MSc in complete formal definitions were published before we started our experiment [Keilis-Borok, V.I., Kossobokov, V.G., 1990. Premonitory activation of seismic flow: Algorithm M8, Phys. Earth and Planet. Inter. 61, 73-83; Kossobokov, V.G., Keilis-Borok, V.I., Smith, S.W., 1990. Localization of intermediate-term earthquake prediction, J. Geophys. Res., 95, 19763-19772; Healy, J.H., Kossobokov, V.G., Dewey, J.W., 1992. A test to evaluate the earthquake prediction algorithm, M8. U.S. Geol. Surv. OFR 92-401]. M8 is available from the IASPEI Software Library [Healy, J.H., Keilis-Borok, V.I., Lee, W.H.K. (Eds.), 1997. Algorithms for Earthquake Statistics and Prediction, Vol. 6. IASPEI Software Library]. © 1999 Elsevier Science B.V. All rights reserved.
Predictions of Solar Cycle 24: How are We Doing?
NASA Technical Reports Server (NTRS)
Pesnell, William D.
2016-01-01
Predictions of solar activity are an essential part of our Space Weather forecast capability. Users are requiring usable predictions of an upcoming solar cycle to be delivered several years before solar minimum. A set of predictions of the amplitude of Solar Cycle 24 accumulated in 2008 ranged from zero to unprecedented levels of solar activity. The predictions formed an almost normal distribution, centered on the average amplitude of all preceding solar cycles. The average of the current compilation of 105 predictions of the annual-average sunspot number is 106 +/- 31, slightly lower than earlier compilations but still with a wide distribution. Solar Cycle 24 is on track to have a below-average amplitude, peaking at an annual sunspot number of about 80. Our need for solar activity predictions and our desire for those predictions to be made ever earlier in the preceding solar cycle will be discussed. Solar Cycle 24 has been a below-average sunspot cycle. There were peaks in the daily and monthly averaged sunspot number in the Northern Hemisphere in 2011 and in the Southern Hemisphere in 2014. With the rapid increase in solar data and capability of numerical models of the solar convection zone we are developing the ability to forecast the level of the next sunspot cycle. But predictions based only on the statistics of the sunspot number are not adequate for predicting the next solar maximum. I will describe how we did in predicting the amplitude of Solar Cycle 24 and describe how solar polar field predictions could be made more accurate in the future.
Gorman, Julian; Pearson, Diane; Whitehead, Peter
2008-01-01
Information on distribution and relative abundance of species is integral to sustainable management, especially if they are to be harvested for subsistence or commerce. In northern Australia, natural landscapes are vast, centers of population few, access is difficult, and Aboriginal resource centers and communities have limited funds and infrastructure. Consequently, defining distribution and relative abundance by comprehensive ground survey is difficult and expensive. This highlights the need for simple, cheap, automated methodologies to predict the distribution of species in use, or having potential for use, in commercial enterprise. The technique applied here uses a Geographic Information System (GIS) to make predictions of probability of occurrence using an inductive modeling technique based on Bayes' theorem. The study area is in the Maningrida region, central Arnhem Land, in the Northern Territory, Australia. The species examined, Cycas arnhemica and Brachychiton diversifolius, are currently being 'wild harvested' in commercial trials, involving sale of decorative plants and use as carving wood, respectively. This study involved limited and relatively simple ground surveys requiring approximately 7 days of effort for each species. The overall model performance was evaluated using Cohen's kappa statistic. The predictive ability of the model for C. arnhemica was classified as moderate and for B. diversifolius as fair. The difference in model performance can be attributed to the pattern of distribution of these species. C. arnhemica tends to occur in a clumped distribution due to relatively short-distance dispersal of its large seeds and vegetative growth from long-lived rhizomes, while B. diversifolius seeds are smaller and more widely dispersed across the landscape. The output from analysis predicts trends in species distribution that are consistent with independent on-site sampling for each species and therefore should prove useful in gauging the extent of resource availability. However, some caution needs to be applied as the models tend to overpredict presence, which is a function of distribution patterns and of other variables operating in the landscape, such as fire histories, which were not included in the model due to limited availability of data.
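The GIS implementation is not reproduced here; the following minimal sketch illustrates the Bayes'-theorem update the abstract refers to, estimating the probability of species presence in a map cell from the frequency of an environmental class among presence records versus the whole landscape (all numbers and names are illustrative).

```python
def presence_probability(prior_presence, p_class_given_presence, p_class_overall):
    """Bayes' theorem: P(presence | class) =
    P(class | presence) * P(presence) / P(class)."""
    return p_class_given_presence * prior_presence / p_class_overall

# Illustrative numbers: 5% of cells hold the species overall; the mapped
# vegetation class covers 20% of the landscape but 60% of presence records.
print(presence_probability(0.05, 0.60, 0.20))   # -> 0.15
```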
Quasi-Static Probabilistic Structural Analyses Process and Criteria
NASA Technical Reports Server (NTRS)
Goldberg, B.; Verderaime, V.
1999-01-01
Current deterministic structural methods are easily applied to substructures and components, and analysts have built great design insights and confidence in them over the years. However, deterministic methods cannot support systems risk analyses, and it was recently reported that deterministic treatment of statistical data is inconsistent with error propagation laws, which can result in unevenly conservative structural predictions. Assuming normal distributions and using statistical data formats throughout the prevailing deterministic stress processes leads to a safety factor in statistical format which, integrated into the safety index, provides a safety-factor and first-order reliability relationship. The embedded safety factor in the safety index expression allows a historically based risk to be determined and verified over a variety of quasi-static metallic substructures, consistent with the traditional safety factor methods and NASA Std. 5001 criteria.
NASA Technical Reports Server (NTRS)
Wilson, Robert M.
2014-01-01
This Technical Publication (TP) is part 2 of a two-part study of the North Atlantic basin tropical cyclones that occurred during the weather satellite era, 1960-2013. In particular, this TP examines the inferred statistical relationships between 25 tropical cyclone parameters and 9 specific climate-related factors, including the (1) Oceanic Niño Index (ONI), (2) Southern Oscillation Index (SOI), (3) Atlantic Multidecadal Oscillation (AMO) index, (4) Quasi-Biennial Oscillation (QBO) index, (5) North Atlantic Oscillation (NAO) index of the Climate Prediction Center (CPC), (6) NAO index of the Climate Research Unit (CRU), (7) Armagh surface air temperature (ASAT), (8) Global Land-Ocean Temperature Index (GLOTI), and (9) Mauna Loa carbon dioxide (CO2) (MLCO2) index. Part 1 of this two-part study examined the statistical aspects of the 25 tropical cyclone parameters (e.g., frequencies, peak wind speed (PWS), accumulated cyclone energy (ACE), etc.) and provided the results of statistical testing (i.e., runs-testing, the t-statistic for independent samples, and Poisson distributions). Also, the study gave predictions for the frequencies of the number of tropical cyclones (NTC), number of hurricanes (NH), number of major hurricanes (NMH), and number of United States land-falling hurricanes (NUSLFH) expected for the 2014 season, based on the statistics of the overall interval 1960-2013, the subinterval 1995-2013, and whether the year 2014 would be either an El Niño year (ENY) or a non-El Niño year (NENY).
Magnification Bias in Gravitational Arc Statistics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Caminha, G. B.; Estrada, J.; Makler, M.
2013-08-29
The statistics of gravitational arcs in galaxy clusters is a powerful probe of cluster structure and may provide complementary cosmological constraints. Despite recent progress, discrepancies still remain between modelling and observations of arc abundance, especially regarding the redshift distribution of strong lensing clusters. Besides, fast "semi-analytic" methods still have to incorporate the success obtained with simulations. In this paper we discuss the contribution of the magnification in gravitational arc statistics. Although lensing conserves surface brightness, the magnification increases the signal-to-noise ratio of the arcs, enhancing their detectability. We present an approach to include this and other observational effects in semi-analytic calculations for arc statistics. The cross section for arc formation (σ) is computed through a semi-analytic method based on the ratio of the eigenvalues of the magnification tensor. Using this approach we obtained the scaling of σ with respect to the magnification, and other parameters, allowing for a fast computation of the cross section. We apply this method to evaluate the expected number of arcs per cluster using an elliptical Navarro-Frenk-White matter distribution. Our results show that the magnification has a strong effect on the arc abundance, enhancing the fraction of arcs, moving the peak of the arc fraction to higher redshifts, and softening its decrease at high redshifts. We argue that the effect of magnification should be included in arc statistics modelling and that it could help to reconcile arc statistics predictions with the observational data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nose, Y.
Methods were developed for generating an integrated, statistical model of the anatomical structures within the human thorax relevant to radioisotope powered artificial heart implantation. These methods involve measurement and analysis of anatomy in four areas: chest wall, pericardium, vascular connections, and great vessels. A model for the prediction of thorax outline from radiograms was finalized. These models were combined with 100 radiograms to arrive at a size distribution representing the adult male and female populations. (CH)
Predicting Macroscale Effects Through Nanoscale Features
2012-01-01
errors become incorrectly computed by the basic OLS technique. To test for the presence of heteroscedasticity, the Breusch-Pagan/Cook-Weisberg test ... is employed, with the test statistic distributed as χ² with degrees of freedom equal to the number of regressors. The Breusch-Pagan/Cook... between shock sensitivity and Sm does not exhibit any heteroscedasticity. The Breusch-Pagan/Cook-Weisberg test provides χ²(1) = 1.73, which
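For reference, the Breusch-Pagan test is available in statsmodels; a minimal sketch with synthetic data (variable names and values are illustrative, not the study's):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Illustrative data: one regressor whose error variance grows with x,
# so the test should flag heteroscedasticity.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(scale=0.2 + 0.3 * x)   # non-constant variance

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()

# Returns (LM statistic, LM p-value, F statistic, F p-value); the LM statistic
# is asymptotically chi-squared with df = number of regressors (here 1).
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(ols.resid, X)
print(f"chi2(1) = {lm_stat:.2f}, p = {lm_pvalue:.3f}")
```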
A prediction model of signal degradation in LMSS for urban areas
NASA Technical Reports Server (NTRS)
Matsudo, Takashi; Minamisono, Kenichi; Karasawa, Yoshio; Shiokawa, Takayasu
1993-01-01
A prediction model of signal degradation in a Land Mobile Satellite Service (LMSS) for urban areas is proposed. This model treats shadowing effects caused by buildings statistically and can predict a Cumulative Distribution Function (CDF) of signal diffraction losses in urban areas as a function of system parameters such as frequency and elevation angle and environmental parameters such as number of building stories and so on. In order to examine the validity of the model, we compared the percentage of locations where diffraction losses were smaller than 6 dB obtained by the CDF with satellite visibility measured by a radiometer. As a result, it was found that this proposed model is useful for estimating the feasibility of providing LMSS in urban areas.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sparks, R.B.; Aydogan, B.
In the development of new radiopharmaceuticals, animal studies are typically performed to get a first approximation of the expected radiation dose in humans. This study evaluates the performance of some commonly used data extrapolation techniques to predict residence times in humans using data collected from animals. Residence times were calculated using animal and human data, and distributions of ratios of the animal results to human results were constructed for each extrapolation method. Four methods using animal data to predict human residence times were examined: (1) using no extrapolation, (2) using relative organ mass extrapolation, (3) using physiological time extrapolation, and (4) using a combination of the mass and time methods. The residence time ratios were found to be log-normally distributed for the nonextrapolated and extrapolated data sets. The use of relative organ mass extrapolation yielded no statistically significant change in the geometric mean or variance of the residence time ratios as compared to using no extrapolation. Physiologic time extrapolation yielded a statistically significant improvement (p < 0.01, paired t test) in the geometric mean of the residence time ratio from 0.5 to 0.8. Combining mass and time methods did not significantly improve the results of using time extrapolation alone. 63 refs., 4 figs., 3 tabs.
Coherent wave transmission in quasi-one-dimensional systems with Lévy disorder
NASA Astrophysics Data System (ADS)
Amanatidis, Ilias; Kleftogiannis, Ioannis; Falceto, Fernando; Gopar, Víctor A.
2017-12-01
We study the random fluctuations of the transmission in disordered quasi-one-dimensional systems such as disordered waveguides and/or quantum wires whose random configurations of disorder are characterized by density distributions with a long tail known as Lévy distributions. The presence of Lévy disorder leads to large fluctuations of the transmission and anomalous localization, in relation to the standard exponential localization (Anderson localization). We calculate the complete distribution of the transmission fluctuations for a different number of transmission channels in the presence and absence of time-reversal symmetry. Significant differences in the transmission statistics between disordered systems with Anderson and anomalous localizations are revealed. The theoretical predictions are independently confirmed by tight-binding numerical simulations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shortis, M.; Johnston, G.
1997-11-01
In a previous paper, the results of photogrammetric measurements of a number of paraboloidal reflecting surfaces were presented. These results showed that photogrammetry can provide three-dimensional surface characterizations of such solar concentrators. The present paper describes the assessment of the quality of these surfaces as a derivation of the photogrammetrically produced surface coordinates. Statistical analysis of the z-coordinate distribution of errors indicates that these generally conform to a univariate Gaussian distribution, while the numerical assessment of the surface normal vectors on these surfaces indicates that the surface normal deviations appear to follow an approximately bivariate Gaussian distribution. Ray tracing of the measured surfaces to predict the expected flux distribution at the focal point of the 400 m² dish shows a close correlation with the videographically measured flux distribution at the focal point of the dish.
Statistics of Shared Components in Complex Component Systems
NASA Astrophysics Data System (ADS)
Mazzolini, Andrea; Gherardi, Marco; Caselle, Michele; Cosentino Lagomarsino, Marco; Osella, Matteo
2018-04-01
Many complex systems are modular. Such systems can be represented as "component systems," i.e., sets of elementary components, such as LEGO bricks in LEGO sets. The bricks found in a LEGO set reflect a target architecture, which can be built following a set-specific list of instructions. In other component systems, instead, the underlying functional design and constraints are not obvious a priori, and their detection is often a challenge of both scientific and practical importance, requiring a clear understanding of component statistics. Importantly, some quantitative invariants appear to be common to many component systems, most notably a common broad distribution of component abundances, which often resembles the well-known Zipf's law. Such "laws" affect in a general and nontrivial way the component statistics, potentially hindering the identification of system-specific functional constraints or generative processes. Here, we specifically focus on the statistics of shared components, i.e., the distribution of the number of components shared by different system realizations, such as the common bricks found in different LEGO sets. To account for the effects of component heterogeneity, we consider a simple null model, which builds system realizations by random draws from a universe of possible components. Under general assumptions on abundance heterogeneity, we provide analytical estimates of component occurrence, which quantify exhaustively the statistics of shared components. Surprisingly, this simple null model can positively explain important features of empirical component-occurrence distributions obtained from large-scale data on bacterial genomes, LEGO sets, and book chapters. Specific architectural features and functional constraints can be detected from occurrence patterns as deviations from these null predictions, as we show for the illustrative case of the "core" genome in bacteria.
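A minimal sketch of the null model described above, assuming Zipf-distributed component abundances and realizations built by weighted draws without replacement; all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

n_components = 1000        # size of the component universe
n_realizations = 200       # number of "sets" (e.g., genomes, LEGO sets)
set_size = 50              # components drawn per realization

# Zipf-like abundance heterogeneity: probability of drawing component k ~ 1/k.
weights = 1.0 / np.arange(1, n_components + 1)
weights /= weights.sum()

# Build realizations by weighted draws without replacement and count, for each
# component, in how many realizations it occurs (the occurrence distribution).
occurrence = np.zeros(n_components, dtype=int)
for _ in range(n_realizations):
    chosen = rng.choice(n_components, size=set_size, replace=False, p=weights)
    occurrence[chosen] += 1

# Distribution of the number of components shared by exactly k realizations.
shared_counts = np.bincount(occurrence, minlength=n_realizations + 1)
print(shared_counts[:10])
```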
Asymptotic Distribution of ΔAUC, NRIs, and IDI Based on Theory of U-Statistics
Demler, Olga V.; Pencina, Michael J.; Cook, Nancy R.; D’Agostino, Ralph B.
2017-01-01
The change in AUC (ΔAUC), the IDI, and NRI are commonly used measures of risk prediction model performance. Some authors have reported good validity of associated methods of estimating their standard errors (SE) and construction of confidence intervals, whereas others have questioned their performance. To address these issues we unite the ΔAUC, IDI, and three versions of the NRI under the umbrella of the U-statistics family. We rigorously show that the asymptotic behavior of ΔAUC, NRIs, and IDI fits the asymptotic distribution theory developed for U-statistics. We prove that the ΔAUC, NRIs, and IDI are asymptotically normal, unless they compare nested models under the null hypothesis. In the latter case, asymptotic normality and existing SE estimates cannot be applied to ΔAUC, NRIs, or IDI. In the former case SE formulas proposed in the literature are equivalent to SE formulas obtained from U-statistics theory if we ignore adjustment for estimated parameters. We use Sukhatme-Randles-deWet condition to determine when adjustment for estimated parameters is necessary. We show that adjustment is not necessary for SEs of the ΔAUC and two versions of the NRI when added predictor variables are significant and normally distributed. The SEs of the IDI and three-category NRI should always be adjusted for estimated parameters. These results allow us to define when existing formulas for SE estimates can be used and when resampling methods such as the bootstrap should be used instead when comparing nested models. We also use the U-statistic theory to develop a new SE estimate of ΔAUC. PMID:28627112
Statistical time-dependent model for the interstellar gas
NASA Technical Reports Server (NTRS)
Gerola, H.; Kafatos, M.; Mccray, R.
1974-01-01
We present models for temperature and ionization structure of low, uniform-density (approximately 0.3 per cu cm) interstellar gas in a galactic disk which is exposed to soft X rays from supernova outbursts occurring randomly in space and time. The structure was calculated by computing the time record of temperature and ionization at a given point by Monte Carlo simulation. The calculation yields probability distribution functions for ionized fraction, temperature, and their various observable moments. These time-dependent models predict a bimodal temperature distribution of the gas that agrees with various observations. Cold regions in the low-density gas may have the appearance of clouds in 21-cm absorption. The time-dependent model, in contrast to the steady-state model, predicts large fluctuations in ionization rate and the existence of cold (approximately 30 K), ionized (ionized fraction equal to about 0.1) regions.
Statistical models and time series forecasting of sulfur dioxide: a case study Tehran.
Hassanzadeh, S; Hosseinibalam, F; Alizadeh, R
2009-08-01
This study performed a time-series analysis, frequency distribution and prediction of SO2 levels for five stations (Pardisan, Vila, Azadi, Gholhak and Bahman) in Tehran for the period of 2000-2005. Most sites show quite similar characteristics, with highest pollution in autumn-winter and least pollution in spring-summer. The frequency distributions show higher peaks at the two residential sites. The potential for SO2 problems is high because of high emissions and the close geographical proximity of the major industrial and urban centers. The ACF and PACF are nonzero for several lags, indicating a mixed autoregressive-moving average (ARMA) model; an ARMA model was therefore used to forecast SO2 at the Bahman station. The partial autocorrelations become close to 0 after about 5 lags while the autocorrelations remain strong through all the lags shown. The results showed that an ARMA(2,2) model can provide reliable, satisfactory predictions for the time series.
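An ARMA(2,2) model of the kind used at the Bahman station can be fitted, for example, with statsmodels (ARIMA with no differencing); the series below is synthetic, not the Tehran data.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Synthetic SO2-like series (placeholder for the station data).
rng = np.random.default_rng(42)
n = 500
noise = rng.normal(size=n)
series = np.zeros(n)
for t in range(2, n):
    series[t] = 0.6 * series[t - 1] - 0.2 * series[t - 2] \
        + noise[t] + 0.4 * noise[t - 1] + 0.1 * noise[t - 2]
so2 = pd.Series(series + 30.0)   # shift to a plausible concentration level

# ARMA(2,2) is ARIMA with no differencing: order = (p, d, q) = (2, 0, 2).
model = ARIMA(so2, order=(2, 0, 2)).fit()
print(model.summary().tables[1])
forecast = model.forecast(steps=24)   # next 24 time steps
print(forecast.head())
```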
Oblique H.F. radiowave propagation in the main trough region of the ionosphere
NASA Astrophysics Data System (ADS)
Lockwood, M.; Mitchell, V. B.
1980-12-01
The propagation of 7.335 MHz CW signals over a 5212 km subauroral, west-east path is studied. Measurements and semiempirical predictions are made of the amplitude distributions and Doppler shifts of the received signals. The observed amplitude distribution is fitted with a numerical fading model, yielding the power losses suffered by the signals during propagation via the predominating modes. The mid-latitude trough in the F2 peak ionization density is predicted by a statistical model to be at the latitudes of this path at these times and at low Kp values; a sharp cut-off in low power losses at a mean Kp of 2.75 strongly implicates the trough in the propagation of these signals. It is shown that a simple extension of this model to allow for the trough can reproduce the form of the observed diurnal variation.
Leung, Janice M; Malagoli, Andrea; Santoro, Antonella; Besutti, Giulia; Ligabue, Guido; Scaglioni, Riccardo; Dai, Darlene; Hague, Cameron; Leipsic, Jonathon; Sin, Don D.; Man, SF Paul; Guaraldi, Giovanni
2016-01-01
Background Chronic obstructive pulmonary disease (COPD) and emphysema are common amongst patients with human immunodeficiency virus (HIV). We sought to determine the clinical factors that are associated with emphysema progression in HIV. Methods 345 HIV-infected patients enrolled in an outpatient HIV metabolic clinic with ≥2 chest computed tomography scans made up the study cohort. Images were qualitatively scored for emphysema based on percentage involvement of the lung. Emphysema progression was defined as any increase in emphysema score over the study period. Univariate analyses of clinical, respiratory, and laboratory data, as well as multivariable logistic regression models, were performed to determine clinical features significantly associated with emphysema progression. Results 17.4% of the cohort were emphysema progressors. Emphysema progression was most strongly associated with having a low baseline diffusion capacity of carbon monoxide (DLCO) and having combination centrilobular and paraseptal emphysema distribution. In adjusted models, the odds ratio (OR) for emphysema progression for every 10% increase in DLCO percent predicted was 0.58 (95% confidence interval [CI] 0.41–0.81). The equivalent OR (95% CI) for centrilobular and paraseptal emphysema distribution was 10.60 (2.93–48.98). Together, these variables had an area under the curve (AUC) statistic of 0.85 for predicting emphysema progression. This was an improvement over the performance of spirometry (forced expiratory volume in 1 second to forced vital capacity ratio), which predicted emphysema progression with an AUC of only 0.65. Conclusion Combined paraseptal and centrilobular emphysema distribution and low DLCO could identify HIV patients who may experience emphysema progression. PMID:27902753
Foraging optimally for home ranges
Mitchell, Michael S.; Powell, Roger A.
2012-01-01
Economic models predict behavior of animals based on the presumption that natural selection has shaped behaviors important to an animal's fitness to maximize benefits over costs. Economic analyses have shown that territories of animals are structured by trade-offs between benefits gained from resources and costs of defending them. Intuitively, home ranges should be similarly structured, but trade-offs are difficult to assess because there are no costs of defense, thus economic models of home-range behavior are rare. We present economic models that predict how home ranges can be efficient with respect to spatially distributed resources, discounted for travel costs, under 2 strategies of optimization, resource maximization and area minimization. We show how constraints such as competitors can influence structure of homes ranges through resource depression, ultimately structuring density of animals within a population and their distribution on a landscape. We present simulations based on these models to show how they can be generally predictive of home-range behavior and the mechanisms that structure the spatial distribution of animals. We also show how contiguous home ranges estimated statistically from location data can be misleading for animals that optimize home ranges on landscapes with patchily distributed resources. We conclude with a summary of how we applied our models to nonterritorial black bears (Ursus americanus) living in the mountains of North Carolina, where we found their home ranges were best predicted by an area-minimization strategy constrained by intraspecific competition within a social hierarchy. Economic models can provide strong inference about home-range behavior and the resources that structure home ranges by offering falsifiable, a priori hypotheses that can be tested with field observations.
Spatial Ensemble Postprocessing of Precipitation Forecasts Using High Resolution Analyses
NASA Astrophysics Data System (ADS)
Lang, Moritz N.; Schicker, Irene; Kann, Alexander; Wang, Yong
2017-04-01
Ensemble prediction systems are designed to account for errors or uncertainties in the initial and boundary conditions, imperfect parameterizations, etc. However, due to sampling errors and underestimation of the model errors, these ensemble forecasts tend to be underdispersive, and to lack both reliability and sharpness. To overcome such limitations, statistical postprocessing methods are commonly applied to these forecasts. In this study, a full-distributional spatial postprocessing method is applied to short-range precipitation forecasts over Austria using Standardized Anomaly Model Output Statistics (SAMOS). Following Stauffer et al. (2016), observation and forecast fields are transformed into standardized anomalies by subtracting a site-specific climatological mean and dividing by the climatological standard deviation. Because only a single regression model needs to be fitted for the whole domain, the SAMOS framework provides a computationally inexpensive method to create operationally calibrated probabilistic forecasts for any arbitrary location or for all grid points in the domain simultaneously. Taking advantage of the INCA system (Integrated Nowcasting through Comprehensive Analysis), high-resolution analyses are used for the computation of the observed climatology and for model training. The INCA system operationally combines station measurements and remote sensing data into real-time objective analysis fields at 1 km horizontal resolution and 1 h temporal resolution. The precipitation forecast used in this study is obtained from a limited-area model ensemble prediction system also operated by ZAMG. The so-called ALADIN-LAEF provides, by applying a multi-physics approach, a 17-member forecast at a horizontal resolution of 10.9 km and a temporal resolution of 1 hour. The SAMOS approach thus statistically combines the in-house high-resolution analysis and the ensemble prediction system. The station-based validation of 6-hour precipitation sums shows a mean improvement of more than 40% in CRPS when compared to bilinearly interpolated uncalibrated ensemble forecasts. The validation on randomly selected grid points, representing the true height distribution over Austria, still indicates a mean improvement of 35%. The applied statistical model is currently set up for 6-hourly and daily accumulation periods, but will be extended to a temporal resolution of 1-3 hours within a new probabilistic nowcasting system operated by ZAMG.
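The core SAMOS preprocessing step described above, transforming fields into standardized anomalies and back, can be sketched as follows (array names and climatology values are illustrative, not the operational ZAMG code):

```python
import numpy as np

def to_standardized_anomalies(field, clim_mean, clim_std):
    """Transform an observation or forecast field into standardized anomalies
    using site-specific climatological means and standard deviations."""
    return (field - clim_mean) / clim_std

def from_standardized_anomalies(anom, clim_mean, clim_std):
    """Back-transform calibrated anomalies to physical units."""
    return anom * clim_std + clim_mean

# Illustrative 1 km grid of 6 h precipitation (mm) and its climatology.
rng = np.random.default_rng(3)
forecast = rng.gamma(shape=2.0, scale=1.5, size=(100, 100))
clim_mean = np.full((100, 100), 3.0)
clim_std = np.full((100, 100), 2.0)

anomalies = to_standardized_anomalies(forecast, clim_mean, clim_std)
# ... a single regression model would then be fitted to anomalies over the
# whole domain, and calibrated predictions back-transformed to amounts.
recovered = from_standardized_anomalies(anomalies, clim_mean, clim_std)
assert np.allclose(recovered, forecast)
```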
Dai, Wenrui; Xiong, Hongkai; Jiang, Xiaoqian; Chen, Chang Wen
2014-01-01
This paper proposes a novel model on intra coding for High Efficiency Video Coding (HEVC), which simultaneously predicts blocks of pixels with optimal rate distortion. It utilizes the spatial statistical correlation for the optimal prediction based on 2-D contexts, in addition to formulating the data-driven structural interdependences to make the prediction error coherent with the probability distribution, which is desirable for successful transform and coding. The structured set prediction model incorporates a max-margin Markov network (M3N) to regulate and optimize multiple block predictions. The model parameters are learned by discriminating the actual pixel value from other possible estimates to maximize the margin (i.e., decision boundary bandwidth). Compared to existing methods that focus on minimizing prediction error, the M3N-based model adaptively maintains the coherence for a set of predictions. Specifically, the proposed model concurrently optimizes a set of predictions by associating the loss for individual blocks to the joint distribution of succeeding discrete cosine transform coefficients. When the sample size grows, the prediction error is asymptotically upper bounded by the training error under the decomposable loss function. As an internal step, we optimize the underlying Markov network structure to find states that achieve the maximal energy using expectation propagation. For validation, we integrate the proposed model into HEVC for optimal mode selection on rate-distortion optimization. The proposed prediction model obtains up to 2.85% bit rate reduction and achieves better visual quality in comparison to the HEVC intra coding. PMID:25505829
NASA Astrophysics Data System (ADS)
Straub, K. M.; Ganti, V. K.; Paola, C.; Foufoula-Georgiou, E.
2010-12-01
Stratigraphy preserved in alluvial basins houses the most complete record of information necessary to reconstruct past environmental conditions. Indeed, the character of the sedimentary record is inextricably related to the surface processes that formed it. In this presentation we explore how the signals of surface processes are recorded in stratigraphy through the use of physical and numerical experiments. We focus on linking surface processes to stratigraphy in 1D by quantifying the probability distributions of processes that govern the evolution of depositional systems to the probability distribution of preserved bed thicknesses. In this study we define a bed as a package of sediment bounded above and below by erosional surfaces. In a companion presentation we document heavy-tailed statistics of erosion and deposition from high-resolution temporal elevation data recorded during a controlled physical experiment. However, the heavy tails in the magnitudes of erosional and depositional events are not preserved in the experimental stratigraphy. Similar to many bed thickness distributions reported in field studies we find that an exponential distribution adequately describes the thicknesses of beds preserved in our experiment. We explore the generation of exponential bed thickness distributions from heavy-tailed surface statistics using 1D numerical models. These models indicate that when the full distribution of elevation fluctuations (both erosional and depositional events) is symmetrical, the resulting distribution of bed thicknesses is exponential in form. Finally, we illustrate that a predictable relationship exists between the coefficient of variation of surface elevation fluctuations and the scale-parameter of the resulting exponential distribution of bed thicknesses.
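A minimal 1D sketch of the kind of numerical experiment described: surface elevation evolves by symmetric, heavy-tailed random increments, the stratigraphic filter keeps only sediment below the running minimum of all later elevations, and bed thicknesses are measured between truncation surfaces. The operational definitions and parameter values here are ours, chosen for illustration, not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(7)

n_steps = 50_000
mean_agg = 0.02        # small net aggradation keeps long-term deposition
# Symmetric, heavy-tailed elevation increments (erosion when negative).
increments = mean_agg + 0.5 * rng.standard_t(df=3, size=n_steps)
elevation = np.cumsum(increments)

# Stratigraphic filter: only sediment below the running minimum of all later
# surface elevations survives in the final deposit.
preserved = np.minimum.accumulate(elevation[::-1])[::-1]

# Erosional bounding surfaces sit where the surface at time t was later
# truncated (its elevation exceeds what is ultimately preserved).
truncated = elevation > preserved + 1e-12

# Bed thicknesses: preserved aggradation accumulated between truncations.
bed_thicknesses = []
current = 0.0
for t in range(1, n_steps):
    current += max(preserved[t] - preserved[t - 1], 0.0)
    if truncated[t] and current > 0.0:
        bed_thicknesses.append(current)
        current = 0.0
if current > 0.0:
    bed_thicknesses.append(current)

bed_thicknesses = np.array(bed_thicknesses)
# An exponential distribution has a coefficient of variation near 1.
print("number of beds:", bed_thicknesses.size)
print("CV of bed thickness:", bed_thicknesses.std() / bed_thicknesses.mean())
```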
Species Distribution Modelling of Aedes aegypti in two dengue-endemic regions of Pakistan.
Fatima, Syeda Hira; Atif, Salman; Rasheed, Syed Basit; Zaidi, Farrah; Hussain, Ejaz
2016-03-01
Statistical tools are effectively used to determine the distribution of mosquitoes and to make ecological inferences about vector-borne disease dynamics. In this study, we utilised species distribution models to understand spatial patterns of Aedes aegypti in two dengue-prevalent regions of Pakistan, Lahore and Swat. Species distribution models can potentially indicate the probability of suitability of Ae. aegypti once introduced to new regions like Swat, where invasion of this species is a recent phenomenon. The distribution of Ae. aegypti was determined by applying the MaxEnt algorithm to a set of potential environmental factors and species sample records. The ecological dependency of the species on each environmental variable was analysed using response curves. We quantified the statistical performance of the models based on accuracy assessment and spatial predictions. Our results suggest that Ae. aegypti is widely distributed in Lahore. Human population density and urban infrastructure are primarily responsible for the greater probability of mosquito occurrence in this region. In Swat, Ae. aegypti has a clumped distribution, where urban patches provide refuge to the species in an otherwise hostile heterogeneous environment, and road networks are assumed to have facilitated the passive dispersal of the species. In Pakistan, Ae. aegypti is expanding its range northwards; this could be associated with rapid urbanisation, trade and travel. The main implication of this expansion is that more people are at risk of dengue fever in the northern highlands of Pakistan. © 2016 John Wiley & Sons Ltd.
Analysis of vector wind change with respect to time for Cape Kennedy, Florida
NASA Technical Reports Server (NTRS)
Adelfang, S. I.
1978-01-01
Multivariate analysis was used to determine the joint distribution of the four variables represented by the components of the wind vector at an initial time and after a specified elapsed time; this joint distribution is hypothesized to be quadravariate normal, and its fourteen statistics, calculated from 15 years of twice-daily rawinsonde data, are presented by monthly reference periods for altitudes from 0 to 27 km. The hypotheses that the wind component change with respect to time is univariate normal, that the joint distribution of wind component changes is bivariate normal, and that the modulus of vector wind change is Rayleigh are tested by comparison with observed distributions. Statistics of the conditional bivariate normal distributions of vector wind at a future time given the vector wind at an initial time are derived. Wind changes over time periods from 1 to 5 hours, calculated from Jimsphere data, are presented. Extension of the theoretical prediction (based on rawinsonde data) of wind component change standard deviation to time periods of 1 to 5 hours falls (with a few exceptions) within the 95 percentile confidence band of the population estimate obtained from the Jimsphere sample data. The joint distributions of wind change components, conditional wind components, and 1 km vector wind shear change components are illustrated by probability ellipses at the 95 percentile level.
Assessing the Chemical Accuracy of Protein Structures via Peptide Acidity
Anderson, Janet S.; Hernández, Griselda; LeMaster, David M.
2012-01-01
Although the protein native state is a Boltzmann conformational ensemble, practical applications often require a representative model from the most populated region of that distribution. The acidity of the backbone amides, as reflected in hydrogen exchange rates, is exquisitely sensitive to the surrounding charge and dielectric volume distribution. For each of four proteins, three independently determined X-ray structures of differing crystallographic resolution were used to predict exchange for the static solvent-exposed amide hydrogens. The average correlation coefficients range from 0.74 for ubiquitin to 0.93 for Pyrococcus furiosus rubredoxin, reflecting the larger range of experimental exchange rates exhibited by the latter protein. The exchange prediction errors modestly correlate with the crystallographic resolution. MODELLER 9v6-derived homology models at ~60% sequence identity (36% identity for chymotrypsin inhibitor CI2) yielded correlation coefficients that are ~0.1 smaller than for the cognate X-ray structures. The most recently deposited NOE-based ubiquitin structure and the original NMR structure of CI2 fail to provide statistically significant predictions of hydrogen exchange. However, the more recent RECOORD refinement study of CI2 yielded predictions comparable to the X-ray and homology model-based analyses. PMID:23182463
The utility of Bayesian predictive probabilities for interim monitoring of clinical trials
Connor, Jason T.; Ayers, Gregory D; Alvarez, JoAnn
2014-01-01
Background Bayesian predictive probabilities can be used for interim monitoring of clinical trials to estimate the probability of observing a statistically significant treatment effect if the trial were to continue to its predefined maximum sample size. Purpose We explore settings in which Bayesian predictive probabilities are advantageous for interim monitoring compared to Bayesian posterior probabilities, p-values, conditional power, or group sequential methods. Results For interim analyses that address prediction hypotheses, such as futility monitoring and efficacy monitoring with lagged outcomes, only predictive probabilities properly account for the amount of data remaining to be observed in a clinical trial and have the flexibility to incorporate additional information via auxiliary variables. Limitations Computational burdens limit the feasibility of predictive probabilities in many clinical trial settings. The specification of prior distributions brings additional challenges for regulatory approval. Conclusions The use of Bayesian predictive probabilities enables the choice of logical interim stopping rules that closely align with the clinical decision making process. PMID:24872363
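A minimal numerical sketch of the idea, assuming a single-arm trial with a binary endpoint, a Beta prior, and a hypothetical success criterion (none of these specifics come from the paper): the predictive probability is the chance, averaged over the posterior-predictive distribution of the remaining data, that the final analysis will be successful.

```python
# Hedged sketch of a Bayesian predictive probability for interim monitoring.
# Assumptions (hypothetical, not from the paper): single-arm binary endpoint,
# Beta(1,1) prior, success declared if the final posterior Pr(p > p0) > 0.95.
import numpy as np
from scipy.stats import betabinom, beta

p0 = 0.30                          # null response rate
a0, b0 = 1.0, 1.0                  # Beta prior on the response probability p
n_interim, x_interim = 40, 14      # patients observed / responses at interim
n_max = 100                        # maximum sample size
n_remaining = n_max - n_interim

# Posterior at interim and posterior-predictive distribution of future responses
a, b = a0 + x_interim, b0 + n_interim - x_interim
future = betabinom(n_remaining, a, b)

pred_prob = 0.0
for y in range(n_remaining + 1):
    post_success = 1.0 - beta.cdf(p0, a + y, b + n_remaining - y)  # Pr(p > p0 | all data)
    if post_success > 0.95:
        pred_prob += future.pmf(y)

print("predictive probability of eventual success:", round(pred_prob, 3))
```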
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, Bingxiao; Awartani, Omar; O'Connor, Brendan
2016-05-02
Large charge mobilities in semi-crystalline organic semiconducting films can be obtained by mechanically aligning the material phases of the film with the loading axis. A key element is to utilize the inherent stiffness of the material for optimal or desired alignment. However, experimentally determining the moduli of semi-crystalline organic thin films for different loading directions is difficult, if not impossible, due to film thickness and material anisotropy. In this paper, we address these challenges by presenting an approach that combines a composite-mechanics stiffness orientation formulation with a Gaussian statistical distribution to directly estimate the in-plane stiffness (transverse isotropy) of aligned semi-crystalline polymer films, based on crystalline orientation distributions obtained experimentally by X-ray diffraction at different applied strains. Our predicted results indicate that the in-plane stiffness of an annealed film was initially isotropic and then evolved to transverse isotropy with increasing mechanical strain. This study underscores the significance of accounting for the crystalline orientation distributions of the film to obtain an accurate understanding and prediction of the elastic anisotropy of semi-crystalline polymer films.
Metabolic networks evolve towards states of maximum entropy production.
Unrean, Pornkamol; Srienc, Friedrich
2011-11-01
A metabolic network can be described by a set of elementary modes or pathways representing discrete metabolic states that support cell function. We have recently shown that in the most likely metabolic state the usage probability of individual elementary modes is distributed according to the Boltzmann distribution law while complying with the principle of maximum entropy production. To demonstrate that a metabolic network evolves towards such a state, we have carried out adaptive evolution experiments with Thermoanaerobacterium saccharolyticum operating with a reduced metabolic functionality based on a reduced set of elementary modes. In such a reduced metabolic network, metabolic fluxes can be conveniently computed from the measured metabolite secretion pattern. Over a time span of 300 generations, the specific growth rate of the strain continuously increased together with a continuous increase in the rate of entropy production. We show that the rate of entropy production asymptotically approaches the maximum entropy production rate predicted for the state in which the usage probability of individual elementary modes is distributed according to the Boltzmann distribution. Therefore, the outcome of evolution of a complex biological system can be predicted in highly quantitative terms using basic statistical mechanical principles. Copyright © 2011 Elsevier Inc. All rights reserved.
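Schematically, the Boltzmann distribution law referred to above assigns each elementary mode a usage probability of the familiar exponential form below; the notation is illustrative only, and the specific cost assigned to each mode in the authors' framework is not reproduced here.

```latex
% Schematic Boltzmann weighting of elementary-mode usage (illustrative notation).
p_i \;=\; \frac{e^{-\epsilon_i/\theta}}{\sum_{j} e^{-\epsilon_j/\theta}},
\qquad \epsilon_i:\ \text{mode-specific cost},\quad \theta:\ \text{scale parameter}.
```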
C-Sphere Strength-Size Scaling in a Bearing-Grade Silicon Nitride
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wereszczak, Andrew A; Jadaan, Osama M.; Kirkland, Timothy Philip
2008-01-01
A C-sphere specimen geometry was used to determine the failure strength distributions of a commercially available bearing-grade silicon nitride (Si3N4) having ball diameters of 12.7 and 25.4 mm. Strengths for both diameters were determined using the combination of failure load, C-sphere geometry, and finite element analysis, and were fitted using two-parameter Weibull distributions. Effective areas for both diameters were estimated as a function of Weibull modulus and used to explore whether the strength distributions predictably strength-scaled between the two sizes. They did not. That statistical observation suggests that the same flaw type did not limit the strength of both ball diameters, indicating a lack of material homogeneity between the two sizes. Optical fractography confirmed that: it showed there were two distinct strength-limiting flaw types in both ball diameters, that one flaw type was always associated with lower-strength specimens, and that a significantly higher fraction of the 25.4-mm-diameter C-sphere specimens failed from it. Predictable strength-size scaling would therefore not result because these flaw types were not homogeneously distributed and sampled in the two C-sphere geometries.
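For context, the strength-size-scaling test applied here rests on the standard two-parameter Weibull relations for area-sampled flaws (textbook ceramics results, not reproduced from the report):

```latex
% Two-parameter Weibull failure probability and effective-area strength scaling
% (standard relations, shown only for context).
P_f(\sigma) = 1 - \exp\!\left[-\frac{A_e}{A_0}\left(\frac{\sigma}{\sigma_0}\right)^{m}\right],
\qquad
\frac{\sigma_1}{\sigma_2} = \left(\frac{A_{e,2}}{A_{e,1}}\right)^{1/m},
```

where $m$ is the Weibull modulus, $\sigma_0$ the characteristic strength referenced to area $A_0$, and $A_e$ the effective area; strengths scale predictably between sizes only when the same flaw population is sampled in both, which is the condition that fails here.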
On measures of association among genetic variables
Gianola, Daniel; Manfredi, Eduardo; Simianer, Henner
2012-01-01
Summary Systems involving many variables are important in population and quantitative genetics, for example, in multi-trait prediction of breeding values and in exploration of multi-locus associations. We studied departures of the joint distribution of sets of genetic variables from independence. New measures of association based on notions of statistical distance between distributions are presented. These are more general than correlations, which are pairwise measures, and lack a clear interpretation beyond the bivariate normal distribution. Our measures are based on logarithmic (Kullback-Leibler) and on relative ‘distances’ between distributions. Indexes of association are developed and illustrated for quantitative genetics settings in which the joint distribution of the variables is either multivariate normal or multivariate-t, and we show how the indexes can be used to study linkage disequilibrium in a two-locus system with multiple alleles and present applications to systems of correlated beta distributions. Two multivariate beta and multivariate beta-binomial processes are examined, and new distributions are introduced: the GMS-Sarmanov multivariate beta and its beta-binomial counterpart. PMID:22742500
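A hedged illustration of the type of distance-based measure described above is the Kullback-Leibler divergence between a joint density and the product of its marginals (the generic definition; the paper's specific indexes are built from such quantities but are not reproduced here):

```latex
% Kullback-Leibler divergence of a joint density from the product of its
% marginals, a measure of departure from independence (generic definition).
D_{\mathrm{KL}}\!\left(p \,\big\|\, \textstyle\prod_i p_i\right)
= \int p(x_1,\dots,x_k)\,
  \log \frac{p(x_1,\dots,x_k)}{\prod_{i=1}^{k} p_i(x_i)} \; dx_1 \cdots dx_k \;\ge\; 0,
```

with equality if and only if the variables are mutually independent.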
Statistical mechanics in the context of special relativity.
Kaniadakis, G
2002-11-01
In Ref. [Physica A 296, 405 (2001)], starting from the one-parameter deformation of the exponential function $\exp_{\kappa}(x) = \left(\sqrt{1+\kappa^{2}x^{2}} + \kappa x\right)^{1/\kappa}$, a statistical mechanics has been constructed which reduces to the ordinary Boltzmann-Gibbs statistical mechanics as the deformation parameter $\kappa$ approaches zero. The distribution $f = \exp_{\kappa}(-\beta E + \beta\mu)$ obtained within this statistical mechanics shows a power-law tail and depends on the unspecified parameter $\beta$, which contains all the information about the temperature of the system. On the other hand, the entropic form $S_{\kappa} = \int d^{3}p\,\left(c_{\kappa} f^{1+\kappa} + c_{-\kappa} f^{1-\kappa}\right)$, which after maximization produces the distribution $f$ and reduces to the standard Boltzmann-Shannon entropy $S_{0}$ as $\kappa \to 0$, contains the coefficient $c_{\kappa}$ whose expression involves, besides the Boltzmann constant, another unspecified parameter $\alpha$. In the present effort we show that $S_{\kappa}$ is the unique existing entropy obtained by a continuous deformation of $S_{0}$ that preserves unaltered its fundamental properties of concavity, additivity, and extensivity. These properties of $S_{\kappa}$ permit the unequivocal determination of the values of the above-mentioned parameters $\beta$ and $\alpha$. Subsequently, we explain the origin of the deformation mechanism introduced by $\kappa$ and show that this deformation emerges naturally within Einstein's special relativity. Furthermore, we extend the theory in order to treat statistical systems in a time-dependent and relativistic context. We then show that it is possible to determine, in a self-consistent scheme within special relativity, the value of the free parameter $\kappa$, which turns out to depend on the light speed $c$ and reduces to zero as $c \to \infty$, recovering in this way the ordinary statistical mechanics and thermodynamics. The statistical mechanics presented here does not contain free parameters, preserves unaltered the mathematical and epistemological structure of ordinary statistical mechanics, and is suitable for describing a very large class of experimentally observed phenomena in low- and high-energy physics and in the natural, economic, and social sciences. Finally, in order to test the correctness and predictability of the theory, as a working example we consider the cosmic ray spectrum, which spans 13 decades in energy and 33 decades in flux, finding high-quality agreement between our predictions and the observed data.
Thompson, Ronald E.; Hoffman, Scott A.
2006-01-01
A suite of 28 streamflow statistics, ranging from extreme low to high flows, was computed for 17 continuous-record streamflow-gaging stations and predicted for 20 partial-record stations in Monroe County and contiguous counties in northeastern Pennsylvania. The predicted statistics for the partial-record stations were based on regression analyses relating intermittent flow measurements made at the partial-record stations to concurrent daily mean flows at continuous-record index stations during base-flow conditions. The same statistics also were predicted for 134 ungaged stream locations in Monroe County on the basis of regression analyses relating the statistics to GIS-determined basin characteristics for the continuous-record station drainage areas. The regression methodology used to estimate the statistics was originally developed for estimating low-flow frequencies. This study and a companion study found that the methodology also has potential application for predicting intermediate- and high-flow statistics. The statistics included mean monthly flows, mean annual flow, 7-day low flows for three recurrence intervals, nine flow durations, mean annual base flow, and annual mean base flows for two recurrence intervals. Low standard errors of prediction and high coefficients of determination (R2) indicated good results in using the regression equations to predict the statistics. Regression equations for the larger flow statistics tended to have lower standard errors of prediction and higher coefficients of determination (R2) than equations for the smaller flow statistics. The report discusses the methodologies used in determining the statistics and the limitations of the statistics and of the equations used to predict them. Caution is warranted when using the predicted statistics for small drainage areas. Study results constitute input needed by water-resource managers in Monroe County for planning purposes and evaluation of water-resources availability.
Efficient statistical mapping of avian count data
Royle, J. Andrew; Wikle, C.K.
2005-01-01
We develop a spatial modeling framework for count data that is efficient to implement in high-dimensional prediction problems. We consider spectral parameterizations for the spatially varying mean of a Poisson model. The spectral parameterization of the spatial process is very computationally efficient, enabling effective estimation and prediction in large problems using Markov chain Monte Carlo techniques. We apply this model to creating avian relative abundance maps from North American Breeding Bird Survey (BBS) data. Variation in the ability of observers to count birds is modeled as spatially independent noise, resulting in over-dispersion relative to the Poisson assumption. This approach represents an improvement over existing approaches used for spatial modeling of BBS data, which are either inefficient for continental-scale modeling and prediction or fail to accommodate important distributional features of count data, thus leading to an inaccurate accounting of prediction uncertainty.
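A schematic of the spectrally parameterized Poisson mean, under illustrative notation that may differ from the authors' exact formulation:

```latex
% Schematic spectral parameterization of a spatial Poisson mean (illustrative).
y(s_i) \sim \mathrm{Poisson}\!\left(\lambda(s_i)\right),
\qquad
\log \lambda(s_i) = \mathbf{x}(s_i)^{\top}\boldsymbol{\beta}
  \;+\; \sum_{k=1}^{K} a_k\,\phi_k(s_i) \;+\; \eta_i,
```

where $\phi_k$ are Fourier (spectral) basis functions whose coefficients $a_k$ have a prior covariance that is (approximately) diagonal in the spectral domain, which is what keeps the MCMC updates cheap, and $\eta_i$ is spatially independent observer noise producing over-dispersion.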
Cooke, David J; Michie, Christine
2010-08-01
Knowledge of group tendencies may not assist accurate predictions in the individual case. This has importance for forensic decision making and for the assessment tools routinely applied in forensic evaluations. In this article, we applied Monte Carlo methods to examine diagnostic agreement with different levels of inter-rater agreement given the distributional characteristics of PCL-R scores. Diagnostic agreement and score agreement were substantially less than expected. In addition, we examined the confidence intervals associated with individual predictions of violent recidivism. On the basis of empirical findings, statistical theory, and logic, we conclude that predictions of future offending cannot be achieved in the individual case with any degree of confidence. We discuss the problems identified in relation to the PCL-R in terms of the broader relevance to all instruments used in forensic decision making.
A compression algorithm for the combination of PDF sets.
Carrazza, Stefano; Latorre, José I; Rojo, Juan; Watt, Graeme
The current PDF4LHC recommendation to estimate uncertainties due to parton distribution functions (PDFs) in theoretical predictions for LHC processes involves the combination of separate predictions computed using PDF sets from different groups, each of which comprises a relatively large number of either Hessian eigenvectors or Monte Carlo (MC) replicas. While many fixed-order and parton shower programs allow the evaluation of PDF uncertainties for a single PDF set at no additional CPU cost, this feature is not universal, and, moreover, the a posteriori combination of the predictions using at least three different PDF sets is still required. In this work, we present a strategy for the statistical combination of individual PDF sets, based on the MC representation of Hessian sets, followed by a compression algorithm for the reduction of the number of MC replicas. We illustrate our strategy with the combination and compression of the recent NNPDF3.0, CT14 and MMHT14 NNLO PDF sets. The resulting compressed Monte Carlo PDF sets are validated at the level of parton luminosities and LHC inclusive cross sections and differential distributions. We determine that around 100 replicas provide an adequate representation of the probability distribution for the original combined PDF set, suitable for general applications to LHC phenomenology.
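As a toy illustration of the compression idea, and not the actual algorithm of the paper (which minimizes a richer set of statistical estimators with a genetic algorithm), one can search for a subset of replicas whose low-order moments track those of the full ensemble:

```python
# Toy sketch of compressing a Monte Carlo ensemble: pick a subset of replicas
# whose mean and standard deviation stay close to the full set.  This is only an
# illustration of the idea; the paper's algorithm uses many more estimators and
# a genetic minimizer, and the numbers below are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
full = rng.normal(size=(900, 50))       # hypothetical: 900 replicas x 50 x-points

def discrepancy(subset, full):
    """Squared difference of mean and std profiles between subset and full set."""
    return (np.sum((subset.mean(0) - full.mean(0)) ** 2)
            + np.sum((subset.std(0) - full.std(0)) ** 2))

target = 100
idx = list(rng.choice(len(full), size=target, replace=False))
for _ in range(2000):                   # simple stochastic swap search
    trial = idx.copy()
    new = int(rng.integers(len(full)))
    if new in trial:
        continue
    trial[int(rng.integers(target))] = new
    if discrepancy(full[trial], full) < discrepancy(full[idx], full):
        idx = trial

print("compressed to", target, "replicas, final discrepancy:",
      round(discrepancy(full[idx], full), 4))
```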
NASA Astrophysics Data System (ADS)
Shemer, L.; Sergeeva, A.
2009-12-01
The statistics of a random water wave field determine the probability of appearance of extremely high (freak) waves. This probability is strongly related to the spectral characteristics of the wave field. Laboratory investigation of the spatial variation of random wave-field statistics for various initial conditions is thus of substantial practical importance. Unidirectional nonlinear random wave groups are investigated experimentally in the 300 m long Large Wave Channel (GWK) in Hannover, Germany, the biggest facility of its kind in Europe. Numerous realizations of a wave field with the prescribed frequency power spectrum, yet randomly distributed initial phases of each harmonic, were generated by a computer-controlled piston-type wavemaker. Several initial spectral shapes with identical dominant wave length but different width were considered. For each spectral shape, the total duration of sampling over all realizations was long enough to yield a sufficient sample size for reliable statistics. Throughout all experiments, an effort was made to retain the characteristic wave height and thus the degree of nonlinearity of the wave field. The spatial evolution of numerous statistical wave-field parameters (skewness, kurtosis and probability distributions) is studied using about 25 wave gauges distributed along the tank. It is found that, depending on the initial spectral shape, the frequency spectrum of the wave field may undergo significant modification in the course of its evolution along the tank; the values of all statistical wave parameters are strongly related to the local spectral width. A sample of the measured wave height probability functions (scaled by the variance of the surface elevation) is plotted in Fig. 1 for the initially narrow rectangular spectrum. The results in Fig. 1 resemble findings obtained in [1] for an initial Gaussian spectral shape. The probability of large waves notably surpasses that predicted by the Rayleigh distribution and is highest at a distance of about 100 m. Acknowledgement: This study was carried out in the framework of the EC-supported project "Transnational access to large-scale tests in the Large Wave Channel (GWK) of Forschungszentrum Küste" (Contract HYDRALAB III - No. 022441). [1] L. Shemer and A. Sergeeva, J. Geophys. Res. Oceans 114, C01015 (2009). Figure 1. Variation along the tank of the measured wave height distribution for the rectangular initial spectral shape; carrier wave period T0 = 1.5 s.
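For reference, the Rayleigh distribution against which the measured exceedance probabilities are compared is the standard narrow-band linear result (not taken from the abstract itself):

```latex
% Rayleigh exceedance probability for wave heights in a narrow-band linear
% random sea (standard reference distribution).
P(H > h) \;=\; \exp\!\left(-\frac{h^{2}}{8\,m_{0}}\right),
\qquad m_{0} = \text{variance of the surface elevation},
```

so departures above this curve, as reported near the 100 m mark, indicate an excess of large waves relative to linear theory.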
Numerical weather prediction model tuning via ensemble prediction system
NASA Astrophysics Data System (ADS)
Jarvinen, H.; Laine, M.; Ollinaho, P.; Solonen, A.; Haario, H.
2011-12-01
This paper discusses a novel approach to tuning the predictive skill of numerical weather prediction (NWP) models. NWP models contain tunable parameters which appear in the parameterization schemes of sub-grid-scale physical processes. Currently, numerical values of these parameters are specified manually. In a recent dual manuscript (QJRMS, revised) we developed a new concept and method for on-line estimation of the NWP model parameters. The EPPES ("Ensemble prediction and parameter estimation system") method requires only minimal changes to the existing operational ensemble prediction infrastructure and appears very cost-effective because practically no new computations are introduced. The approach provides an algorithmic decision-making tool for model parameter optimization in operational NWP. In EPPES, statistical inference about the NWP model tunable parameters is made by (i) generating each member of the ensemble of predictions using different model parameter values drawn from a proposal distribution, and (ii) feeding back the relative merits of the parameter values to the proposal distribution, based on evaluation of a suitable likelihood function against verifying observations. In the presentation, the method is first illustrated in low-order numerical tests using a stochastic version of the Lorenz-95 model, which effectively emulates the principal features of ensemble prediction systems. The EPPES method correctly detects the unknown and wrongly specified parameter values and leads to improved forecast skill. Second, results with an ensemble prediction system based on an atmospheric general circulation model show that the NWP model tuning capacity of EPPES scales up to realistic models and ensemble prediction systems. Finally, a global top-end NWP model tuning exercise with preliminary results is presented.
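A minimal sketch of the ensemble-based parameter update idea, with an invented scalar verification score and likelihood; the actual EPPES algorithm operates on the operational ensemble with a properly specified likelihood, so everything below is illustrative only.

```python
# Hedged sketch of an EPPES-like update: draw parameter values from a Gaussian
# proposal, score each ensemble member against verifying observations, and refit
# the proposal from likelihood-weighted draws.  The score and constants are
# invented for illustration.
import numpy as np

rng = np.random.default_rng(2)
true_param = 1.7

def verification_score(theta):
    """Hypothetical forecast error for a member run with parameter theta."""
    return (theta - true_param) ** 2 + 0.1 * rng.normal() ** 2

mean, var = 0.0, 4.0                        # initial proposal distribution
for cycle in range(20):                     # one cycle per forecast window
    draws = rng.normal(mean, np.sqrt(var), size=50)
    scores = np.array([verification_score(t) for t in draws])
    weights = np.exp(-0.5 * scores)         # simple Gaussian likelihood weights
    weights /= weights.sum()
    mean = np.sum(weights * draws)          # likelihood-weighted refit
    var = np.sum(weights * (draws - mean) ** 2) + 1e-6

print("estimated parameter:", round(mean, 2), "(true value:", true_param, ")")
```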
Spatial Distribution of Bed Particles in Natural Boulder-Bed Streams
NASA Astrophysics Data System (ADS)
Clancy, K. F.; Prestegaard, K. L.
2001-12-01
The Wolman pebble count is used to obtain the size distribution of bed particles in natural streams. Statistics such as the median particle size (D50) are used in resistance calculations. Additional information, such as bed particle heterogeneity, may also be obtained from the particle distribution and used to predict sediment transport rates (Hey, 1979; Ferguson, Prestegaard and Ashworth, 1989). Boulder-bed streams have an extremely wide particle-size distribution, ranging from sand-sized particles to particles larger than 0.5 m. A study of a natural boulder-bed reach demonstrated that the spatial distribution of the particles is a significant factor in predicting sediment transport and stream bed and bank stability. Further experiments were performed to test the limits of the spatial distribution's effect on sediment transport. Three stream reaches 40 m in length were selected with similar hydrologic characteristics and spatial distributions but varying average particle sizes. We used a 0.5 by 0.5 m grid and measured four particles within each grid cell. Digital photographs of the streambed were taken in each grid cell. The photographs were examined using image analysis software to obtain the size and position of the largest particles (D84) within the reach's particle distribution. Cross sections, topography and stream depth were surveyed. Velocities and velocity profiles were measured and recorded. With these data and additional surveys of bankfull floods, we tested the significance of the spatial distributions as average particle size decreases. The spatial distribution of streambed particles may provide information about stream valley formation, bank stability, sediment transport, and the growth rate of riparian vegetation.
Bird-landscape relations in the Chihuahuan Desert: Coping with uncertainties about predictive models
Gutzwiller, K.J.; Barrow, W.C.
2001-01-01
During the springs of 1995-1997, we studied birds and landscapes in the Chihuahuan Desert along part of the Texas-Mexico border. Our objectives were to assess bird-landscape relations and their interannual consistency and to identify ways to cope with associated uncertainties that undermine confidence in using such relations in conservation decision processes. Bird distributions were often significantly associated with landscape features, and many bird-landscape models were valid and useful for predictive purposes. Differences in early spring rainfall appeared to influence bird abundance, but there was no evidence that annual differences in bird abundance affected model consistency. Model consistency for richness (42%) was higher than mean model consistency for 26 focal species (mean 30%, range 0-67%), suggesting that relations involving individual species are, on average, more subject to factors that cause variation than are richness-landscape relations. Consistency of bird-landscape relations may be influenced by such factors as plant succession, exotic species invasion, bird species' tolerances for environmental variation, habitat occupancy patterns, and variation in food density or weather. The low model consistency that we observed for most species indicates the high variation in bird-landscape relations that managers and other decision makers may encounter. The uncertainty of interannual variation in bird-landscape relations can be reduced by using projections of bird distributions from different annual models to determine the likely range of temporal and spatial variation in a species' distribution. Stochastic simulation models can be used to incorporate the uncertainty of random environmental variation into predictions of bird distributions based on bird-landscape relations and to provide probabilistic projections with which managers can weigh the costs and benefits of various decisions. Uncertainty about the true structure of bird-landscape relations (structural uncertainty) can be reduced by ensuring that models meet important statistical assumptions, designing studies with sufficient statistical power, validating the predictive ability of models, and improving model accuracy through continued field sampling and model fitting. Uncertainty associated with sampling variation (partial observability) can be reduced by ensuring that sample sizes are large enough to provide precise estimates of both bird and landscape parameters. By decreasing the uncertainty due to partial observability, managers will improve their ability to reduce structural uncertainty.
The Threshold Bias Model: A Mathematical Model for the Nomothetic Approach of Suicide
Folly, Walter Sydney Dutra
2011-01-01
Background Comparative and predictive analyses of suicide data from different countries are difficult to perform due to varying approaches and the lack of comparative parameters. Methodology/Principal Findings A simple model (the Threshold Bias Model) was tested for comparative and predictive analyses of suicide rates by age. The model comprises a six-parameter distribution that was applied to the USA suicide rates by age for the years 2001 and 2002. Subsequently, linear extrapolations of the parameter values obtained for these years were performed in order to estimate the values corresponding to the year 2003. The calculated distributions agreed reasonably well with the aggregate data. The model was also used to determine the age above which suicide rates become statistically observable in the USA, Brazil and Sri Lanka. Conclusions/Significance The Threshold Bias Model has considerable potential applications in demographic studies of suicide. Moreover, since the model can be used to predict the evolution of suicide rates based on information extracted from past data, it will be of great interest to suicidologists and other researchers in the field of mental health. PMID:21909431
Rainfall Induced Landslides in Puerto Rico (Invited)
NASA Astrophysics Data System (ADS)
Lepore, C.; Kamal, S.; Arnone, E.; Noto, V.; Shanahan, P.; Bras, R. L.
2009-12-01
Landslides are a major geologic hazard in the United States, typically triggered by rainfall, earthquakes, volcanoes and human activity. Rainfall-induced landslides are the most common type on the island of Puerto Rico, with one or two large events per year. We performed an island-wide determination of static landslide susceptibility and hazard assessment as well as dynamic modeling of rainfall-induced shallow landslides in a particular hydrologic basin. Based on statistical analysis of past landslides, we determined that reliable prediction of the susceptibility to landslides is strongly dependent on the resolution of the digital elevation model (DEM) employed and on the reliability of the rainfall data. A distributed hydrology model capable of simulating landslides, tRIBS-VEGGIE, has been implemented for the first time in a humid tropical environment like Puerto Rico. The Mameyes basin, located in the Luquillo Experimental Forest in Puerto Rico, was selected for modeling based on the availability of soil, vegetation, topographical, meteorological and historic landslide data. Application of the model yields a temporal and spatial distribution of predicted rainfall-induced landslides, which is used to predict the dynamic susceptibility of the basin to landslides.
The threshold bias model: a mathematical model for the nomothetic approach of suicide.
Folly, Walter Sydney Dutra
2011-01-01
Comparative and predictive analyses of suicide data from different countries are difficult to perform due to varying approaches and the lack of comparative parameters. A simple model (the Threshold Bias Model) was tested for comparative and predictive analyses of suicide rates by age. The model comprises a six-parameter distribution that was applied to the USA suicide rates by age for the years 2001 and 2002. Subsequently, linear extrapolations of the parameter values obtained for these years were performed in order to estimate the values corresponding to the year 2003. The calculated distributions agreed reasonably well with the aggregate data. The model was also used to determine the age above which suicide rates become statistically observable in the USA, Brazil and Sri Lanka. The Threshold Bias Model has considerable potential applications in demographic studies of suicide. Moreover, since the model can be used to predict the evolution of suicide rates based on information extracted from past data, it will be of great interest to suicidologists and other researchers in the field of mental health.
Sun, Jielun; Chen, S.; Rostam-Abadi, M.; Rood, M.J.
1998-01-01
A new analytical pore size distribution (PSD) model was developed to predict the CH4 adsorption (storage) capacity of microporous adsorbent carbon. The model is based on a 3-D adsorption isotherm equation derived from statistical mechanical principles. Least-squares error minimization is used to solve for the PSD without any pre-assumed distribution function. In comparison with several well-accepted analytical methods from the literature, this 3-D model offers a relatively realistic PSD description for select reference materials, including activated carbon fibers. N2 and CH4 adsorption data were correlated using the 3-D model for the commercial carbons BPL and AX-21. Predicted CH4 adsorption isotherms, based on N2 adsorption at 77 K, were in reasonable agreement with the experimental CH4 isotherms. Modeling results indicate that not all pores contribute the same percentage Vm/Vs to CH4 storage due to different adsorbed CH4 densities. Pores near 8-9 Å show a higher Vm/Vs on an equivalent volume basis than do larger pores.
Technical note: Combining quantile forecasts and predictive distributions of streamflows
NASA Astrophysics Data System (ADS)
Bogner, Konrad; Liechti, Katharina; Zappa, Massimiliano
2017-11-01
The enhanced availability of many different hydro-meteorological modelling and forecasting systems raises the issue of how to optimally combine this wealth of information. In particular, the use of deterministic and probabilistic forecasts with sometimes widely divergent predicted future streamflow values makes it even more complicated for decision makers to sift out the relevant information. In this study, multiple streamflow forecasts will be aggregated based on several different predictive distributions and quantile forecasts. For this combination, the Bayesian model averaging (BMA) approach, the non-homogeneous Gaussian regression (NGR), also known as the ensemble model output statistics (EMOS) technique, and a novel method called Beta-transformed linear pooling (BLP) will be applied. With the help of the quantile score (QS) and the continuous ranked probability score (CRPS), the combination results for the Sihl River in Switzerland, with about 5 years of forecast data, will be compared, and the differences between the raw and optimally combined forecasts will be highlighted. The results demonstrate the importance of applying proper forecast combination methods for decision makers in the field of flood and water resource management.
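For readers unfamiliar with the CRPS used in the comparison, a common sample-based estimator computed directly from ensemble members is sketched below; the streamflow values and observation are invented for illustration.

```python
# Sample-based estimate of the continuous ranked probability score (CRPS),
# CRPS = E|X - y| - 0.5 E|X - X'|, computed from ensemble members.
# The ensemble values and observation below are hypothetical.
import numpy as np

def crps_ensemble(members, obs):
    """CRPS estimated from an ensemble of forecast values against one observation."""
    members = np.asarray(members, dtype=float)
    term1 = np.mean(np.abs(members - obs))
    term2 = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return term1 - term2

ensemble = np.array([120.0, 135.0, 128.0, 150.0, 142.0, 133.0])  # streamflow, m^3/s
print("CRPS:", round(crps_ensemble(ensemble, obs=138.0), 2))
```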
Fisher, Moria E; Huang, Felix C; Wright, Zachary A; Patton, James L
2014-01-01
Manipulation of error feedback has been of great interest in recent studies of motor control and rehabilitation. Typically, motor adaptation is shown as a change in performance with a single scalar metric for each trial, yet such an approach might overlook details about how error evolves through the movement. We believe that statistical distributions of movement error through the extent of the trajectory can reveal unique patterns of adaptation and possibly reveal clues to how the motor system processes information about error. This paper describes different possible ordinate domains, focusing on representations in time and state-space, used to quantify reaching errors. We hypothesized that the domain with the lowest amount of variability would lead to a predictive model of reaching error with the highest accuracy. Here we show that errors represented in the time domain exhibit the least variance and allow for the most accurate predictive model of reaching errors. These predictive models will give rise to more specialized methods of robotic feedback and improve upon previous techniques of error augmentation.
Veazey, Lindsay M; Franklin, Erik C; Kelley, Christopher; Rooney, John; Frazer, L Neil; Toonen, Robert J
2016-01-01
Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30-180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta ("presence") threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai'i.
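One widely used form of rare-events correction is a prior correction of the logistic intercept using an assumed population presence fraction; the sketch below illustrates that idea on simulated data and is not the authors' exact procedure (their correction, covariates and data are not reproduced here).

```python
# Hedged sketch of a rare-events (prior) correction to a logistic regression
# intercept.  tau is an assumed population presence fraction; all data and
# numbers are hypothetical, and this is not the paper's exact procedure.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(5000, 4))              # e.g. depth, rugosity, slope, wave height (scaled)
logits = -4.0 + X @ np.array([0.8, 0.5, -0.3, 0.4])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits)))   # sparse presences

model = LogisticRegression(max_iter=1000).fit(X, y)
y_bar = y.mean()                            # presence fraction in the sample
tau = 0.02                                  # assumed presence fraction in the population
corrected_intercept = model.intercept_[0] - np.log(((1 - tau) / tau) * (y_bar / (1 - y_bar)))

print("raw intercept:", round(model.intercept_[0], 3),
      " corrected intercept:", round(corrected_intercept, 3))
```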
Risk prediction models of breast cancer: a systematic review of model performances.
Anothaisintawee, Thunyarat; Teerawattananon, Yot; Wiratkapun, Chollathip; Kasamesup, Vijj; Thakkinstian, Ammarin
2012-05-01
An increasing number of risk prediction models have been developed for estimating the risk of breast cancer in individual women. However, the performance of those models is questionable. We therefore conducted a study with the aim of systematically reviewing previous risk prediction models. The results of this review help to identify the most reliable model and indicate the strengths and weaknesses of each model to guide future model development. We searched MEDLINE (PubMed) from 1949 and EMBASE (Ovid) from 1974 until October 2010. Observational studies that constructed models using regression methods were selected. Information about model development and performance was extracted. Twenty-five out of 453 studies were eligible. Of these, 18 developed prediction models and 7 validated existing prediction models. Up to 13 variables were included in the models, and sample sizes for each study ranged from 550 to 2,404,636. Internal validation was performed in four models, while five models had external validation. The Gail model and the Rosner and Colditz model were the significant models that were subsequently modified by other scholars. Calibration performance of most models was fair to good (expected/observed ratio: 0.87-1.12), but discriminatory accuracy was poor to fair both in internal validation (concordance statistic: 0.53-0.66) and in external validation (concordance statistic: 0.56-0.63). Most models yielded relatively poor discrimination in both internal and external validation. This poor discriminatory accuracy of existing models might be due to a lack of knowledge about risk factors, heterogeneous subtypes of breast cancer, and different distributions of risk factors across populations. In addition, the concordance statistic itself is insensitive to improvements in discrimination. Therefore, new methods such as the net reclassification index should be considered to evaluate improvements in the performance of newly developed models.
Predicting weak lensing statistics from halo mass reconstructions - Final Paper
DOE Office of Scientific and Technical Information (OSTI.GOV)
Everett, Spencer
2015-08-20
As dark matter does not absorb or emit light, its distribution in the universe must be inferred through indirect effects such as the gravitational lensing of distant galaxies. While most sources are only weakly lensed, the systematic alignment of background galaxies around a foreground lens can constrain the mass of the lens, which is largely in the form of dark matter. In this paper, I have implemented a framework to reconstruct all of the mass along lines of sight using a best-case dark matter halo model in which the halo mass is known. This framework is then used to make predictions of the weak lensing of 3,240 generated source galaxies through a 324 arcmin² field of the Millennium Simulation. The lensed source ellipticities are characterized by the ellipticity-ellipticity and galaxy-mass correlation functions and compared to the same statistics for the intrinsic and ray-traced ellipticities. In the ellipticity-ellipticity correlation function, I find that the framework systematically underpredicts the shear power by an average factor of 2.2 and fails to capture correlation from dark matter structure at scales larger than 1 arcminute. The model-predicted galaxy-mass correlation function is in agreement with the ray-traced statistic from scales of 0.2 to 0.7 arcminutes, but systematically underpredicts shear power at scales larger than 0.7 arcminutes by an average factor of 1.2. Optimization of the framework code has reduced the mean CPU time per lensing prediction by 70% to 24 ± 5 ms. Physical and computational shortcomings of the framework are discussed, as well as potential improvements for upcoming work.
Olson, Andrew; Halloran, Elizabeth; Romani, Cristina
2015-12-01
We present three jargonaphasic patients who made phonological errors in naming, repetition and reading. We analyse target/response overlap using statistical models to answer three questions: 1) Is there a single phonological source for errors or two sources, one for target-related errors and a separate source for abstruse errors? 2) Can correct responses be predicted by the same distribution used to predict errors or do they show a completion boost (CB)? 3) Is non-lexical and lexical information summed during reading and repetition? The answers were clear. 1) Abstruse errors did not require a separate distribution created by failure to access word forms. Abstruse and target-related errors were the endpoints of a single overlap distribution. 2) Correct responses required a special factor, e.g., a CB or lexical/phonological feedback, to preserve their integrity. 3) Reading and repetition required separate lexical and non-lexical contributions that were combined at output. Copyright © 2015 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Schemm, J. E.; Long, L.; Baxter, S.
2013-12-01
Evaluation of the NCEP CFSv2 45-day Forecasts for Predictability of Intraseasonal Tropical Storm Activities. Predictability of intraseasonal tropical storm (TS) activities is assessed using the 1999-2010 CFSv2 hindcast suite. Weekly TS activities in the CFSv2 45-day forecasts were determined using the TS detection and tracking method devised by Camargo and Zebiak (2002). The forecast periods are divided into weekly intervals from Week 1 through Week 6, as well as the 30-day mean. The TS activities in those intervals are compared to the observed activities based on the NHC HURDAT and JTWC Best Track datasets. The CFSv2 45-day hindcast suite consists of forecast runs initialized at 00, 06, 12 and 18Z every day during the 1999-2010 period. For the predictability evaluation, forecast TS activities are analyzed based on 20-member ensemble forecasts comprised of 45-day runs made during the most recent 5 days prior to the verification period. The forecast TS activities are evaluated in terms of the number of storms, genesis locations and storm tracks during the weekly periods. The CFSv2 forecasts are shown to have a fair level of skill in predicting the number of storms over the Atlantic Basin, with temporal correlation scores ranging from 0.73 for Week 1 forecasts to 0.63 for Week 6, and average RMS errors ranging from 0.86 to 1.07 during the 1999-2010 hurricane seasons. Also, the forecast track density distribution and false alarm statistics are compiled using the hindcast analyses. In real-time applications of the intraseasonal TS activity forecasts, the climatological TS forecast statistics will be used to make model bias corrections in terms of the storm counts, track distribution and removal of false alarms. An operational implementation of the weekly TS activity prediction is planned for early 2014 to provide an objective input for the CPC's Global Tropical Hazards Outlooks.
Hope, Andrew G.; Waltari, Eric; Malaney, Jason L.; Payer, David C.; Cook, J.A.; Talbot, Sandra L.
2015-01-01
As ancestral biodiversity responded dynamically to late-Quaternary climate changes, so are extant organisms responding to the warming trajectory of the Anthropocene. Ecological predictive modeling, statistical hypothesis tests, and genetic signatures of demographic change can provide a powerful integrated toolset for investigating these biodiversity responses to climate change, and relative resiliency across different communities. Within the biotic province of Beringia, we analyzed specimen localities and DNA sequences from 28 mammal species associated with boreal forest and Arctic tundra biomes to assess both historical distributional and evolutionary responses and then forecasted future changes based on statistical assessments of past and present trajectories, and quantified distributional and demographic changes in relation to major management regions within the study area. We addressed three sets of hypotheses associated with aspects of methodological, biological, and socio-political importance by asking (1) what is the consistency among implications of predicted changes based on the results of both ecological and evolutionary analyses; (2) what are the ecological and evolutionary implications of climate change considering either total regional diversity or distinct communities associated with major biomes; and (3) are there differences in management implications across regions? Our results indicate increasing Arctic richness through time that highlights a potential state shift across the Arctic landscape. However, within distinct ecological communities, we found a predicted decline in the range and effective population size of tundra species into several discrete refugial areas. Consistency in results based on a combination of both ecological and evolutionary approaches demonstrates increased statistical confidence by applying cross-discipline comparative analyses to conservation of biodiversity, particularly considering variable management regimes that seek to balance sustainable ecosystems with other anthropogenic values. Refugial areas for cold-adapted taxa appear to be persistent across both warm and cold climate phases and although fragmented, constitute vital regions for persistence of Arctic mammals.
Higgs and superparticle mass predictions from the landscape
NASA Astrophysics Data System (ADS)
Baer, Howard; Barger, Vernon; Serce, Hasan; Sinha, Kuver
2018-03-01
Predictions for the scale of SUSY breaking from the string landscape go back at least a decade to the work of Denef and Douglas on the statistics of flux vacua. The assumption that an assortment of SUSY breaking F and D terms are present in the hidden sector, and that their values are uniformly distributed in the landscape of D = 4, N = 1 effective supergravity models, leads to the expectation that the landscape pulls towards large values of the soft terms, favored by a power-law behavior $P(m_{\rm soft}) \sim m_{\rm soft}^{\,n}$. On the other hand, similar to Weinberg's prediction of the cosmological constant, one can assume an anthropic selection of weak scales not too far from the measured value characterized by $m_{W,Z,h} \sim 100$ GeV. Working within a fertile patch of gravity-mediated low-energy effective theories where the superpotential $\mu$ term is $\ll m_{3/2}$, as occurs in models such as radiative breaking of Peccei-Quinn symmetry, this biases statistical distributions on the landscape by a cutoff on the parameter $\Delta_{\rm EW}$, which measures fine-tuning in the $m_Z$-$\mu$ mass relation. The combined effect of statistical and anthropic pulls turns out to favor low-energy phenomenology that is more or less agnostic to UV physics. While a uniform selection of soft terms ($n = 0$) produces too low a value for $m_h$, taking $n = 1$ and 2 most probably yields $m_h \sim 125$ GeV for negative trilinear terms. For $n \geq 1$, there is a pull towards split generations with $m_{\tilde{q},\tilde{\ell}}(1,2) \sim 10$-$30$ TeV whilst $m_{\tilde{t}_1} \sim 1$-$2$ TeV. The most probable gluino mass comes in at $\sim 3$-$4$ TeV, apparently beyond the reach of the HL-LHC (although the required quasi-degenerate higgsinos should still be within reach). We comment on consequences for SUSY collider and dark matter searches.
On the Origins of Suboptimality in Human Probabilistic Inference
Acerbi, Luigi; Vijayakumar, Sethu; Wolpert, Daniel M.
2014-01-01
Humans have been shown to combine noisy sensory information with previous experience (priors), in qualitative and sometimes quantitative agreement with the statistically-optimal predictions of Bayesian integration. However, when the prior distribution becomes more complex than a simple Gaussian, such as skewed or bimodal, training takes much longer and performance appears suboptimal. It is unclear whether such suboptimality arises from an imprecise internal representation of the complex prior, or from additional constraints in performing probabilistic computations on complex distributions, even when accurately represented. Here we probe the sources of suboptimality in probabilistic inference using a novel estimation task in which subjects are exposed to an explicitly provided distribution, thereby removing the need to remember the prior. Subjects had to estimate the location of a target given a noisy cue and a visual representation of the prior probability density over locations, which changed on each trial. Different classes of priors were examined (Gaussian, unimodal, bimodal). Subjects' performance was in qualitative agreement with the predictions of Bayesian Decision Theory although generally suboptimal. The degree of suboptimality was modulated by statistical features of the priors but was largely independent of the class of the prior and level of noise in the cue, suggesting that suboptimality in dealing with complex statistical features, such as bimodality, may be due to a problem of acquiring the priors rather than computing with them. We performed a factorial model comparison across a large set of Bayesian observer models to identify additional sources of noise and suboptimality. Our analysis rejects several models of stochastic behavior, including probability matching and sample-averaging strategies. Instead we show that subjects' response variability was mainly driven by a combination of a noisy estimation of the parameters of the priors, and by variability in the decision process, which we represent as a noisy or stochastic posterior. PMID:24945142
Exploring Explanations of Subglacial Bedform Sizes Using Statistical Models.
Hillier, John K; Kougioumtzoglou, Ioannis A; Stokes, Chris R; Smith, Michael J; Clark, Chris D; Spagnolo, Matteo S
2016-01-01
Sediments beneath modern ice sheets exert a key control on their flow, but are largely inaccessible except through geophysics or boreholes. In contrast, palaeo-ice sheet beds are accessible, and typically characterised by numerous bedforms. However, the interaction between bedforms and ice flow is poorly constrained and it is not clear how bedform sizes might reflect ice flow conditions. To better understand this link we present a first exploration of a variety of statistical models to explain the size distribution of some common subglacial bedforms (i.e., drumlins, ribbed moraine, MSGL). By considering a range of models, constructed to reflect key aspects of the physical processes, it is possible to infer that the size distributions are most effectively explained when the dynamics of ice-water-sediment interaction associated with bedform growth is fundamentally random. A 'stochastic instability' (SI) model, which integrates random bedform growth and shrinking through time with exponential growth, is preferred and is consistent with other observations of palaeo-bedforms and geophysical surveys of active ice sheets. Furthermore, we give a proof-of-concept demonstration that our statistical approach can bridge the gap between geomorphological observations and physical models, directly linking measurable size-frequency parameters to properties of ice sheet flow (e.g., ice velocity). Moreover, statistically developing existing models as proposed allows quantitative predictions to be made about sizes, making the models testable; a first illustration of this is given for a hypothesised repeat geophysical survey of bedforms under active ice. Thus, we further demonstrate the potential of size-frequency distributions of subglacial bedforms to assist the elucidation of subglacial processes and better constrain ice sheet models.