Sample records for simple statistical model

  1. Comparing and combining process-based crop models and statistical models with some implications for climate change

    NASA Astrophysics Data System (ADS)

    Roberts, Michael J.; Braun, Noah O.; Sinclair, Thomas R.; Lobell, David B.; Schlenker, Wolfram

    2017-09-01

    We compare predictions of a simple process-based crop model (Soltani and Sinclair 2012), a simple statistical model (Schlenker and Roberts 2009), and a combination of both models to actual maize yields on a large, representative sample of farmer-managed fields in the Corn Belt region of the United States. After statistical post-model calibration, the process model (Simple Simulation Model, or SSM) predicts actual outcomes slightly better than the statistical model, but the combined model performs significantly better than either model. The SSM, statistical model and combined model all show similar relationships with precipitation, while the SSM better accounts for temporal patterns of precipitation, vapor pressure deficit and solar radiation. The statistical and combined models show a more negative impact associated with extreme heat for which the process model does not account. Due to the extreme heat effect, predicted impacts under uniform climate change scenarios are considerably more severe for the statistical and combined models than for the process-based model.

  2. New approach in the quantum statistical parton distribution

    NASA Astrophysics Data System (ADS)

    Sohaily, Sozha; Vaziri (Khamedi), Mohammad

    2017-12-01

    An attempt to find simple parton distribution functions (PDFs) based on a quantum statistical approach is presented. The PDFs described by the statistical model have very interesting physical properties which help to understand the structure of partons. The longitudinal portion of the distribution functions is given by applying the maximum entropy principle. An interesting and simple approach to determine the statistical variables exactly without fitting and fixing parameters is surveyed. Analytic expressions of the x-dependent PDFs are obtained in the whole x region [0, 1], and the computed distributions are consistent with the experimental observations. The agreement with experimental data gives a robust confirmation of the simple statistical model presented.

  3. A simple rain attenuation model for earth-space radio links operating at 10-35 GHz

    NASA Technical Reports Server (NTRS)

    Stutzman, W. L.; Yon, K. M.

    1986-01-01

    The simple attenuation model has been improved from an earlier version and now includes the effect of wave polarization. The model is for the prediction of rain attenuation statistics on earth-space communication links operating in the 10-35 GHz band. Simple calculations produce attenuation values as a function of average rain rate. These together with rain rate statistics (either measured or predicted) can be used to predict annual rain attenuation statistics. In this paper model predictions are compared to measured data from a data base of 62 experiments performed in the U.S., Europe, and Japan. Comparisons are also made to predictions from other models.

  4. Prediction of drug transport processes using simple parameters and PLS statistics. The use of ACD/logP and ACD/ChemSketch descriptors.

    PubMed

    Osterberg, T; Norinder, U

    2001-01-01

    A method of modelling and predicting biopharmaceutical properties using simple theoretically computed molecular descriptors and multivariate statistics has been investigated for several data sets related to solubility, IAM chromatography, permeability across Caco-2 cell monolayers, human intestinal perfusion, brain-blood partitioning, and P-glycoprotein ATPase activity. The molecular descriptors (e.g. molar refractivity, molar volume, index of refraction, surface tension and density) and logP were computed with ACD/ChemSketch and ACD/logP, respectively. Good statistical models were derived that permit simple computational prediction of biopharmaceutical properties. All final models derived had R(2) values ranging from 0.73 to 0.95 and Q(2) values ranging from 0.69 to 0.86. The RMSEP values for the external test sets ranged from 0.24 to 0.85 (log scale).

  5. Seasonal ENSO forecasting: Where does a simple model stand amongst other operational ENSO models?

    NASA Astrophysics Data System (ADS)

    Halide, Halmar

    2017-01-01

    We apply a simple linear multiple regression model called IndOzy for predicting ENSO up to 7 seasonal lead times. The model used five predictors of the past seasonal Niño 3.4 ENSO indices derived from chaos theory, and it was rolling-validated to give a one-step-ahead forecast. The model skill was evaluated against data from the season of May-June-July (MJJ) 2003 to November-December-January (NDJ) 2015/2016. Three skill measures (Pearson correlation, RMSE, and Euclidean distance) were used for forecast verification. The skill of this simple model was then compared to those of combined statistical and dynamical models compiled at the IRI (International Research Institute) website. It was found that the simple model was capable of producing a useful ENSO prediction only up to 3 seasonal leads, while the IRI statistical and dynamical model skills were still useful up to 4 and 6 seasonal leads, respectively. Even with its short-range seasonal prediction skill, however, the simple model still has the potential to give ENSO-derived tailored products such as probabilistic measures of precipitation and air temperature. Both meteorological conditions affect the presence of wild-land fire hot-spots in Sumatera and Kalimantan. It is suggested that, to improve its long-range skill, the simple IndOzy model needs to incorporate a nonlinear model such as an artificial neural network technique.
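
    The three verification measures named in this abstract can be sketched as follows. This is a minimal illustration using their standard textbook definitions, not the authors' code; the function names and the sample forecast and observation values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rmse(forecast, observed):
    """Root-mean-square error of a forecast series."""
    n = len(forecast)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)

def euclidean(forecast, observed):
    """Euclidean distance between forecast and observed series."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)))

# Hypothetical Nino 3.4 index values, for illustration only.
forecast = [0.5, 1.0, 1.5, 2.0]
observed = [0.4, 1.1, 1.4, 2.1]
print(pearson(forecast, observed), rmse(forecast, observed), euclidean(forecast, observed))
```

    Note that RMSE and Euclidean distance differ only by a factor of the square root of the series length, so they rank forecasts of equal length identically.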

  6. Detection of outliers in the response and explanatory variables of the simple circular regression model

    NASA Astrophysics Data System (ADS)

    Mahmood, Ehab A.; Rana, Sohel; Hussin, Abdul Ghapor; Midi, Habshah

    2016-06-01

    The circular regression model may contain one or more data points which appear to be peculiar or inconsistent with the main part of the model. This may occur due to recording errors, sudden short events, sampling under abnormal conditions, etc. The existence of these data points ("outliers") in the data set causes a lot of problems in the research results and the conclusions. Therefore, we should identify them before applying statistical analysis. In this article, we aim to propose a statistic to identify outliers in both the response and explanatory variables of the simple circular regression model. Our proposed statistic is the robust circular distance RCDxy, and it is justified by three robust measurements: proportion of detected outliers, masking and swamping rates.

  7. Statistical bias correction method applied on CMIP5 datasets over the Indian region during the summer monsoon season for climate change applications

    NASA Astrophysics Data System (ADS)

    Prasanna, V.

    2018-01-01

    This study makes use of temperature and precipitation from CMIP5 climate model output for climate change application studies over the Indian region during the summer monsoon season (JJAS). Bias correction of temperature and precipitation from CMIP5 GCM simulation results with respect to observation is discussed in detail. The non-linear statistical bias correction is a suitable bias correction method for climate change data because it is simple and does not add artificial uncertainties to the impact assessment of climate change scenarios for climate change application studies (agricultural production changes) in the future. The simple statistical bias correction uses observational constraints on the GCM baseline, and the projected results are scaled with respect to the changing magnitude in future scenarios, varying from one model to the other. Two types of bias correction techniques are shown here: (1) a simple bias correction using a percentile-based quantile-mapping algorithm and (2) a simple but improved bias correction method, a cumulative distribution function (CDF; Weibull distribution function)-based quantile-mapping algorithm. This study shows that the percentile-based quantile mapping method gives results similar to the CDF (Weibull)-based quantile mapping method, and both methods are comparable. The bias correction is applied on temperature and precipitation variables for present climate and future projected data to make use of it in a simple statistical model to understand the future changes in crop production over the Indian region during the summer monsoon season. In total, 12 CMIP5 models are used for Historical (1901-2005), RCP4.5 (2005-2100), and RCP8.5 (2005-2100) scenarios. The climate index from each CMIP5 model and the observed agricultural yield index over the Indian region are used in a regression model to project the changes in the agricultural yield over India from RCP4.5 and RCP8.5 scenarios. The results revealed a better convergence of model projections in the bias corrected data compared to the uncorrected data. The study can be extended to localized regional domains aimed at understanding the changes in the agricultural productivity in the future with an agro-economy or a simple statistical model. The statistical model indicated that the total food grain yield is going to increase over the Indian region in the future: the increase in the total food grain yield is approximately 50 kg/ha for the RCP4.5 scenario from 2001 until the end of 2100, and approximately 90 kg/ha for the RCP8.5 scenario over the same period. There are many studies using bias correction techniques, but this study applies the bias correction technique to future climate scenario data from CMIP5 models and applies it to crop statistics to find future crop yield changes over the Indian region.
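
    The percentile-based quantile mapping mentioned in this abstract can be illustrated with a generic empirical sketch. It is not the study's implementation: the function names, the linear interpolation between sorted values, and the sample numbers are all assumptions, and the abstract's additional scaling of projected changes is not reproduced here.

```python
from bisect import bisect_right

def empirical_cdf(sorted_vals, x):
    """Interpolated non-exceedance probability of x within pre-sorted data."""
    if x <= sorted_vals[0]:
        return 0.0
    if x >= sorted_vals[-1]:
        return 1.0
    i = bisect_right(sorted_vals, x) - 1
    frac = (x - sorted_vals[i]) / (sorted_vals[i + 1] - sorted_vals[i])
    return (i + frac) / (len(sorted_vals) - 1)

def empirical_quantile(sorted_vals, p):
    """Linearly interpolated quantile of pre-sorted data for p in [0, 1]."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def quantile_map(model_baseline, obs_baseline, value):
    """Replace a model value by the observed value at the same percentile."""
    p = empirical_cdf(sorted(model_baseline), value)
    return empirical_quantile(sorted(obs_baseline), p)

# Hypothetical baseline samples: the model runs 2 units too warm.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
model = [3.0, 4.0, 5.0, 6.0, 7.0]
print(quantile_map(model, obs, 5.5))
```

    In this sketch, values outside the model baseline range are clamped to the observed extremes, which is one reason practical schemes treat projected changes separately.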

  8. Modelling unsupervised online-learning of artificial grammars: linking implicit and statistical learning.

    PubMed

    Rohrmeier, Martin A; Cross, Ian

    2014-07-01

    Humans rapidly learn complex structures in various domains. Findings of above-chance performance of some untrained control groups in artificial grammar learning studies raise questions about the extent to which learning can occur in an untrained, unsupervised testing situation with both correct and incorrect structures. The plausibility of unsupervised online-learning effects was modelled with n-gram, chunking and simple recurrent network models. A novel evaluation framework was applied, which alternates forced binary grammaticality judgments and subsequent learning of the same stimulus. Our results indicate a strong online learning effect for n-gram and chunking models and a weaker effect for simple recurrent network models. Such findings suggest that online learning is a plausible effect of statistical chunk learning that is possible when ungrammatical sequences contain a large proportion of grammatical chunks. Such common effects of continuous statistical learning may underlie statistical and implicit learning paradigms and raise implications for study design and testing methodologies.

  9. Computational Modeling of Statistical Learning: Effects of Transitional Probability versus Frequency and Links to Word Learning

    ERIC Educational Resources Information Center

    Mirman, Daniel; Estes, Katharine Graf; Magnuson, James S.

    2010-01-01

    Statistical learning mechanisms play an important role in theories of language acquisition and processing. Recurrent neural network models have provided important insights into how these mechanisms might operate. We examined whether such networks capture two key findings in human statistical learning. In Simulation 1, a simple recurrent network…

  10. Rainfall runoff modelling of the Upper Ganga and Brahmaputra basins using PERSiST.

    PubMed

    Futter, M N; Whitehead, P G; Sarkar, S; Rodda, H; Crossman, J

    2015-06-01

    There are ongoing discussions about the appropriate level of complexity and sources of uncertainty in rainfall runoff models. Simulations for operational hydrology, flood forecasting or nutrient transport all warrant different levels of complexity in the modelling approach. More complex model structures are appropriate for simulations of land-cover dependent nutrient transport while more parsimonious model structures may be adequate for runoff simulation. The appropriate level of complexity is also dependent on data availability. Here, we use PERSiST; a simple, semi-distributed dynamic rainfall-runoff modelling toolkit to simulate flows in the Upper Ganges and Brahmaputra rivers. We present two sets of simulations driven by single time series of daily precipitation and temperature using simple (A) and complex (B) model structures based on uniform and hydrochemically relevant land covers respectively. Models were compared based on ensembles of Bayesian Information Criterion (BIC) statistics. Equifinality was observed for parameters but not for model structures. Model performance was better for the more complex (B) structural representations than for parsimonious model structures. The results show that structural uncertainty is more important than parameter uncertainty. The ensembles of BIC statistics suggested that neither structural representation was preferable in a statistical sense. Simulations presented here confirm that relatively simple models with limited data requirements can be used to credibly simulate flows and water balance components needed for nutrient flux modelling in large, data-poor basins.
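
    For readers unfamiliar with the statistic used to compare the model structures, the standard definition of the Bayesian Information Criterion can be sketched as below. The function name and the example numbers are hypothetical; the abstract does not give the likelihoods or parameter counts of the PERSiST runs.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical example: a complex structure fits slightly better
# but pays a larger penalty for its extra parameters.
simple_bic = bic(log_likelihood=-520.0, n_params=6, n_obs=365)
complex_bic = bic(log_likelihood=-515.0, n_params=14, n_obs=365)
print(simple_bic, complex_bic)
```

    The penalty term grows with the number of parameters, which is how a better-fitting complex structure can still fail to be preferred in a statistical sense.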

  11. Statistical Mechanics of the US Supreme Court

    NASA Astrophysics Data System (ADS)

    Lee, Edward D.; Broedersz, Chase P.; Bialek, William

    2015-07-01

    We build simple models for the distribution of voting patterns in a group, using the Supreme Court of the United States as an example. The maximum entropy model consistent with the observed pairwise correlations among justices' votes, an Ising spin glass, agrees quantitatively with the data. While all correlations (perhaps surprisingly) are positive, the effective pairwise interactions in the spin glass model have both signs, recovering the intuition that ideologically opposite justices negatively influence one another. Despite the competing interactions, a strong tendency toward unanimity emerges from the model, organizing the voting patterns in a relatively simple "energy landscape." Besides unanimity, other energy minima in this landscape, or maxima in probability, correspond to prototypical voting states, such as the ideological split or a tightly correlated, conservative core. The model correctly predicts the correlation of justices with the majority and gives us a measure of their influence on the majority decision. These results suggest that simple models, grounded in statistical physics, can capture essential features of collective decision making quantitatively, even in a complex political context.
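
    For orientation, the general form of a pairwise maximum entropy (Ising) distribution over N binary votes is shown below. The symbols h_i and J_ij and the sign convention are notational assumptions, since the abstract does not state the authors' notation.

```latex
P(\sigma_1,\dots,\sigma_N)
  = \frac{1}{Z}\exp\Big(\sum_{i} h_i\,\sigma_i + \sum_{i<j} J_{ij}\,\sigma_i\sigma_j\Big),
\qquad \sigma_i \in \{-1,+1\},
\qquad
Z = \sum_{\{\sigma\}} \exp\Big(\sum_{i} h_i\,\sigma_i + \sum_{i<j} J_{ij}\,\sigma_i\sigma_j\Big)
```

    Under this convention, a positive J_ij favors agreement between voters i and j and a negative J_ij favors disagreement, while the fields h_i and couplings J_ij are chosen so that the model reproduces the observed mean votes and pairwise correlations.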

  12. Are V1 Simple Cells Optimized for Visual Occlusions? A Comparative Study

    PubMed Central

    Bornschein, Jörg; Henniges, Marc; Lücke, Jörg

    2013-01-01

    Simple cells in primary visual cortex were famously found to respond to low-level image components such as edges. Sparse coding and independent component analysis (ICA) emerged as the standard computational models for simple cell coding because they linked their receptive fields to the statistics of visual stimuli. However, a salient feature of image statistics, occlusions of image components, is not considered by these models. Here we ask if occlusions have an effect on the predicted shapes of simple cell receptive fields. We use a comparative approach to answer this question and investigate two models for simple cells: a standard linear model and an occlusive model. For both models we simultaneously estimate optimal receptive fields, sparsity and stimulus noise. The two models are identical except for their component superposition assumption. We find the image encoding and receptive fields predicted by the models to differ significantly. While both models predict many Gabor-like fields, the occlusive model predicts a much sparser encoding and high percentages of ‘globular’ receptive fields. This relatively new center-surround type of simple cell response has been observed since reverse correlation came into use in experimental studies. While high percentages of ‘globular’ fields can be obtained using specific choices of sparsity and overcompleteness in linear sparse coding, no or only low proportions are reported in the vast majority of studies on linear models (including all ICA models). Likewise, for the linear model investigated here and optimal sparsity, only low proportions of ‘globular’ fields are observed. In comparison, the occlusive model robustly infers high proportions and can match the experimentally observed high proportions of ‘globular’ fields well. Our computational study, therefore, suggests that ‘globular’ fields may be evidence for an optimal encoding of visual occlusions in primary visual cortex. PMID:23754938

  13. A simple statistical model for geomagnetic reversals

    NASA Technical Reports Server (NTRS)

    Constable, Catherine

    1990-01-01

    The diversity of paleomagnetic records of geomagnetic reversals now available indicates that the field configuration during transitions cannot be adequately described by simple zonal or standing field models. A new model described here is based on statistical properties inferred from the present field and is capable of simulating field transitions like those observed. Some insight is obtained into what one can hope to learn from paleomagnetic records. In particular, it is crucial that the effects of smoothing in the remanence acquisition process be separated from true geomagnetic field behavior. This might enable us to determine the time constants associated with the dominant field configuration during a reversal.

  14. Large ensemble modeling of the last deglacial retreat of the West Antarctic Ice Sheet: comparison of simple and advanced statistical techniques

    NASA Astrophysics Data System (ADS)

    Pollard, David; Chang, Won; Haran, Murali; Applegate, Patrick; DeConto, Robert

    2016-05-01

    A 3-D hybrid ice-sheet model is applied to the last deglacial retreat of the West Antarctic Ice Sheet over the last ~20 000 yr. A large ensemble of 625 model runs is used to calibrate the model to modern and geologic data, including reconstructed grounding lines, relative sea-level records, elevation-age data and uplift rates, with an aggregate score computed for each run that measures overall model-data misfit. Two types of statistical methods are used to analyze the large-ensemble results: simple averaging weighted by the aggregate score, and more advanced Bayesian techniques involving Gaussian process-based emulation and calibration, and Markov chain Monte Carlo. The analyses provide sea-level-rise envelopes with well-defined parametric uncertainty bounds, but the simple averaging method only provides robust results with full-factorial parameter sampling in the large ensemble. Results for best-fit parameter ranges and envelopes of equivalent sea-level rise with the simple averaging method agree well with the more advanced techniques. Best-fit parameter ranges confirm earlier values expected from prior model tuning, including large basal sliding coefficients on modern ocean beds.
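
    The simpler of the two methods, averaging ensemble results weighted by each run's aggregate score, can be sketched as follows. This is only an illustration: the function name and numbers are hypothetical, and it assumes the scores have already been converted so that a larger score means a better fit, which the abstract does not spell out.

```python
import math

def score_weighted_stats(values, scores):
    """Score-weighted mean and spread of one quantity across ensemble runs.

    Assumes non-negative scores where larger means better model-data fit.
    """
    total = sum(scores)
    if total <= 0:
        raise ValueError("at least one run needs a positive score")
    mean = sum(s * v for s, v in zip(scores, values)) / total
    var = sum(s * (v - mean) ** 2 for s, v in zip(scores, values)) / total
    return mean, math.sqrt(var)

# Hypothetical equivalent sea-level rise (m) from three runs and their scores.
mean, spread = score_weighted_stats([1.0, 2.0, 3.0], [0.0, 1.0, 1.0])
print(mean, spread)
```

    A run with zero score contributes nothing, so poorly fitting parameter combinations drop out of the envelope without being removed by hand.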

  15. Accurate Modeling of Galaxy Clustering on Small Scales: Testing the Standard ΛCDM + Halo Model

    NASA Astrophysics Data System (ADS)

    Sinha, Manodeep; Berlind, Andreas A.; McBride, Cameron; Scoccimarro, Roman

    2015-01-01

    The large-scale distribution of galaxies can be explained fairly simply by assuming (i) a cosmological model, which determines the dark matter halo distribution, and (ii) a simple connection between galaxies and the halos they inhabit. This conceptually simple framework, called the halo model, has been remarkably successful at reproducing the clustering of galaxies on all scales, as observed in various galaxy redshift surveys. However, none of these previous studies have carefully modeled the systematics and thus truly tested the halo model in a statistically rigorous sense. We present a new accurate and fully numerical halo model framework and test it against clustering measurements from two luminosity samples of galaxies drawn from the SDSS DR7. We show that the simple ΛCDM cosmology + halo model is not able to simultaneously reproduce the galaxy projected correlation function and the group multiplicity function. In particular, the more luminous sample shows significant tension with theory. We discuss the implications of our findings and how this work paves the way for constraining galaxy formation by accurate simultaneous modeling of multiple galaxy clustering statistics.

  16. A Role for Chunk Formation in Statistical Learning of Second Language Syntax

    ERIC Educational Resources Information Center

    Hamrick, Phillip

    2014-01-01

    Humans are remarkably sensitive to the statistical structure of language. However, different mechanisms have been proposed to account for such statistical sensitivities. The present study compared adult learning of syntax and the ability of two models of statistical learning to simulate human performance: Simple Recurrent Networks, which learn by…

  17. Are Statisticians Cold-Blooded Bosses? A New Perspective on the "Old" Concept of Statistical Population

    ERIC Educational Resources Information Center

    Lu, Yonggang; Henning, Kevin S. S.

    2013-01-01

    Spurred by recent writings regarding statistical pragmatism, we propose a simple, practical approach to introducing students to a new style of statistical thinking that models nature through the lens of data-generating processes, not populations. (Contains 5 figures.)

  18. Statistical modeling of ecosystem respiration using eddy covariance data: Maximum likelihood parameter estimation, and Monte Carlo simulation of model and parameter uncertainty, applied to three simple models

    Treesearch

    Andrew D. Richardson; David Y. Hollinger

    2005-01-01

    Whether the goal is to fill gaps in the flux record, or to extract physiological parameters from eddy covariance data, researchers are frequently interested in fitting simple models of ecosystem physiology to measured data. Presently, there is no consensus on the best models to use, or the ideal optimization criteria. We demonstrate that, given our estimates of the...

  19. Assistive Technologies for Second-Year Statistics Students Who Are Blind

    ERIC Educational Resources Information Center

    Erhardt, Robert J.; Shuman, Michael P.

    2015-01-01

    At Wake Forest University, a student who is blind enrolled in a second course in statistics. The course covered simple and multiple regression, model diagnostics, model selection, data visualization, and elementary logistic regression. These topics required that the student both interpret and produce three sets of materials: mathematical writing,…

  20. Model for neural signaling leap statistics

    NASA Astrophysics Data System (ADS)

    Chevrollier, Martine; Oriá, Marcos

    2011-03-01

    We present a simple model for neural signaling leaps in the brain considering only the thermodynamic (Nernst) potential in neuron cells and brain temperature. We numerically simulated connections between arbitrarily localized neurons and analyzed the frequency distribution of the distances reached. We observed qualitative change between Normal statistics (with T = 37.5°C, awaken regime) and Lévy statistics (T = 35.5°C, sleeping period), characterized by rare events of long range connections.

  1. Probability, statistics, and computational science.

    PubMed

    Beerenwinkel, Niko; Siebourg, Juliane

    2012-01-01

    In this chapter, we review basic concepts from probability theory and computational statistics that are fundamental to evolutionary genomics. We provide a very basic introduction to statistical modeling and discuss general principles, including maximum likelihood and Bayesian inference. Markov chains, hidden Markov models, and Bayesian network models are introduced in more detail as they occur frequently and in many variations in genomics applications. In particular, we discuss efficient inference algorithms and methods for learning these models from partially observed data. Several simple examples are given throughout the text, some of which point to models that are discussed in more detail in subsequent chapters.

  2. Statistical validity of using ratio variables in human kinetics research.

    PubMed

    Liu, Yuanlong; Schutz, Robert W

    2003-09-01

    The purposes of this study were to investigate the validity of the simple ratio and three alternative deflation models and examine how the variation of the numerator and denominator variables affects the reliability of a ratio variable. A simple ratio and three alternative deflation models were fitted to four empirical data sets, and common criteria were applied to determine the best model for deflation. Intraclass correlation was used to examine the component effect on the reliability of a ratio variable. The results indicate that the validity of a deflation model depends on the statistical characteristics of the particular component variables used, and an optimal deflation model for all ratio variables may not exist. Therefore, it is recommended that different models be fitted to each empirical data set to determine the best deflation model. It was found that the reliability of a simple ratio is affected by the coefficients of variation and the within- and between-trial correlations between the numerator and denominator variables. It was recommended that researchers compute the reliability of the derived ratio scores and not assume that strong reliabilities in the numerator and denominator measures automatically lead to high reliability in the ratio measures.

  3. Forgetfulness can help you win games.

    PubMed

    Burridge, James; Gao, Yu; Mao, Yong

    2015-09-01

    We present a simple game model where agents with different memory lengths compete for finite resources. We show by simulation and analytically that an instability exists at a critical memory length, and as a result, different memory lengths can compete and coexist in a dynamical equilibrium. Our analytical formulation makes a connection to statistical urn models, and we show that temperature is mirrored by the agent's memory. Our simple model of memory may be incorporated into other game models with implications that we briefly discuss.

  4. Statistical fluctuations in pedestrian evacuation times and the effect of social contagion

    NASA Astrophysics Data System (ADS)

    Nicolas, Alexandre; Bouzat, Sebastián; Kuperman, Marcelo N.

    2016-08-01

    Mathematical models of pedestrian evacuation and the associated simulation software have become essential tools for the assessment of the safety of public facilities and buildings. While a variety of models is now available, their calibration and test against empirical data are generally restricted to global averaged quantities; the statistics compiled from the time series of individual escapes ("microscopic" statistics) measured in recent experiments are thus overlooked. In the same spirit, much research has primarily focused on the average global evacuation time, whereas the whole distribution of evacuation times over some set of realizations should matter. In the present paper we propose and discuss the validity of a simple relation between this distribution and the microscopic statistics, which is theoretically valid in the absence of correlations. To this purpose, we develop a minimal cellular automaton, with features that afford a semiquantitative reproduction of the experimental microscopic statistics. We then introduce a process of social contagion of impatient behavior in the model and show that the simple relation under test may dramatically fail at high contagion strengths, the latter being responsible for the emergence of strong correlations in the system. We conclude with comments on the potential practical relevance for safety science of calculations based on microscopic statistics.

  5. Statistical mechanics of simple models of protein folding and design.

    PubMed Central

    Pande, V S; Grosberg, A Y; Tanaka, T

    1997-01-01

    It is now believed that the primary equilibrium aspects of simple models of protein folding are understood theoretically. However, current theories often resort to rather heavy mathematics to overcome some technical difficulties inherent in the problem or start from a phenomenological model. To this end, we take a new approach in this pedagogical review of the statistical mechanics of protein folding. The benefit of our approach is a drastic mathematical simplification of the theory, without resort to any new approximations or phenomenological prescriptions. Indeed, the results we obtain agree precisely with previous calculations. Because of this simplification, we are able to present here a thorough and self-contained treatment of the problem. Topics discussed include the statistical mechanics of the random energy model (REM), tests of the validity of REM as a model for heteropolymer freezing, freezing transition of random sequences, phase diagram of designed ("minimally frustrated") sequences, and the degree to which errors in the interactions employed in simulations of either folding or design can still lead to correct folding behavior. PMID:9414231

  6. Mathematical neuroscience: from neurons to circuits to systems.

    PubMed

    Gutkin, Boris; Pinto, David; Ermentrout, Bard

    2003-01-01

    Applications of mathematics and computational techniques to our understanding of neuronal systems are provided. Reduction of membrane models to simplified canonical models demonstrates how neuronal spike-time statistics follow from simple properties of neurons. Averaging over space allows one to derive a simple model for the whisker barrel circuit and use this to explain and suggest several experiments. Spatio-temporal pattern formation methods are applied to explain the patterns seen in the early stages of drug-induced visual hallucinations.

  7. Statistical Mechanics of US Supreme Court

    NASA Astrophysics Data System (ADS)

    Lee, Edward; Broedersz, Chase; Bialek, William; Biophysics Theory Group Team

    2014-03-01

    We build simple models for the distribution of voting patterns in a group, using the Supreme Court of the United States as an example. The least structured, or maximum entropy, model that is consistent with the observed pairwise correlations among justices' votes is equivalent to an Ising spin glass. While all correlations (perhaps surprisingly) are positive, the effective pairwise interactions in the spin glass model have both signs, recovering some of our intuition that justices on opposite sides of the ideological spectrum should have a negative influence on one another. Despite the competing interactions, a strong tendency toward unanimity emerges from the model, and this agrees quantitatively with the data. The model shows that voting patterns are organized in a relatively simple "energy landscape," correctly predicts the extent to which each justice is correlated with the majority, and gives us a measure of the influence that justices exert on one another. These results suggest that simple models, grounded in statistical physics, can capture essential features of collective decision making quantitatively, even in a complex political context. Funded by National Science Foundation Grants PHY-0957573 and CCF-0939370, WM Keck Foundation, Lewis-Sigler Fellowship, Burroughs Wellcome Fund, and Winston Foundation.

  8. Examination of multi-model ensemble seasonal prediction methods using a simple climate system

    NASA Astrophysics Data System (ADS)

    Kang, In-Sik; Yoo, Jin Ho

    2006-02-01

    A simple climate model was designed as a proxy for the real climate system, and a number of prediction models were generated by slightly perturbing the physical parameters of the simple model. A set of long (240 years) historical hindcast predictions were performed with various prediction models, which are used to examine various issues of multi-model ensemble seasonal prediction, such as the best ways of blending multi-models and the selection of models. Based on these results, we suggest a feasible way of maximizing the benefit of using multi models in seasonal prediction. In particular, three types of multi-model ensemble prediction systems, i.e., the simple composite, superensemble, and the composite after statistically correcting individual predictions (corrected composite), are examined and compared to each other. The superensemble has more of an overfitting problem than the others, especially for the case of small training samples and/or weak external forcing, and the corrected composite produces the best prediction skill among the multi-model systems.

  9. Simple Statistics: - Summarized!

    ERIC Educational Resources Information Center

    Blai, Boris, Jr.

    Statistics is an essential tool for making proper judgment decisions. It is concerned with probability distribution models, testing of hypotheses, significance tests and other means of determining the correctness of deductions and the most likely outcome of decisions. Measures of central tendency include the mean, median and mode. A second…

  10. Masquerade Detection Using a Taxonomy-Based Multinomial Modeling Approach in UNIX Systems

    DTIC Science & Technology

    2008-08-25

    primarily the modeling of statistical features, such as the frequency of events, the duration of events, the co-occurrence of multiple events...are identified, we can extract features representing such behavior while auditing the user’s behavior. Figure 1: Taxonomy of Linux and Unix...achieved when the features are extracted just from simple commands. Method Hit Rate False Positive Rate ocSVM using simple cmds (freq.-based

  11. Indiana chronic disease management program risk stratification analysis.

    PubMed

    Li, Jingjin; Holmes, Ann M; Rosenman, Marc B; Katz, Barry P; Downs, Stephen M; Murray, Michael D; Ackermann, Ronald T; Inui, Thomas S

    2005-10-01

    The objective of this study was to compare the ability of risk stratification models derived from administrative data to classify groups of patients for enrollment in a tailored chronic disease management program. This study included 19,548 Medicaid patients with chronic heart failure or diabetes in the Indiana Medicaid data warehouse during 2001 and 2002. To predict costs (total claims paid) in FY 2002, we considered candidate predictor variables available in FY 2001, including patient characteristics, the number and type of prescription medications, laboratory tests, pharmacy charges, and utilization of primary, specialty, inpatient, emergency department, nursing home, and home health care. We built prospective models to identify patients with different levels of expenditure. Model fit was assessed using R² statistics, whereas discrimination was assessed using the weighted kappa statistic, predictive ratios, and the area under the receiver operating characteristic curve. We found a simple least-squares regression model in which logged total charges in FY 2002 were regressed on the log of total charges in FY 2001, the number of prescriptions filled in FY 2001, and the FY 2001 eligibility category, performed as well as more complex models. This simple 3-parameter model had an R² of 0.30 and, in terms of classification efficiency, had a sensitivity of 0.57, a specificity of 0.90, an area under the receiver operating characteristic curve of 0.80, and a weighted kappa statistic of 0.51. This simple model based on readily available administrative data stratified Medicaid members according to predicted future utilization as well as more complicated models.

  12. Nomogram for sample size calculation on a straightforward basis for the kappa statistic.

    PubMed

    Hong, Hyunsook; Choi, Yunhee; Hahn, Seokyung; Park, Sue Kyung; Park, Byung-Joo

    2014-09-01

    Kappa is a widely used measure of agreement. However, it may not be straightforward in some situations, such as sample size calculation, because of the kappa paradox: high agreement but low kappa. Hence, it seems reasonable in sample size calculation that the level of agreement under a certain marginal prevalence is considered in terms of a simple proportion of agreement rather than a kappa value. Therefore, sample size formulae and nomograms using a simple proportion of agreement rather than a kappa under certain marginal prevalences are proposed. A sample size formula was derived using the kappa statistic under the common correlation model and goodness-of-fit statistic. The nomogram for the sample size formula was developed using SAS 9.3. The sample size formulae using a simple proportion of agreement instead of a kappa statistic and nomograms to eliminate the inconvenience of using a mathematical formula were produced. A nomogram for sample size calculation with a simple proportion of agreement should be useful in the planning stages when the focus of interest is on testing the hypothesis of interobserver agreement involving two raters and nominal outcome measures.
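
    The kappa paradox mentioned in this abstract (high agreement but low kappa) can be reproduced with a short sketch. The 2x2 table below is hypothetical: two raters agree on 90% of subjects, yet Cohen's kappa is slightly negative because the marginal prevalence is very unbalanced. The function implements the standard Cohen's kappa, not the sample size formula from the paper.

```python
def agreement_and_kappa(table):
    """Return (proportion of agreement, Cohen's kappa) for a square table.

    table[i][j] = number of subjects placed in category i by rater A
    and in category j by rater B.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    p_obs = sum(table[i][i] for i in range(k)) / n
    row = [sum(table[i]) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_exp = sum(row[i] * col[i] for i in range(k))  # agreement expected by chance
    kappa = (p_obs - p_exp) / (1 - p_exp)
    return p_obs, kappa

# Hypothetical counts: 95% of ratings fall in the first category for both raters.
p_obs, kappa = agreement_and_kappa([[90, 5], [5, 0]])

# A balanced table with perfect agreement, for comparison.
p_obs_balanced, kappa_balanced = agreement_and_kappa([[50, 0], [0, 50]])
```

    Here the chance-expected agreement (0.905) already exceeds the observed agreement (0.90), which is why kappa drops below zero.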

  13. Standard Entropy of Crystalline Iodine from Vapor Pressure Measurements: A Physical Chemistry Experiment.

    ERIC Educational Resources Information Center

    Harris, Ronald M.

    1978-01-01

    Presents material dealing with an application of statistical thermodynamics to the diatomic solid I₂(s). The objective is to enhance the student's appreciation of the power of the statistical formulation of thermodynamics. The Simple Einstein Model is used. (Author/MA)

  14. Monte Carlo based statistical power analysis for mediation models: methods and software.

    PubMed

    Zhang, Zhiyong

    2014-12-01

    The existing literature on statistical power analysis for mediation models often assumes data normality and is based on a less powerful Sobel test instead of the more powerful bootstrap test. This study proposes to estimate statistical power to detect mediation effects on the basis of the bootstrap method through Monte Carlo simulation. Nonnormal data with excessive skewness and kurtosis are allowed in the proposed method. A free R package called bmem is developed to conduct the power analysis discussed in this study. Four examples, including a simple mediation model, a multiple-mediator model with a latent mediator, a multiple-group mediation model, and a longitudinal mediation model, are provided to illustrate the proposed method.

  15. Weighted Feature Significance: A Simple, Interpretable Model of Compound Toxicity Based on the Statistical Enrichment of Structural Features

    PubMed Central

    Huang, Ruili; Southall, Noel; Xia, Menghang; Cho, Ming-Hsuang; Jadhav, Ajit; Nguyen, Dac-Trung; Inglese, James; Tice, Raymond R.; Austin, Christopher P.

    2009-01-01

    In support of the U.S. Tox21 program, we have developed a simple and chemically intuitive model we call weighted feature significance (WFS) to predict the toxicological activity of compounds, based on the statistical enrichment of structural features in toxic compounds. We trained and tested the model on the following: (1) data from quantitative high-throughput screening cytotoxicity and caspase activation assays conducted at the National Institutes of Health Chemical Genomics Center, (2) data from Salmonella typhimurium reverse mutagenicity assays conducted by the U.S. National Toxicology Program, and (3) hepatotoxicity data published in the Registry of Toxic Effects of Chemical Substances. Enrichments of structural features in toxic compounds are evaluated for their statistical significance and compiled into a simple additive model of toxicity and then used to score new compounds for potential toxicity. The predictive power of the model for cytotoxicity was validated using an independent set of compounds from the U.S. Environmental Protection Agency tested also at the National Institutes of Health Chemical Genomics Center. We compared the performance of our WFS approach with classical classification methods such as Naive Bayesian clustering and support vector machines. In most test cases, WFS showed similar or slightly better predictive power, especially in the prediction of hepatotoxic compounds, where WFS appeared to have the best performance among the three methods. The new algorithm has the important advantages of simplicity, power, interpretability, and ease of implementation. PMID:19805409

  16. Calibration of Response Data Using MIRT Models with Simple and Mixed Structures

    ERIC Educational Resources Information Center

    Zhang, Jinming

    2012-01-01

    It is common to assume during a statistical analysis of a multiscale assessment that the assessment is composed of several unidimensional subtests or that it has simple structure. Under this assumption, the unidimensional and multidimensional approaches can be used to estimate item parameters. These two approaches are equivalent in parameter…

  17. Statistical Power of Alternative Structural Models for Comparative Effectiveness Research: Advantages of Modeling Unreliability.

    PubMed

    Coman, Emil N; Iordache, Eugen; Dierker, Lisa; Fifield, Judith; Schensul, Jean J; Suggs, Suzanne; Barbour, Russell

    2014-05-01

    The advantages of modeling the unreliability of outcomes when evaluating the comparative effectiveness of health interventions are illustrated. Adding an action-research intervention component to a regular summer job program for youth was expected to help in preventing risk behaviors. A series of simple two-group alternative structural equation models are compared to test the effect of the intervention on one key attitudinal outcome in terms of model fit and statistical power with Monte Carlo simulations. Some models presuming parameters equal across the intervention and comparison groups were underpowered to detect the intervention effect, yet modeling the unreliability of the outcome measure increased their statistical power and helped in the detection of the hypothesized effect. Comparative Effectiveness Research (CER) could benefit from flexible multi-group alternative structural models organized in decision trees, and modeling unreliability of measures can be of tremendous help for both the fit of statistical models to the data and their statistical power.

  18. Bayesian models based on test statistics for multiple hypothesis testing problems.

    PubMed

    Ji, Yuan; Lu, Yiling; Mills, Gordon B

    2008-04-01

    We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.
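
    As a rough illustration of a Bayesian FDR decision step, the sketch below assumes that posterior probabilities of the null hypothesis are already available for each test. It applies one commonly used rule: reject hypotheses in increasing order of posterior null probability for as long as the average posterior null probability of the rejected set stays at or below a target level. This rule and the numbers are assumptions for illustration; the abstract does not specify the authors' exact procedure.

```python
def bayesian_fdr_reject(post_null, alpha=0.05):
    """Return sorted indices of rejected hypotheses.

    post_null[i] is the (assumed given) posterior probability that hypothesis i
    is null. The mean of post_null over the rejected set estimates the Bayesian
    false discovery rate, so we grow the set while that mean stays <= alpha.
    """
    order = sorted(range(len(post_null)), key=lambda i: post_null[i])
    rejected, running = [], 0.0
    for i in order:
        if (running + post_null[i]) / (len(rejected) + 1) > alpha:
            break
        running += post_null[i]
        rejected.append(i)
    return sorted(rejected)

# Hypothetical posterior null probabilities for six tests.
post_null = [0.01, 0.90, 0.03, 0.20, 0.02, 0.60]
rejected = bayesian_fdr_reject(post_null, alpha=0.05)
```

    With these values the three smallest probabilities are rejected; adding the fourth (0.20) would push the estimated rate to 0.065, above the 0.05 target.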

  19. Robust Combining of Disparate Classifiers Through Order Statistics

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Ghosh, Joydeep

    2001-01-01

    Integrating the outputs of multiple classifiers via combiners or meta-learners has led to substantial improvements in several difficult pattern recognition problems. In this article we investigate a family of combiners based on order statistics, for robust handling of situations where there are large discrepancies in performance of individual classifiers. Based on a mathematical modeling of how the decision boundaries are affected by order statistic combiners, we derive expressions for the reductions in error expected when simple output combination methods based on the median, the maximum and in general, the ith order statistic, are used. Furthermore, we analyze the trim and spread combiners, both based on linear combinations of the ordered classifier outputs, and show that in the presence of uneven classifier performance, they often provide substantial gains over both linear and simple order statistics combiners. Experimental results on both real world data and standard public domain data sets corroborate these findings.
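
    A minimal sketch of order statistic combining follows, assuming each classifier outputs one score per class. For every class the scores are sorted across classifiers and a rule (mean, median, maximum, or a trimmed mean) is applied to the ordered values. The scores are hypothetical, and the trimmed mean here is one plausible reading of a "trim" combiner rather than the exact definition used in the article.

```python
from statistics import median

def combine(outputs, rule):
    """Return the winning class index.

    outputs[m][c] is the score classifier m assigns to class c; `rule` maps the
    ascending list of one class's scores to a single combined score.
    """
    n_classes = len(outputs[0])
    combined = [rule(sorted(clf[c] for clf in outputs)) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: combined[c])

def mean_rule(ordered):
    return sum(ordered) / len(ordered)

def order_stat(i):
    """Rule that picks the i-th order statistic (0 = smallest)."""
    return lambda ordered: ordered[i]

def trimmed_mean(k):
    """Rule that drops the k smallest and k largest scores before averaging."""
    return lambda ordered: sum(ordered[k:len(ordered) - k]) / (len(ordered) - 2 * k)

# Four classifiers mildly favor class 0; one unreliable classifier is
# confidently wrong (hypothetical scores).
outputs = [
    [0.60, 0.40],
    [0.58, 0.42],
    [0.56, 0.44],
    [0.55, 0.45],
    [0.02, 0.98],
]

by_mean = combine(outputs, mean_rule)
by_median = combine(outputs, median)
by_max = combine(outputs, order_stat(-1))
by_trim = combine(outputs, trimmed_mean(1))
```

    In this example the plain average and the maximum are swayed by the single outlying classifier, while the median and the trimmed mean still select class 0.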

  20. Counting statistics for genetic switches based on effective interaction approximation

    NASA Astrophysics Data System (ADS)

    Ohkubo, Jun

    2012-09-01

    Applicability of counting statistics for a system with an infinite number of states is investigated. The counting statistics has been studied extensively for a system with a finite number of states. While it is possible in principle to use the scheme in order to count specific transitions in a system with an infinite number of states, we have non-closed equations in general. A simple genetic switch can be described by a master equation with an infinite number of states, and we use the counting statistics in order to count the number of transitions from inactive to active states in the gene. To avoid having the non-closed equations, an effective interaction approximation is employed. As a result, it is shown that the switching problem can be treated as a simple two-state model approximately, which immediately indicates that the switching obeys non-Poisson statistics.

  1. Asymptotic Linear Spectral Statistics for Spiked Hermitian Random Matrices

    NASA Astrophysics Data System (ADS)

    Passemier, Damien; McKay, Matthew R.; Chen, Yang

    2015-07-01

    Using the Coulomb Fluid method, this paper derives central limit theorems (CLTs) for linear spectral statistics of three "spiked" Hermitian random matrix ensembles. These include Johnstone's spiked model (i.e., central Wishart with spiked correlation), non-central Wishart with rank-one non-centrality, and a related class of non-central matrices. For a generic linear statistic, we derive simple and explicit CLT expressions as the matrix dimensions grow large. For all three ensembles under consideration, we find that the primary effect of the spike is to introduce a correction term to the asymptotic mean of the linear spectral statistic, which we characterize with simple formulas. The utility of our proposed framework is demonstrated through application to three different linear statistics problems: the classical likelihood ratio test for a population covariance, the capacity analysis of multi-antenna wireless communication systems with a line-of-sight transmission path, and a classical multiple sample significance testing problem.

  2. Information Entropy Production of Maximum Entropy Markov Chains from Spike Trains

    NASA Astrophysics Data System (ADS)

    Cofré, Rodrigo; Maldonado, Cesar

    2018-01-01

    We consider the maximum entropy Markov chain inference approach to characterize the collective statistics of neuronal spike trains, focusing on the statistical properties of the inferred model. We review large deviations techniques useful in this context to describe properties of accuracy and convergence in terms of sampling size. We use these results to study the statistical fluctuation of correlations, distinguishability and irreversibility of maximum entropy Markov chains. We illustrate these applications using simple examples where the large deviation rate function is explicitly obtained for maximum entropy models of relevance in this field.

  3. Correlation and simple linear regression.

    PubMed

    Eberly, Lynn E

    2007-01-01

    This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.
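
    The estimation step this chapter describes can be sketched in a few lines: the least-squares slope and intercept, plus the Pearson correlation, for two continuous variables. The data values are invented for illustration and do not come from the chapter.

```python
from math import sqrt

def simple_linear_regression(x, y):
    """Return (slope, intercept, r) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r

# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r = simple_linear_regression(x, y)
r_squared = r ** 2  # share of the variance in y explained by the fitted line
```

    The squared correlation equals the regression sum of squares divided by the total sum of squares, which is one way to see the connection between regression and ANOVA that the chapter mentions.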

  4. Self-organization of cosmic radiation pressure instability. II - One-dimensional simulations

    NASA Technical Reports Server (NTRS)

    Hogan, Craig J.; Woods, Jorden

    1992-01-01

    The clustering of statistically uniform discrete absorbing particles moving solely under the influence of radiation pressure from uniformly distributed emitters is studied in a simple one-dimensional model. Radiation pressure tends to amplify statistical clustering in the absorbers; the absorbing material is swept into empty bubbles, the biggest bubbles grow bigger almost as they would in a uniform medium, and the smaller ones get crushed and disappear. Numerical simulations of a one-dimensional system are used to support the conjecture that the system is self-organizing. Simple statistics indicate that a wide range of initial conditions produce structure approaching the same self-similar statistical distribution, whose scaling properties follow those of the attractor solution for an isolated bubble. The importance of the process for large-scale structuring of the interstellar medium is briefly discussed.

  5. A brief introduction to computer-intensive methods, with a view towards applications in spatial statistics and stereology.

    PubMed

    Mattfeldt, Torsten

    2011-04-01

    Computer-intensive methods may be defined as data analytical procedures involving a huge number of highly repetitive computations. We mention resampling methods with replacement (bootstrap methods), resampling methods without replacement (randomization tests) and simulation methods. The resampling methods are based on simple and robust principles and are largely free from distributional assumptions. Bootstrap methods may be used to compute confidence intervals for a scalar model parameter and for summary statistics from replicated planar point patterns, and for significance tests. For some simple models of planar point processes, point patterns can be simulated by elementary Monte Carlo methods. The simulation of models with more complex interaction properties usually requires more advanced computing methods. In this context, we mention simulation of Gibbs processes with Markov chain Monte Carlo methods using the Metropolis-Hastings algorithm. An alternative to simulations on the basis of a parametric model consists of stochastic reconstruction methods. The basic ideas behind the methods are briefly reviewed and illustrated by simple worked examples in order to encourage novices in the field to use computer-intensive methods.
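
    Resampling with replacement, as described in this abstract, can be sketched as a percentile bootstrap confidence interval for a scalar statistic. The measurements, the number of resamples, and the fixed seed below are assumptions chosen for illustration; the percentile interval is only one of several bootstrap interval constructions.

```python
import random
from statistics import mean

def bootstrap_ci(data, stat, n_boot=2000, level=0.95, seed=0):
    """Return a percentile bootstrap confidence interval (low, high) for stat(data)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Recompute the statistic on n_boot resamples drawn with replacement.
    reps = sorted(
        stat([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return reps[lo_idx], reps[hi_idx]

# Hypothetical measurements.
data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.5, 4.9]
low, high = bootstrap_ci(data, mean)
```

    No distributional assumption enters the computation: the interval comes entirely from the spread of the statistic across resamples.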

  6. Racing to learn: statistical inference and learning in a single spiking neuron with adaptive kernels

    PubMed Central

    Afshar, Saeed; George, Libin; Tapson, Jonathan; van Schaik, André; Hamilton, Tara J.

    2014-01-01

    This paper describes the Synapto-dendritic Kernel Adapting Neuron (SKAN), a simple spiking neuron model that performs statistical inference and unsupervised learning of spatiotemporal spike patterns. SKAN is the first proposed neuron model to investigate the effects of dynamic synapto-dendritic kernels and demonstrate their computational power even at the single neuron scale. The rule-set defining the neuron is simple: there are no complex mathematical operations such as normalization, exponentiation or even multiplication. The functionalities of SKAN emerge from the real-time interaction of simple additive and binary processes. Like a biological neuron, SKAN is robust to signal and parameter noise, and can utilize both in its operations. At the network scale neurons are locked in a race with each other with the fastest neuron to spike effectively “hiding” its learnt pattern from its neighbors. The robustness to noise, high speed, and simple building blocks not only make SKAN an interesting neuron model in computational neuroscience, but also make it ideal for implementation in digital and analog neuromorphic systems which is demonstrated through an implementation in a Field Programmable Gate Array (FPGA). Matlab, Python, and Verilog implementations of SKAN are available at: http://www.uws.edu.au/bioelectronics_neuroscience/bens/reproducible_research. PMID:25505378

  7. Racing to learn: statistical inference and learning in a single spiking neuron with adaptive kernels.

    PubMed

    Afshar, Saeed; George, Libin; Tapson, Jonathan; van Schaik, André; Hamilton, Tara J

    2014-01-01

    This paper describes the Synapto-dendritic Kernel Adapting Neuron (SKAN), a simple spiking neuron model that performs statistical inference and unsupervised learning of spatiotemporal spike patterns. SKAN is the first proposed neuron model to investigate the effects of dynamic synapto-dendritic kernels and demonstrate their computational power even at the single neuron scale. The rule-set defining the neuron is simple: there are no complex mathematical operations such as normalization, exponentiation or even multiplication. The functionalities of SKAN emerge from the real-time interaction of simple additive and binary processes. Like a biological neuron, SKAN is robust to signal and parameter noise, and can utilize both in its operations. At the network scale neurons are locked in a race with each other with the fastest neuron to spike effectively "hiding" its learnt pattern from its neighbors. The robustness to noise, high speed, and simple building blocks not only make SKAN an interesting neuron model in computational neuroscience, but also make it ideal for implementation in digital and analog neuromorphic systems which is demonstrated through an implementation in a Field Programmable Gate Array (FPGA). Matlab, Python, and Verilog implementations of SKAN are available at: http://www.uws.edu.au/bioelectronics_neuroscience/bens/reproducible_research.

  8. Keep it simple - A case study of model development in the context of the Dynamic Stocks and Flows (DSF) task

    NASA Astrophysics Data System (ADS)

    Halbrügge, Marc

    2010-12-01

    This paper describes the creation of a cognitive model submitted to the ‘Dynamic Stocks and Flows’ (DSF) modeling challenge. This challenge aims at comparing computational cognitive models for human behavior during an open ended control task. Participants in the modeling competition were provided with a simulation environment and training data for benchmarking their models while the actual specification of the competition task was withheld. To meet this challenge, the cognitive model described here was designed and optimized for generalizability. Only two simple assumptions about human problem solving were used to explain the empirical findings of the training data. In-depth analysis of the data set prior to the development of the model led to the dismissal of correlations or other parametric statistics as goodness-of-fit indicators. A new statistical measurement based on rank orders and sequence matching techniques is being proposed instead. This measurement, when being applied to the human sample, also identifies clusters of subjects that use different strategies for the task. The acceptability of the fits achieved by the model is verified using permutation tests.

  9. The power to detect linkage in complex disease by means of simple LOD-score analyses.

    PubMed Central

    Greenberg, D A; Abreu, P; Hodge, S E

    1998-01-01

    Maximum-likelihood analysis (via LOD score) provides the most powerful method for finding linkage when the mode of inheritance (MOI) is known. However, because one must assume an MOI, the application of LOD-score analysis to complex disease has been questioned. Although it is known that one can legitimately maximize the maximum LOD score with respect to genetic parameters, this approach raises three concerns: (1) multiple testing, (2) effect on power to detect linkage, and (3) adequacy of the approximate MOI for the true MOI. We evaluated the power of LOD scores to detect linkage when the true MOI was complex but a LOD score analysis assumed simple models. We simulated data from 14 different genetic models, including dominant and recessive at high (80%) and low (20%) penetrances, intermediate models, and several additive two-locus models. We calculated LOD scores by assuming two simple models, dominant and recessive, each with 50% penetrance, then took the higher of the two LOD scores as the raw test statistic and corrected for multiple tests. We call this test statistic "MMLS-C." We found that the ELODs for MMLS-C are >=80% of the ELOD under the true model when the ELOD for the true model is >=3. Similarly, the power to reach a given LOD score was usually >=80% that of the true model, when the power under the true model was >=60%. These results underscore that a critical factor in LOD-score analysis is the MOI at the linked locus, not that of the disease or trait per se. Thus, a limited set of simple genetic models in LOD-score analysis can work well in testing for linkage. PMID:9718328

  10. The power to detect linkage in complex disease by means of simple LOD-score analyses.

    PubMed

    Greenberg, D A; Abreu, P; Hodge, S E

    1998-09-01

    Maximum-likelihood analysis (via LOD score) provides the most powerful method for finding linkage when the mode of inheritance (MOI) is known. However, because one must assume an MOI, the application of LOD-score analysis to complex disease has been questioned. Although it is known that one can legitimately maximize the maximum LOD score with respect to genetic parameters, this approach raises three concerns: (1) multiple testing, (2) effect on power to detect linkage, and (3) adequacy of the approximate MOI for the true MOI. We evaluated the power of LOD scores to detect linkage when the true MOI was complex but a LOD score analysis assumed simple models. We simulated data from 14 different genetic models, including dominant and recessive at high (80%) and low (20%) penetrances, intermediate models, and several additive two-locus models. We calculated LOD scores by assuming two simple models, dominant and recessive, each with 50% penetrance, then took the higher of the two LOD scores as the raw test statistic and corrected for multiple tests. We call this test statistic "MMLS-C." We found that the ELODs for MMLS-C are >=80% of the ELOD under the true model when the ELOD for the true model is >=3. Similarly, the power to reach a given LOD score was usually >=80% that of the true model, when the power under the true model was >=60%. These results underscore that a critical factor in LOD-score analysis is the MOI at the linked locus, not that of the disease or trait per se. Thus, a limited set of simple genetic models in LOD-score analysis can work well in testing for linkage.

  11. "Using Power Tables to Compute Statistical Power in Multilevel Experimental Designs"

    ERIC Educational Resources Information Center

    Konstantopoulos, Spyros

    2009-01-01

    Power computations for one-level experimental designs that assume simple random samples are greatly facilitated by power tables such as those presented in Cohen's book about statistical power analysis. However, in education and the social sciences experimental designs have naturally nested structures and multilevel models are needed to compute the…

  12. Empirical Reference Distributions for Networks of Different Size

    PubMed Central

    Smith, Anna; Calder, Catherine A.; Browning, Christopher R.

    2016-01-01

    Network analysis has become an increasingly prevalent research tool across a vast range of scientific fields. Here, we focus on the particular issue of comparing network statistics, i.e. graph-level measures of network structural features, across multiple networks that differ in size. Although “normalized” versions of some network statistics exist, we demonstrate via simulation why direct comparison is often inappropriate. We consider normalizing network statistics relative to a simple fully parameterized reference distribution and demonstrate via simulation how this is an improvement over direct comparison, but still sometimes problematic. We propose a new adjustment method based on a reference distribution constructed as a mixture model of random graphs which reflect the dependence structure exhibited in the observed networks. We show that using simple Bernoulli models as mixture components in this reference distribution can provide adjusted network statistics that are relatively comparable across different network sizes but still describe interesting features of networks, and that this can be accomplished at relatively low computational expense. Finally, we apply this methodology to a collection of ecological networks derived from the Los Angeles Family and Neighborhood Survey activity location data. PMID:27721556

  13. Obscure phenomena in statistical analysis of quantitative structure-activity relationships. Part 1: Multicollinearity of physicochemical descriptors.

    PubMed

    Mager, P P; Rothe, H

    1990-10-01

    Multicollinearity of physicochemical descriptors leads to serious consequences in quantitative structure-activity relationship (QSAR) analysis, such as incorrect estimators and test statistics of regression coefficients of the ordinary least-squares (OLS) model usually applied to QSARs. Besides the diagnosis of the known simple collinearity, principal component regression analysis (PCRA) also allows the diagnosis of various types of multicollinearity. Only if the absolute values of PCRA estimators are order statistics that decrease monotonically can the effects of multicollinearity be circumvented. Otherwise, obscure phenomena may be observed, such as good data recognition but low predictive power of a QSAR model.

  14. Statistical self-similarity of width function maxima with implications to floods

    USGS Publications Warehouse

    Veitzer, S.A.; Gupta, V.K.

    2001-01-01

    Recently a new theory of random self-similar river networks, called the RSN model, was introduced to explain empirical observations regarding the scaling properties of distributions of various topologic and geometric variables in natural basins. The RSN model predicts that such variables exhibit statistical simple scaling, when indexed by Horton-Strahler order. The average side tributary structure of RSN networks also exhibits Tokunaga-type self-similarity which is widely observed in nature. We examine the scaling structure of distributions of the maximum of the width function for RSNs for nested, complete Strahler basins by performing ensemble simulations. The maximum of the width function exhibits distributional simple scaling, when indexed by Horton-Strahler order, for both RSNs and natural river networks extracted from digital elevation models (DEMs). We also test a power-law relationship between Horton ratios for the maximum of the width function and drainage areas. These results represent first steps in formulating a comprehensive physical statistical theory of floods at multiple space-time scales for RSNs as discrete hierarchical branching structures.

  15. Modeling Smoke Plume-Rise and Dispersion from Southern United States Prescribed Burns with Daysmoke

    Treesearch

    G L Achtemeier; S L Goodrick; Y Liu; F Garcia-Menendez; Y Hu; M. Odman

    2011-01-01

    We present Daysmoke, an empirical-statistical plume rise and dispersion model for simulating smoke from prescribed burns. Prescribed fires are characterized by complex plume structure including multiple-core updrafts, which makes modeling with simple plume models difficult. Daysmoke accounts for plume structure in a three-dimensional veering/shearing atmospheric...

  16. Canonical Statistical Model for Maximum Expected Immission of Wire Conductor in an Aperture Enclosure

    NASA Technical Reports Server (NTRS)

    Bremner, Paul G.; Vazquez, Gabriel; Christiano, Daniel J.; Trout, Dawn H.

    2016-01-01

    Prediction of the maximum expected electromagnetic pick-up of conductors inside a realistic shielding enclosure is an important canonical problem for system-level EMC design of spacecraft, launch vehicles, aircraft and automobiles. This paper introduces a simple statistical power balance model for prediction of the maximum expected current in a wire conductor inside an aperture enclosure. It calculates both the statistical mean and variance of the immission from the physical design parameters of the problem. Familiar probability density functions can then be used to predict the maximum expected immission for design purposes. The statistical power balance model requires minimal EMC design information and solves orders of magnitude faster than existing numerical models, making it ultimately viable for scaled-up, full system-level modeling. Both experimental test results and full wave simulation results are used to validate the foundational model.

  17. The predictive power of zero intelligence in financial markets.

    PubMed

    Farmer, J Doyne; Patelli, Paolo; Zovko, Ilija I

    2005-02-08

    Standard models in economics stress the role of intelligent agents who maximize utility. However, there may be situations where constraints imposed by market institutions dominate strategic agent behavior. We use data from the London Stock Exchange to test a simple model in which minimally intelligent agents place orders to trade at random. The model treats the statistical mechanics of order placement, price formation, and the accumulation of revealed supply and demand within the context of the continuous double auction and yields simple laws relating order-arrival rates to statistical properties of the market. We test the validity of these laws in explaining cross-sectional variation for 11 stocks. The model explains 96% of the variance of the gap between the best buying and selling prices (the spread) and 76% of the variance of the price diffusion rate, with only one free parameter. We also study the market impact function, describing the response of quoted prices to the arrival of new orders. The nondimensional coordinates dictated by the model approximately collapse data from different stocks onto a single curve. This work is important from a practical point of view, because it demonstrates the existence of simple laws relating prices to order flows and, in a broader context, suggests there are circumstances where the strategic behavior of agents may be dominated by other considerations.

  18. Statistical analysis of strait time index and a simple model for trend and trend reversal

    NASA Astrophysics Data System (ADS)

    Chen, Kan; Jayaprakash, C.

    2003-06-01

    We analyze the daily closing prices of the Strait Time Index (STI) as well as the individual stocks traded in Singapore's stock market from 1988 to 2001. We find that the Hurst exponent is approximately 0.6 for both the STI and individual stocks, while the normal correlation functions show the random walk exponent of 0.5. We also investigate the conditional average of the price change in an interval of length T given the price change in the previous interval. We find strong correlations for price changes larger than a threshold value proportional to T; this indicates that there is no uniform crossover to Gaussian behavior. A simple model based on short-time trend and trend reversal is constructed. We show that the model exhibits statistical properties and market swings similar to those of the real market.

  19. Fitting mechanistic epidemic models to data: A comparison of simple Markov chain Monte Carlo approaches.

    PubMed

    Li, Michael; Dushoff, Jonathan; Bolker, Benjamin M

    2018-07-01

    Simple mechanistic epidemic models are widely used for forecasting and parameter estimation of infectious diseases based on noisy case reporting data. Despite the widespread application of models to emerging infectious diseases, we know little about the comparative performance of standard computational-statistical frameworks in these contexts. Here we build a simple stochastic, discrete-time, discrete-state epidemic model with both process and observation error and use it to characterize the effectiveness of different flavours of Bayesian Markov chain Monte Carlo (MCMC) techniques. We use fits to simulated data, where parameters (and future behaviour) are known, to explore the limitations of different platforms and quantify parameter estimation accuracy, forecasting accuracy, and computational efficiency across combinations of modeling decisions (e.g. discrete vs. continuous latent states, levels of stochasticity) and computational platforms (JAGS, NIMBLE, Stan).

  20. ASSESSMENT OF SPATIAL AUTOCORRELATION IN EMPIRICAL MODELS IN ECOLOGY

    EPA Science Inventory

    Statistically assessing ecological models is inherently difficult because data are autocorrelated and this autocorrelation varies in an unknown fashion. At a simple level, the linking of a single species to a habitat type is a straightforward analysis. With some investigation int...

  1. A simple non-Markovian computational model of the statistics of soccer leagues: Emergence and scaling effects

    NASA Astrophysics Data System (ADS)

    da Silva, Roberto; Vainstein, Mendeli H.; Lamb, Luis C.; Prado, Sandra D.

    2013-03-01

    We propose a novel probabilistic model that outputs the final standings of a soccer league, based on a simple dynamics that mimics a soccer tournament. In our model, a team is created with a defined potential (ability) which is updated during the tournament according to the results of previous games. The updated potential modifies a team's future winning/losing probabilities. We show that this evolutionary game is able to reproduce the statistical properties of final standings of actual editions of the Brazilian tournament (Brasileirão) if the starting potential is the same for all teams. Other leagues such as the Italian (Calcio) and the Spanish (La Liga) tournaments have notoriously non-Gaussian traces and cannot be straightforwardly reproduced by this evolutionary non-Markovian model with simple initial conditions. However, we show that by setting the initial abilities based on data from previous tournaments, our model is able to capture the stylized statistical features of double round robin system (DRRS) tournaments in general. A complete understanding of these phenomena deserves much more attention, but we suggest a simple explanation based on data collected in Brazil: here several teams have been crowned champion in previous editions corroborating that the champion typically emerges from random fluctuations that partly preserve the Gaussian traces during the tournament. On the other hand, in the Italian and Spanish cases, only a few teams in recent history have won their league tournaments. These leagues are based on more robust and hierarchical structures established even before the beginning of the tournament. For the sake of completeness, we also elaborate a totally Gaussian model (which equalizes the winning, drawing, and losing probabilities) and we show that the scores of the Brazilian tournament “Brasileirão” cannot be reproduced. This shows that the evolutionary aspects are not superfluous and play an important role which must be considered in other alternative models. Finally, we analyze the distortions of our model in situations where a large number of teams is considered, showing the existence of a transition from a single to a double peaked histogram of the final classification scores. An interesting scaling is presented for different sized tournaments.

  2. The predictive power of zero intelligence in financial markets

    NASA Astrophysics Data System (ADS)

    Farmer, J. Doyne; Patelli, Paolo; Zovko, Ilija I.

    2005-02-01

    Standard models in economics stress the role of intelligent agents who maximize utility. However, there may be situations where constraints imposed by market institutions dominate strategic agent behavior. We use data from the London Stock Exchange to test a simple model in which minimally intelligent agents place orders to trade at random. The model treats the statistical mechanics of order placement, price formation, and the accumulation of revealed supply and demand within the context of the continuous double auction and yields simple laws relating order-arrival rates to statistical properties of the market. We test the validity of these laws in explaining cross-sectional variation for 11 stocks. The model explains 96% of the variance of the gap between the best buying and selling prices (the spread) and 76% of the variance of the price diffusion rate, with only one free parameter. We also study the market impact function, describing the response of quoted prices to the arrival of new orders. The nondimensional coordinates dictated by the model approximately collapse data from different stocks onto a single curve. This work is important from a practical point of view, because it demonstrates the existence of simple laws relating prices to order flows and, in a broader context, suggests there are circumstances where the strategic behavior of agents may be dominated by other considerations. double auction market | market microstructure | agent-based models

  3. Large ensemble modeling of last deglacial retreat of the West Antarctic Ice Sheet: comparison of simple and advanced statistical techniques

    NASA Astrophysics Data System (ADS)

    Pollard, D.; Chang, W.; Haran, M.; Applegate, P.; DeConto, R.

    2015-11-01

    A 3-D hybrid ice-sheet model is applied to the last deglacial retreat of the West Antarctic Ice Sheet over the last ~ 20 000 years. A large ensemble of 625 model runs is used to calibrate the model to modern and geologic data, including reconstructed grounding lines, relative sea-level records, elevation-age data and uplift rates, with an aggregate score computed for each run that measures overall model-data misfit. Two types of statistical methods are used to analyze the large-ensemble results: simple averaging weighted by the aggregate score, and more advanced Bayesian techniques involving Gaussian process-based emulation and calibration, and Markov chain Monte Carlo. Results for best-fit parameter ranges and envelopes of equivalent sea-level rise with the simple averaging method agree quite well with the more advanced techniques, but only for a large ensemble with full factorial parameter sampling. Best-fit parameter ranges confirm earlier values expected from prior model tuning, including large basal sliding coefficients on modern ocean beds. Each run is extended 5000 years into the "future" with idealized ramped climate warming. In the majority of runs with reasonable scores, this produces grounding-line retreat deep into the West Antarctic interior, and the analysis provides sea-level-rise envelopes with well defined parametric uncertainty bounds.
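
    The "simple averaging weighted by the aggregate score" mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how scores are turned into weights, so the code below assumes (hypothetically) that each run has a non-negative score in which larger means a better fit, and that the score is used directly as the weight. The function name and inputs are illustrative only and are not taken from the paper.

```python
def score_weighted_mean(values, scores):
    """Average one model output over an ensemble, weighting each run by its score.

    Assumption (not from the abstract): scores are non-negative and a larger
    score means a better model-data fit, so it can serve as a weight directly.
    """
    if len(values) != len(scores) or not values:
        raise ValueError("values and scores must be non-empty and equally long")
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    total = sum(scores)
    if total == 0:
        raise ValueError("at least one run needs a positive score")
    # Weighted mean: sum(w_i * x_i) / sum(w_i)
    return sum(v * s for v, s in zip(values, scores)) / total


# Three hypothetical runs; the first has score 0 and so does not contribute.
example = score_weighted_mean([1.0, 2.0, 3.0], [0.0, 1.0, 1.0])
```

    With equal scores this reduces to the ordinary ensemble mean, which is a quick sanity check on any weighting scheme of this kind.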

  4. A Simple Model for Estimating Total and Merchantable Tree Heights

    Treesearch

    Alan R. Ek; Earl T. Birdsall; Rebecca J. Spears

    1984-01-01

    A model is described for estimating total and merchantable tree heights for Lake States tree species. It is intended to be used for compiling forest survey data and in conjunction with growth models for developing projections of tree product yield. Model coefficients are given for 25 species along with fit statistics. Supporting data sets are also described.

  5. Advanced statistics: linear regression, part I: simple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
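
    As an illustration of the least-squares method this abstract refers to, the following minimal sketch fits a line y = intercept + slope * x using the standard closed-form estimates. The function name and the tiny data set are made up for the example and do not come from the article.

```python
def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the ordinary least-squares line through (x, y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_simple_linear_regression(xs, ys)
```

    Because the example points lie exactly on a line, the fit recovers that line; with noisy data the same formulas give the line that minimizes the sum of squared vertical residuals.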

  6. Modified two-sources quantum statistical model and multiplicity fluctuation in the finite rapidity region

    NASA Astrophysics Data System (ADS)

    Ghosh, Dipak; Sarkar, Sharmila; Sen, Sanjib; Roy, Jaya

    1995-06-01

    In this paper the behavior of factorial moments with rapidity window size, which is usually explained in terms of "intermittency," has been interpreted by simple quantum statistical properties of the emitting system using the concept of "modified two-source model" as recently proposed by Ghosh and Sarkar [Phys. Lett. B 278, 465 (1992)]. The analysis has been performed using our own data of 16Ag/Br and 24Ag/Br interactions at a few tens of GeV energy regime.

  7. On a simple molecular–statistical model of a liquid-crystal suspension of anisometric particles

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zakhlevnykh, A. N., E-mail: anz@psu.ru; Lubnin, M. S.; Petrov, D. A.

    2016-11-15

    A molecular–statistical mean-field theory is constructed for suspensions of anisometric particles in nematic liquid crystals (NLCs). The spherical approximation, well known in the physics of ferromagnetic materials, is considered, which allows one to obtain an analytic expression for the free energy and simple equations for the orientational state of a suspension that describe the temperature dependence of the order parameters of the suspension components. The transition temperature from ordered to isotropic state and the jumps in the order parameters at the phase-transition point are studied as a function of the anchoring energy of dispersed particles to the matrix, the concentration of the impurity phase, and the size of particles. The proposed approach allows one to generalize the model to the case of biaxial ordering.

  8. Statistics of the geomagnetic secular variation for the past 5Ma

    NASA Technical Reports Server (NTRS)

    Constable, C. G.; Parker, R. L.

    1986-01-01

    A new statistical model is proposed for the geomagnetic secular variation over the past 5Ma. Unlike previous models, the model makes use of statistical characteristics of the present day geomagnetic field. The spatial power spectrum of the non-dipole field is consistent with a white source near the core-mantle boundary with Gaussian distribution. After a suitable scaling, the spherical harmonic coefficients may be regarded as statistical samples from a single giant Gaussian process; this is the model of the non-dipole field. The model can be combined with an arbitrary statistical description of the dipole and probability density functions and cumulative distribution functions can be computed for declination and inclination that would be observed at any site on Earth's surface. Global paleomagnetic data spanning the past 5Ma are used to constrain the statistics of the dipole part of the field. A simple model is found to be consistent with the available data. An advantage of specifying the model in terms of the spherical harmonic coefficients is that it is a complete statistical description of the geomagnetic field, enabling us to test specific properties for a general description. Both intensity and directional data distributions may be tested to see if they satisfy the expected model distributions.

  9. Statistics of the geomagnetic secular variation for the past 5 m.y

    NASA Technical Reports Server (NTRS)

    Constable, C. G.; Parker, R. L.

    1988-01-01

    A new statistical model is proposed for the geomagnetic secular variation over the past 5Ma. Unlike previous models, the model makes use of statistical characteristics of the present day geomagnetic field. The spatial power spectrum of the non-dipole field is consistent with a white source near the core-mantle boundary with Gaussian distribution. After a suitable scaling, the spherical harmonic coefficients may be regarded as statistical samples from a single giant Gaussian process; this is the model of the non-dipole field. The model can be combined with an arbitrary statistical description of the dipole and probability density functions and cumulative distribution functions can be computed for declination and inclination that would be observed at any site on Earth's surface. Global paleomagnetic data spanning the past 5Ma are used to constrain the statistics of the dipole part of the field. A simple model is found to be consistent with the available data. An advantage of specifying the model in terms of the spherical harmonic coefficients is that it is a complete statistical description of the geomagnetic field, enabling us to test specific properties for a general description. Both intensity and directional data distributions may be tested to see if they satisfy the expected model distributions.

  10. The predictive power of zero intelligence in financial markets

    PubMed Central

    Farmer, J. Doyne; Patelli, Paolo; Zovko, Ilija I.

    2005-01-01

    Standard models in economics stress the role of intelligent agents who maximize utility. However, there may be situations where constraints imposed by market institutions dominate strategic agent behavior. We use data from the London Stock Exchange to test a simple model in which minimally intelligent agents place orders to trade at random. The model treats the statistical mechanics of order placement, price formation, and the accumulation of revealed supply and demand within the context of the continuous double auction and yields simple laws relating order-arrival rates to statistical properties of the market. We test the validity of these laws in explaining cross-sectional variation for 11 stocks. The model explains 96% of the variance of the gap between the best buying and selling prices (the spread) and 76% of the variance of the price diffusion rate, with only one free parameter. We also study the market impact function, describing the response of quoted prices to the arrival of new orders. The nondimensional coordinates dictated by the model approximately collapse data from different stocks onto a single curve. This work is important from a practical point of view, because it demonstrates the existence of simple laws relating prices to order flows and, in a broader context, suggests there are circumstances where the strategic behavior of agents may be dominated by other considerations. PMID:15687505

  11. Scaling laws and fluctuations in the statistics of word frequencies

    NASA Astrophysics Data System (ADS)

    Gerlach, Martin; Altmann, Eduardo G.

    2014-11-01

    In this paper, we combine statistical analysis of written texts and simple stochastic models to explain the appearance of scaling laws in the statistics of word frequencies. The average vocabulary of an ensemble of fixed-length texts is known to scale sublinearly with the total number of words (Heaps’ law). Analyzing the fluctuations around this average in three large databases (Google-ngram, English Wikipedia, and a collection of scientific articles), we find that the standard deviation scales linearly with the average (Taylor's law), in contrast to the prediction of decaying fluctuations obtained using simple sampling arguments. We explain both scaling laws (Heaps’ and Taylor's) by modeling the usage of words using a Poisson process with a fat-tailed distribution of word frequencies (Zipf's law) and topic-dependent frequencies of individual words (as in topic models). Considering topical variations leads to quenched averages, turns the vocabulary size into a non-self-averaging quantity, and explains the empirical observations. For the numerous practical applications relying on estimations of vocabulary size, our results show that uncertainties remain large even for long texts. We show how to account for these uncertainties in measurements of lexical richness of texts with different lengths.

  12. Random noise effects in pulse-mode digital multilayer neural networks.

    PubMed

    Kim, Y C; Shanblatt, M A

    1995-01-01

    A pulse-mode digital multilayer neural network (DMNN) based on stochastic computing techniques is implemented with simple logic gates as basic computing elements. The pulse-mode signal representation and the use of simple logic gates for neural operations lead to a massively parallel yet compact and flexible network architecture, well suited for VLSI implementation. Algebraic neural operations are replaced by stochastic processes using pseudorandom pulse sequences. The distributions of the results from the stochastic processes are approximated using the hypergeometric distribution. Synaptic weights and neuron states are represented as probabilities and estimated as average pulse occurrence rates in corresponding pulse sequences. A statistical model of the noise (error) is developed to estimate the relative accuracy associated with stochastic computing in terms of mean and variance. Computational differences are then explained by comparison to deterministic neural computations. DMNN feedforward architectures are modeled in VHDL using character recognition problems as testbeds. Computational accuracy is analyzed, and the results of the statistical model are compared with the actual simulation results. Experiments show that the calculations performed in the DMNN are more accurate than those anticipated when Bernoulli sequences are assumed, as is common in the literature. Furthermore, the statistical model successfully predicts the accuracy of the operations performed in the DMNN.
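
    The idea of replacing algebraic operations with simple logic gates acting on pulse sequences can be sketched as follows. This is a generic illustration of stochastic computing, not the DMNN implementation: it assumes independent Bernoulli pulse streams generated in software (the abstract notes that the actual pseudorandom sequences behave differently from this assumption), and all function names are hypothetical.

```python
import random


def pulse_stream(probability, length, rng):
    """Encode a value in [0, 1] as a pulse sequence whose mean pulse rate is that value."""
    return [1 if rng.random() < probability else 0 for _ in range(length)]


def pulse_rate(stream):
    """Decode a pulse sequence by its average pulse occurrence rate."""
    return sum(stream) / len(stream)


def stochastic_multiply(p_a, p_b, length, seed=0):
    """Estimate p_a * p_b by AND-ing two independent pulse streams bit by bit."""
    rng = random.Random(seed)
    stream_a = pulse_stream(p_a, length, rng)
    stream_b = pulse_stream(p_b, length, rng)
    # For independent streams, P(a AND b) = P(a) * P(b), so an AND gate multiplies.
    product_stream = [a & b for a, b in zip(stream_a, stream_b)]
    return pulse_rate(product_stream)


# Hypothetical operands; the estimate is noisy and approaches 0.5 * 0.4 as length grows.
estimate = stochastic_multiply(0.5, 0.4, length=100_000)
```

    The estimate carries sampling noise that shrinks as the stream gets longer, which is the kind of error the abstract's statistical noise model characterizes in terms of mean and variance.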

  13. Statistical Models for Averaging of the Pump–Probe Traces: Example of Denoising in Terahertz Time-Domain Spectroscopy

    NASA Astrophysics Data System (ADS)

    Skorobogatiy, Maksim; Sadasivan, Jayesh; Guerboukha, Hichem

    2018-05-01

    In this paper, we first discuss the main types of noise in a typical pump-probe system, and then focus specifically on terahertz time domain spectroscopy (THz-TDS) setups. We then introduce four statistical models for the noisy pulses obtained in such systems, and detail rigorous mathematical algorithms to de-noise such traces, find the proper averages and characterise various types of experimental noise. Finally, we perform a comparative analysis of the performance, advantages and limitations of the algorithms by testing them on the experimental data collected using a particular THz-TDS system available in our laboratories. We conclude that using advanced statistical models for trace averaging results in fitting errors that are significantly smaller than those obtained when only a simple statistical average is used.

  14. A consistent framework for Horton regression statistics that leads to a modified Hack's law

    USGS Publications Warehouse

    Furey, P.R.; Troutman, B.M.

    2008-01-01

    A statistical framework is introduced that resolves important problems with the interpretation and use of traditional Horton regression statistics. The framework is based on a univariate regression model that leads to an alternative expression for Horton ratio, connects Horton regression statistics to distributional simple scaling, and improves the accuracy in estimating Horton plot parameters. The model is used to examine data for drainage area A and mainstream length L from two groups of basins located in different physiographic settings. Results show that confidence intervals for the Horton plot regression statistics are quite wide. Nonetheless, an analysis of covariance shows that regression intercepts, but not regression slopes, can be used to distinguish between basin groups. The univariate model is generalized to include n > 1 dependent variables. For the case where the dependent variables represent ln A and ln L, the generalized model performs somewhat better at distinguishing between basin groups than two separate univariate models. The generalized model leads to a modification of Hack's law where L depends on both A and Strahler order ω. Data show that ω plays a statistically significant role in the modified Hack's law expression. © 2008 Elsevier B.V.

  15. Analysis and modeling of wafer-level process variability in 28 nm FD-SOI using split C-V measurements

    NASA Astrophysics Data System (ADS)

    Pradeep, Krishna; Poiroux, Thierry; Scheer, Patrick; Juge, André; Gouget, Gilles; Ghibaudo, Gérard

    2018-07-01

    This work details the analysis of wafer level global process variability in 28 nm FD-SOI using split C-V measurements. The proposed approach initially evaluates the native on wafer process variability using efficient extraction methods on split C-V measurements. The on-wafer threshold voltage (VT) variability is first studied and modeled using a simple analytical model. Then, a statistical model based on the Leti-UTSOI compact model is proposed to describe the total C-V variability in different bias conditions. This statistical model is finally used to study the contribution of each process parameter to the total C-V variability.

  16. Probing the exchange statistics of one-dimensional anyon models

    NASA Astrophysics Data System (ADS)

    Greschner, Sebastian; Cardarelli, Lorenzo; Santos, Luis

    2018-05-01

    We propose feasible scenarios for revealing the modified exchange statistics in one-dimensional anyon models in optical lattices based on an extension of the multicolor lattice-depth modulation scheme introduced in [Phys. Rev. A 94, 023615 (2016), 10.1103/PhysRevA.94.023615]. We show that the fast modulation of a two-component fermionic lattice gas in the presence of a magnetic field gradient, in combination with additional resonant microwave fields, allows for the quantum simulation of hardcore anyon models with periodic boundary conditions. Such a semisynthetic ring setup allows for realizing an interferometric arrangement sensitive to the anyonic statistics. Moreover, we show that simple expansion experiments may reveal the formation of anomalously bound pairs resulting from the anyonic exchange.

  17. New statistical scission-point model to predict fission fragment observables

    NASA Astrophysics Data System (ADS)

    Lemaître, Jean-François; Panebianco, Stefano; Sida, Jean-Luc; Hilaire, Stéphane; Heinrich, Sophie

    2015-09-01

    The development of high performance computing facilities makes possible a massive production of nuclear data in a full microscopic framework. Taking advantage of the individual potential calculations of more than 7000 nuclei, a new statistical scission-point model, called SPY, has been developed. It gives access to the absolute available energy at the scission point, which allows the use of a parameter-free microcanonical statistical description to calculate the distributions and the mean values of all fission observables. SPY uses the richness of microscopy in a rather simple theoretical framework, without any parameter except the scission-point definition, to draw clear answers based on perfect knowledge of the ingredients involved in the model, with very limited computing cost.

  18. Concept of Fractal Dimension use of Multifractal Cloud Liquid Models Based on Real Data as Input to Monte Carlo Radiation Models

    NASA Technical Reports Server (NTRS)

    Wiscombe, W.

    1999-01-01

    The purpose of this paper is to discuss the concept of fractal dimension; multifractal statistics as an extension of this; the use of simple multifractal statistics (power spectrum, structure function) to characterize cloud liquid water data; and to understand the use of multifractal cloud liquid water models based on real data as input to Monte Carlo radiation models of shortwave radiation transfer in 3D clouds, and the consequences of this in two areas: the design of aircraft field programs to measure cloud absorptance; and the explanation of the famous "Landsat scale break" in measured radiance.

  19. A simple branching model that reproduces language family and language population distributions

    NASA Astrophysics Data System (ADS)

    Schwämmle, Veit; de Oliveira, Paulo Murilo Castro

    2009-07-01

    Human history leaves fingerprints in human languages. Little is known about language evolution and its study is of great importance. Here we construct a simple stochastic model and compare its results to statistical data of real languages. The model is based on the recent finding that language changes occur independently of the population size. We find agreement with the data additionally assuming that languages may be distinguished by having at least one among a finite, small number of different features. This finite set is also used in order to define the distance between two languages, similarly to linguistics tradition since Swadesh.

  20. Physical models of collective cell motility: from cell to tissue

    NASA Astrophysics Data System (ADS)

    Camley, B. A.; Rappel, W.-J.

    2017-03-01

    In this article, we review physics-based models of collective cell motility. We discuss a range of techniques at different scales, ranging from models that represent cells as simple self-propelled particles to phase field models that can represent a cell’s shape and dynamics in great detail. We also extensively review the ways in which cells within a tissue choose their direction, the statistics of cell motion, and some simple examples of how cell-cell signaling can interact with collective cell motility. This review also covers in more detail selected recent works on collective cell motion of small numbers of cells on micropatterns, in wound healing, and the chemotaxis of clusters of cells.

  1. The Two-Dimensional Gabor Function Adapted to Natural Image Statistics: A Model of Simple-Cell Receptive Fields and Sparse Structure in Images.

    PubMed

    Loxley, P N

    2017-10-01

    The two-dimensional Gabor function is adapted to natural image statistics, leading to a tractable probabilistic generative model that can be used to model simple cell receptive field profiles, or generate basis functions for sparse coding applications. Learning is found to be most pronounced in three Gabor function parameters representing the size and spatial frequency of the two-dimensional Gabor function and characterized by a nonuniform probability distribution with heavy tails. All three parameters are found to be strongly correlated, resulting in a basis of multiscale Gabor functions with similar aspect ratios and size-dependent spatial frequencies. A key finding is that the distribution of receptive-field sizes is scale invariant over a wide range of values, so there is no characteristic receptive field size selected by natural image statistics. The Gabor function aspect ratio is found to be approximately conserved by the learning rules and is therefore not well determined by natural image statistics. This allows for three distinct solutions: a basis of Gabor functions with sharp orientation resolution at the expense of spatial-frequency resolution, a basis of Gabor functions with sharp spatial-frequency resolution at the expense of orientation resolution, or a basis with unit aspect ratio. Arbitrary mixtures of all three cases are also possible. Two parameters controlling the shape of the marginal distributions in a probabilistic generative model fully account for all three solutions. The best-performing probabilistic generative model for sparse coding applications is found to be a gaussian copula with Pareto marginal probability density functions.

  2. Complex emergence patterns in a bark beetle predator

    Treesearch

    John D. Reeve

    2000-01-01

    The emergence pattern of Thanasimus dubius (F.) (Coleoptera: Cleridae), a common predator of the southern pine beetle, Dendroctonus frontalis Zimmermann (Coleoptera: Scolytidae), was studied under field conditions across different seasons. A simple statistical model was then developed...

  3. Patch-Based Generative Shape Model and MDL Model Selection for Statistical Analysis of Archipelagos

    NASA Astrophysics Data System (ADS)

    Ganz, Melanie; Nielsen, Mads; Brandt, Sami

    We propose a statistical generative shape model for archipelago-like structures. These kinds of structures occur, for instance, in medical images, where our intention is to model the appearance and shapes of calcifications in x-ray radiographs. The generative model is constructed by (1) learning a patch-based dictionary for possible shapes, (2) building up a time-homogeneous Markov model to model the neighbourhood correlations between the patches, and (3) automatic selection of the model complexity by the minimum description length principle. The generative shape model is proposed as a probability distribution of a binary image where the model is intended to facilitate sequential simulation. Our results show that a relatively simple model is able to generate structures visually similar to calcifications. Furthermore, we used the shape model as a shape prior in the statistical segmentation of calcifications, where the area overlap with the ground truth shapes improved significantly compared to the case where the prior was not used.

  4. A statistical method for measuring activation of gene regulatory networks.

    PubMed

    Esteves, Gustavo H; Reis, Luiz F L

    2018-06-13

    Gene expression data analysis is of great importance for modern molecular biology, given our ability to measure the expression profiles of thousands of genes, which enables studies rooted in systems biology. In this work, we propose a simple statistical model for measuring the activation of gene regulatory networks, instead of the traditional gene co-expression networks. We present the mathematical construction of a statistical procedure for testing hypotheses regarding gene regulatory network activation. The real probability distribution for the test statistic is evaluated by a permutation-based study. To illustrate the functionality of the proposed methodology, we also present a simple example based on a small hypothetical network and the activation measurement of two KEGG networks, both based on gene expression data collected from gastric and esophageal samples. The two KEGG networks were also analyzed for a public database, available through NCBI-GEO, presented as Supplementary Material. This method was implemented in an R package that is available at the BioConductor project website under the name maigesPack.

  5. Pattern statistics on Markov chains and sensitivity to parameter estimation

    PubMed Central

    Nuel, Grégory

    2006-01-01

    Background: In order to compute pattern statistics in computational biology a Markov model is commonly used to take into account the sequence composition. Usually its parameter must be estimated. The aim of this paper is to determine how sensitive these statistics are to parameter estimation, and what are the consequences of this variability on pattern studies (finding the most over-represented words in a genome, the most significant common words to a set of sequences,...). Results: In the particular case where pattern statistics (overlap counting only) are computed through binomial approximations we use the delta-method to give an explicit expression of σ, the standard deviation of a pattern statistic. This result is validated using simulations and a simple pattern study is also considered. Conclusion: We establish that the use of high order Markov model could easily lead to major mistakes due to the high sensitivity of pattern statistics to parameter estimation. PMID:17044916
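
    As a reminder of the technique named in the record above, the generic first-order delta method is sketched below. This is the standard textbook statement, not the explicit expression derived in the paper; the symbols S (pattern statistic), g, and \hat{\theta} (estimated Markov parameters) are assumptions introduced for illustration.

```latex
S = g(\hat{\theta}), \qquad
\sigma^2 = \operatorname{Var}(S) \;\approx\;
\nabla g(\theta)^{\top} \, \operatorname{Cov}(\hat{\theta}) \, \nabla g(\theta)
```

    The approximation shows why sensitivity grows with model order: a higher-order Markov model has more parameters, each estimated from fewer observations, so the entries of \operatorname{Cov}(\hat{\theta}) become larger.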

  6. Pattern statistics on Markov chains and sensitivity to parameter estimation.

    PubMed

    Nuel, Grégory

    2006-10-17

    In order to compute pattern statistics in computational biology a Markov model is commonly used to take into account the sequence composition. Usually its parameter must be estimated. The aim of this paper is to determine how sensitive these statistics are to parameter estimation, and what are the consequences of this variability on pattern studies (finding the most over-represented words in a genome, the most significant common words to a set of sequences,...). In the particular case where pattern statistics (overlap counting only) are computed through binomial approximations we use the delta-method to give an explicit expression of sigma, the standard deviation of a pattern statistic. This result is validated using simulations and a simple pattern study is also considered. We establish that the use of high order Markov model could easily lead to major mistakes due to the high sensitivity of pattern statistics to parameter estimation.

  7. Predicting lettuce canopy photosynthesis with statistical and neural network models

    NASA Technical Reports Server (NTRS)

    Frick, J.; Precetti, C.; Mitchell, C. A.

    1998-01-01

    An artificial neural network (NN) and a statistical regression model were developed to predict canopy photosynthetic rates (Pn) for 'Waldman's Green' leaf lettuce (Lactuca sativa L.). All data used to develop and test the models were collected for crop stands grown hydroponically and under controlled-environment conditions. In the NN and regression models, canopy Pn was predicted as a function of three independent variables: shootzone CO2 concentration (600 to 1500 micromoles mol-1), photosynthetic photon flux (PPF) (600 to 1100 micromoles m-2 s-1), and canopy age (10 to 20 days after planting). The models were used to determine the combinations of CO2 and PPF setpoints required each day to maintain maximum canopy Pn. The statistical model (a third-order polynomial) predicted Pn more accurately than the simple NN (a three-layer, fully connected net). Over an 11-day validation period, average percent difference between predicted and actual Pn was 12.3% and 24.6% for the statistical and NN models, respectively. Both models lost considerable accuracy when used to determine relatively long-range Pn predictions (≥ 6 days into the future).

  8. Rates of profit as correlated sums of random variables

    NASA Astrophysics Data System (ADS)

    Greenblatt, R. E.

    2013-10-01

    Profit realization is the dominant feature of market-based economic systems, determining their dynamics to a large extent. Rather than attaining an equilibrium, profit rates vary widely across firms, and the variation persists over time. Differing definitions of profit result in differing empirical distributions. To study the statistical properties of profit rates, I used data from a publicly available database for the US Economy for 2009-2010 (Risk Management Association). For each of three profit rate measures, the sample space consists of 771 points. Each point represents aggregate data from a small number of US manufacturing firms of similar size and type (NAICS code of principal product). When comparing the empirical distributions of profit rates, significant ‘heavy tails’ were observed, corresponding principally to a number of firms with larger profit rates than would be expected from simple models. An apparently novel correlated sum of random variables statistical model was used to model the data. In the case of operating and net profit rates, a number of firms show negative profits (losses), ruling out simple gamma or lognormal distributions as complete models for these data.

  9. Boosting Bayesian parameter inference of stochastic differential equation models with methods from statistical physics

    NASA Astrophysics Data System (ADS)

    Albert, Carlo; Ulzega, Simone; Stoop, Ruedi

    2016-04-01

    Measured time-series of both precipitation and runoff are known to exhibit highly non-trivial statistical properties. For making reliable probabilistic predictions in hydrology, it is therefore desirable to have stochastic models with output distributions that share these properties. When parameters of such models have to be inferred from data, we also need to quantify the associated parametric uncertainty. For non-trivial stochastic models, however, this latter step is typically very demanding, both conceptually and numerically, and almost never done in hydrology. Here, we demonstrate that methods developed in statistical physics make a large class of stochastic differential equation (SDE) models amenable to a full-fledged Bayesian parameter inference. For concreteness we demonstrate these methods by means of a simple yet non-trivial toy SDE model. We consider a natural catchment that can be described by a linear reservoir, at the scale of observation. All the neglected processes are assumed to happen at much shorter time-scales and are therefore modeled with a Gaussian white noise term, the standard deviation of which is assumed to scale linearly with the system state (water volume in the catchment). Even for constant input, the outputs of this simple non-linear SDE model show a wealth of desirable statistical properties, such as fat-tailed distributions and long-range correlations. Standard algorithms for Bayesian inference fail, for models of this kind, because their likelihood functions are extremely high-dimensional intractable integrals over all possible model realizations. The use of Kalman filters is illegitimate due to the non-linearity of the model. Particle filters could be used but become increasingly inefficient with a growing number of data points. Hamiltonian Monte Carlo algorithms allow us to translate this inference problem to the problem of simulating the dynamics of a statistical mechanics system and give us access to the most sophisticated methods that have been developed in the statistical physics community over the last few decades. We demonstrate that such methods, along with automated differentiation algorithms, allow us to perform a full-fledged Bayesian inference, for a large class of SDE models, in a highly efficient and largely automatized manner. Furthermore, our algorithm is highly parallelizable. For our toy model, discretized with a few hundred points, a full Bayesian inference can be performed in a matter of seconds on a standard PC.
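
    The toy model described in the record above (a linear reservoir driven by white noise whose standard deviation scales with the stored volume) can be sketched as a forward simulation. The following is a minimal Euler-Maruyama sketch, assuming the form dV = (inflow - V/k) dt + sigma * V dW; the function name, parameter names, and values are hypothetical and not taken from the paper, and the sketch covers only simulation, not the Hamiltonian Monte Carlo inference.

```python
import math
import random


def simulate_reservoir(v0, inflow, k, sigma, dt, n_steps, seed=0):
    """Euler-Maruyama path of dV = (inflow - V/k) dt + sigma * V dW.

    All names and the exact SDE form are illustrative assumptions.
    """
    rng = random.Random(seed)
    v = v0
    path = [v]
    for _ in range(n_steps):
        # Brownian increment over one time step: N(0, dt)
        dw = rng.gauss(0.0, math.sqrt(dt))
        # Drift: constant inflow minus linear outflow V/k.
        # Diffusion: noise amplitude proportional to the current volume.
        v = v + (inflow - v / k) * dt + sigma * v * dw
        # Crude guard so the discretized volume never goes negative.
        v = max(v, 0.0)
        path.append(v)
    return path


# Constant input, as in the abstract; runoff of a linear reservoir is V / k.
k = 2.0
path = simulate_reservoir(v0=1.0, inflow=1.0, k=k, sigma=0.3, dt=0.01, n_steps=5000)
runoff = [v / k for v in path]
print(min(runoff), max(runoff))
```

    With sigma set to zero the path relaxes to the deterministic steady state V = inflow * k, which gives a quick sanity check on the discretization.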

  10. Modified two-sources quantum statistical model and multiplicity fluctuation in the finite rapidity region

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ghosh, D.; Sarkar, S.; Sen, S.

    1995-06-01

    In this paper the behavior of factorial moments with rapidity window size, which is usually explained in terms of "intermittency," has been interpreted by simple quantum statistical properties of the emitting system using the concept of "modified two-source model" as recently proposed by Ghosh and Sarkar [Phys. Lett. B 278, 465 (1992)]. The analysis has been performed using our own data of ¹⁶O-Ag/Br and ²⁴Mg-Ag/Br interactions at a few tens of GeV energy regime.

  11. Valid statistical approaches for analyzing sholl data: Mixed effects versus simple linear models.

    PubMed

    Wilson, Machelle D; Sethi, Sunjay; Lein, Pamela J; Keil, Kimberly P

    2017-03-01

    The Sholl technique is widely used to quantify dendritic morphology. Data from such studies, which typically sample multiple neurons per animal, are often analyzed using simple linear models. However, simple linear models fail to account for intra-class correlation that occurs with clustered data, which can lead to faulty inferences. Mixed effects models account for intra-class correlation that occurs with clustered data; thus, these models more accurately estimate the standard deviation of the parameter estimate, which produces more accurate p-values. While mixed models are not new, their use in neuroscience has lagged behind their use in other disciplines. A review of the published literature illustrates common mistakes in analyses of Sholl data. Analysis of Sholl data collected from Golgi-stained pyramidal neurons in the hippocampus of male and female mice using both simple linear and mixed effects models demonstrates that the p-values and standard deviations obtained using the simple linear models are biased downwards and lead to erroneous rejection of the null hypothesis in some analyses. The mixed effects approach more accurately models the true variability in the data set, which leads to correct inference. Mixed effects models avoid faulty inference in Sholl analysis of data sampled from multiple neurons per animal by accounting for intra-class correlation. Given the widespread practice in neuroscience of obtaining multiple measurements per subject, there is a critical need to apply mixed effects models more widely. Copyright © 2017 Elsevier B.V. All rights reserved.
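
    The downward bias described in the record above can be illustrated with simulated clustered data. The sketch below is not a mixed effects model and does not use Sholl data; it only contrasts a standard error that treats every neuron as independent with one computed from per-animal means. All names and parameter values are hypothetical.

```python
import random
import statistics


def simulate_clustered(n_animals=6, n_neurons=20, animal_sd=2.0, neuron_sd=1.0, seed=1):
    """Return one list of measurements per animal, sharing an animal-level effect."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_animals):
        # Shared offset for every neuron from the same animal
        # (this is the source of intra-class correlation).
        animal_effect = rng.gauss(0.0, animal_sd)
        data.append(
            [10.0 + animal_effect + rng.gauss(0.0, neuron_sd) for _ in range(n_neurons)]
        )
    return data


def naive_se(data):
    """Standard error of the mean, wrongly treating all neurons as independent."""
    flat = [x for animal in data for x in animal]
    return statistics.stdev(flat) / len(flat) ** 0.5


def cluster_se(data):
    """Standard error of the mean, using one summary value per animal."""
    means = [statistics.mean(animal) for animal in data]
    return statistics.stdev(means) / len(means) ** 0.5


data = simulate_clustered()
print("naive SE:  ", naive_se(data))
print("cluster SE:", cluster_se(data))
```

    When the animal-level variance is large relative to the neuron-level variance, the naive standard error comes out smaller than the cluster-aware one, which is the mechanism behind the erroneous rejections the abstract reports.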

  12. Comparative Research Productivity Measures for Economic Departments.

    ERIC Educational Resources Information Center

    Huettner, David A.; Clark, William

    1997-01-01

    Develops a simple theoretical model to evaluate interdisciplinary differences in research productivity between economics departments and related subjects. Compares the research publishing statistics of economics, finance, psychology, geology, physics, oceanography, chemistry, and geophysics. Considers a number of factors including journal…

  13. Perspective: Sloppiness and emergent theories in physics, biology, and beyond.

    PubMed

    Transtrum, Mark K; Machta, Benjamin B; Brown, Kevin S; Daniels, Bryan C; Myers, Christopher R; Sethna, James P

    2015-07-07

    Large scale models of physical phenomena demand the development of new statistical and computational tools in order to be effective. Many such models are "sloppy," i.e., exhibit behavior controlled by a relatively small number of parameter combinations. We review an information theoretic framework for analyzing sloppy models. This formalism is based on the Fisher information matrix, which is interpreted as a Riemannian metric on a parameterized space of models. Distance in this space is a measure of how distinguishable two models are based on their predictions. Sloppy model manifolds are bounded with a hierarchy of widths and extrinsic curvatures. The manifold boundary approximation can extract the simple, hidden theory from complicated sloppy models. We attribute the success of simple effective models in physics as likewise emerging from complicated processes exhibiting a low effective dimensionality. We discuss the ramifications and consequences of sloppy models for biochemistry and science more generally. We suggest that the reason our complex world is understandable is due to the same fundamental reason: simple theories of macroscopic behavior are hidden inside complicated microscopic processes.

  14. Statistical Emulation of Climate Model Projections Based on Precomputed GCM Runs*

    DOE PAGES

    Castruccio, Stefano; McInerney, David J.; Stein, Michael L.; ...

    2014-02-24

    The authors describe a new approach for emulating the output of a fully coupled climate model under arbitrary forcing scenarios that is based on a small set of precomputed runs from the model. Temperature and precipitation are expressed as simple functions of the past trajectory of atmospheric CO2 concentrations, and a statistical model is fit using a limited set of training runs. The approach is demonstrated to be a useful and computationally efficient alternative to pattern scaling and captures the nonlinear evolution of spatial patterns of climate anomalies inherent in transient climates. The approach does as well as pattern scaling in all circumstances and substantially better in many; it is not computationally demanding; and, once the statistical model is fit, it produces emulated climate output effectively instantaneously. In conclusion, it may therefore find wide application in climate impacts assessments and other policy analyses requiring rapid climate projections.

  15. Artificial neural network study on organ-targeting peptides

    NASA Astrophysics Data System (ADS)

    Jung, Eunkyoung; Kim, Junhyoung; Choi, Seung-Hoon; Kim, Minkyoung; Rhee, Hokyoung; Shin, Jae-Min; Choi, Kihang; Kang, Sang-Kee; Lee, Nam Kyung; Choi, Yun-Jaie; Jung, Dong Hyun

    2010-01-01

    We report a new approach to studying organ targeting of peptides on the basis of peptide sequence information. The positive control data sets consist of organ-targeting peptide sequences identified by the peroral phage-display technique for four organs, and the negative control data are prepared from random sequences. The capacity of our models to make appropriate predictions is validated by statistical indicators including sensitivity, specificity, enrichment curve, and the area under the receiver operating characteristic (ROC) curve (the ROC score). VHSE descriptor produces statistically significant training models and the models with simple neural network architectures show slightly greater predictive power than those with complex ones. The training and test set statistics indicate that our models could discriminate between organ-targeting and random sequences. We anticipate that our models will be applicable to the selection of organ-targeting peptides for generating peptide drugs or peptidomimetics.

  16. On the (In)Validity of Tests of Simple Mediation: Threats and Solutions

    PubMed Central

    Pek, Jolynn; Hoyle, Rick H.

    2015-01-01

    Mediation analysis is a popular framework for identifying underlying mechanisms in social psychology. In the context of simple mediation, we review and discuss the implications of three facets of mediation analysis: (a) conceptualization of the relations between the variables, (b) statistical approaches, and (c) relevant elements of design. We also highlight the issue of equivalent models that are inherent in simple mediation. The extent to which results are meaningful stems directly from choices regarding these three facets of mediation analysis. We conclude by discussing how mediation analysis can be better applied to examine causal processes, highlight the limits of simple mediation, and make recommendations for better practice. PMID:26985234

  17. The Problem of Auto-Correlation in Parasitology

    PubMed Central

    Pollitt, Laura C.; Reece, Sarah E.; Mideo, Nicole; Nussey, Daniel H.; Colegrave, Nick

    2012-01-01

    Explaining the contribution of host and pathogen factors in driving infection dynamics is a major ambition in parasitology. There is increasing recognition that analyses based on single summary measures of an infection (e.g., peak parasitaemia) do not adequately capture infection dynamics and so, the appropriate use of statistical techniques to analyse dynamics is necessary to understand infections and, ultimately, control parasites. However, the complexities of within-host environments mean that tracking and analysing pathogen dynamics within infections and among hosts poses considerable statistical challenges. Simple statistical models make assumptions that will rarely be satisfied in data collected on host and parasite parameters. In particular, model residuals (unexplained variance in the data) should not be correlated in time or space. Here we demonstrate how failure to account for such correlations can result in incorrect biological inference from statistical analysis. We then show how mixed effects models can be used as a powerful tool to analyse such repeated measures data in the hope that this will encourage better statistical practices in parasitology. PMID:22511865

  18. Classification image analysis: estimation and statistical inference for two-alternative forced-choice experiments

    NASA Technical Reports Server (NTRS)

    Abbey, Craig K.; Eckstein, Miguel P.

    2002-01-01

    We consider estimation and statistical hypothesis testing on classification images obtained from the two-alternative forced-choice experimental paradigm. We begin with a probabilistic model of task performance for simple forced-choice detection and discrimination tasks. Particular attention is paid to general linear filter models because these models lead to a direct interpretation of the classification image as an estimate of the filter weights. We then describe an estimation procedure for obtaining classification images from observer data. A number of statistical tests are presented for testing various hypotheses from classification images based on some more compact set of features derived from them. As an example of how the methods we describe can be used, we present a case study investigating detection of a Gaussian bump profile.

  19. General Blending Models for Data From Mixture Experiments

    PubMed Central

    Brown, L.; Donev, A. N.; Bissett, A. C.

    2015-01-01

    We propose a new class of models providing a powerful unification and extension of existing statistical methodology for analysis of data obtained in mixture experiments. These models, which integrate models proposed by Scheffé and Becker, extend considerably the range of mixture component effects that may be described. They become complex when the studied phenomenon requires it, but remain simple whenever possible. This article has supplementary material online. PMID:26681812

  20. Final Report for Dynamic Models for Causal Analysis of Panel Data. Models for Change in Quantitative Variables, Part II Stochastic Models. Part II, Chapter 4.

    ERIC Educational Resources Information Center

    Hannan, Michael T.

    This document is part of a series of chapters described in SO 011 759. Stochastic models for the sociological analysis of change and the change process in quantitative variables are presented. The author lays groundwork for the statistical treatment of simple stochastic differential equations (SDEs) and discusses some of the continuities of…

  1. A simple microstructure return model explaining microstructure noise and Epps effects

    NASA Astrophysics Data System (ADS)

    Saichev, A.; Sornette, D.

    2014-01-01

    We present a novel simple microstructure model of financial returns that combines (i) the well-known ARFIMA process applied to tick-by-tick returns, (ii) the bid-ask bounce effect, (iii) the fat tail structure of the distribution of returns and (iv) the non-Poissonian statistics of inter-trade intervals. This model allows us to explain both qualitatively and quantitatively important stylized facts observed in the statistics of both microstructure and macrostructure returns, including the short-ranged correlation of returns, the long-ranged correlations of absolute returns, the microstructure noise and Epps effects. According to the microstructure noise effect, volatility is a decreasing function of the time-scale used to estimate it. The Epps effect states that cross correlations between asset returns are increasing functions of the time-scale at which the returns are estimated. The microstructure noise is explained as the result of the negative return correlations inherent in the definition of the bid-ask bounce component (ii). In the presence of a genuine correlation between the returns of two assets, the Epps effect is due to an average statistical overlap of the momentum of the returns of the two assets defined over a finite time-scale in the presence of the long memory process (i).
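
    The link between the bid-ask bounce and negative return correlations mentioned in the record above can be illustrated with a much simpler setup than the authors' model. The sketch below assumes a Roll-type bounce (a random-walk efficient price plus or minus half the spread, with each trade side drawn independently) and omits the ARFIMA, fat-tail, and non-Poissonian components; all names and values are hypothetical.

```python
import random
import statistics


def simulate_observed_prices(n_ticks=20000, spread=0.1, sigma=0.01, seed=42):
    """Observed trade prices: random-walk efficient price plus a bid-ask bounce."""
    rng = random.Random(seed)
    efficient = 0.0
    prices = []
    for _ in range(n_ticks):
        efficient += rng.gauss(0.0, sigma)   # efficient (unobserved) price step
        side = rng.choice((-1, 1))           # trade hits the bid (-1) or the ask (+1)
        prices.append(efficient + side * spread / 2.0)
    return prices


def returns(prices, step=1):
    """Non-overlapping price changes sampled every `step` ticks."""
    sampled = prices[::step]
    return [b - a for a, b in zip(sampled, sampled[1:])]


def lag1_autocovariance(xs):
    """Sample autocovariance of a series at lag one."""
    mean = statistics.mean(xs)
    return sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:])) / (len(xs) - 1)


def variance_per_tick(prices, step):
    """Return variance at a given time-scale, normalized by the time-scale."""
    return statistics.variance(returns(prices, step)) / step


prices = simulate_observed_prices()
tick_returns = returns(prices)
print("lag-1 autocovariance:", lag1_autocovariance(tick_returns))
print("variance per tick, step 1: ", variance_per_tick(prices, 1))
print("variance per tick, step 10:", variance_per_tick(prices, 10))
```

    Under these assumptions the lag-1 autocovariance of tick returns is close to -spread**2 / 4, and the variance per tick shrinks as the sampling step grows, which mirrors the statement that estimated volatility decreases with the time-scale.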

  2. Sound texture perception via statistics of the auditory periphery: Evidence from sound synthesis

    PubMed Central

    McDermott, Josh H.; Simoncelli, Eero P.

    2014-01-01

    Rainstorms, insect swarms, and galloping horses produce “sound textures” – the collective result of many similar acoustic events. Sound textures are distinguished by temporal homogeneity, suggesting they could be recognized with time-averaged statistics. To test this hypothesis, we processed real-world textures with an auditory model containing filters tuned for sound frequencies and their modulations, and measured statistics of the resulting decomposition. We then assessed the realism and recognizability of novel sounds synthesized to have matching statistics. Statistics of individual frequency channels, capturing spectral power and sparsity, generally failed to produce compelling synthetic textures. However, combining them with correlations between channels produced identifiable and natural-sounding textures. Synthesis quality declined if statistics were computed from biologically implausible auditory models. The results suggest that sound texture perception is mediated by relatively simple statistics of early auditory representations, presumably computed by downstream neural populations. The synthesis methodology offers a powerful tool for their further investigation. PMID:21903084

  3. Combining Statistics and Physics to Improve Climate Downscaling

    NASA Astrophysics Data System (ADS)

    Gutmann, E. D.; Eidhammer, T.; Arnold, J.; Nowak, K.; Clark, M. P.

    2017-12-01

    Getting useful information from climate models is an ongoing problem that has plagued climate science and hydrologic prediction for decades. While it is possible to develop statistical corrections for climate models that mimic current climate almost perfectly, this does not necessarily guarantee that future changes are portrayed correctly. In contrast, convection permitting regional climate models (RCMs) have begun to provide an excellent representation of the regional climate system purely from first principles, providing greater confidence in their change signal. However, the computational cost of such RCMs prohibits the generation of ensembles of simulations or long time periods, thus limiting their applicability for hydrologic applications. Here we discuss a new approach combining statistical corrections with physical relationships for a modest computational cost. We have developed the Intermediate Complexity Atmospheric Research model (ICAR) to provide a climate and weather downscaling option that is based primarily on physics for a fraction of the computational requirements of a traditional regional climate model. ICAR also enables the incorporation of statistical adjustments directly within the model. We demonstrate that applying even simple corrections to precipitation while the model is running can improve the simulation of land atmosphere feedbacks in ICAR. For example, by incorporating statistical corrections earlier in the modeling chain, we permit the model physics to better represent the effect of mountain snowpack on air temperature changes.

  4. Estimation of critical behavior from the density of states in classical statistical models

    NASA Astrophysics Data System (ADS)

    Malakis, A.; Peratzakis, A.; Fytas, N. G.

    2004-12-01

    We present a simple and efficient approximation scheme which greatly facilitates the extension of Wang-Landau sampling (or similar techniques) in large systems for the estimation of critical behavior. The method, presented in an algorithmic approach, is based on a very simple idea, familiar in statistical mechanics from the notion of thermodynamic equivalence of ensembles and the central limit theorem. It is illustrated that we can predict with high accuracy the critical part of the energy space and by using this restricted part we can extend our simulations to larger systems and improve the accuracy of critical parameters. It is proposed that the extensions of the finite-size critical part of the energy space, determining the specific heat, satisfy a scaling law involving the thermal critical exponent. The method is applied successfully for the estimation of the scaling behavior of specific heat of both square and simple cubic Ising lattices. The proposed scaling law is verified by estimating the thermal critical exponent from the finite-size behavior of the critical part of the energy space. The density of states of the zero-field Ising model on these lattices is obtained via a multirange Wang-Landau sampling.

  5. Evolution of cosmic string networks

    NASA Technical Reports Server (NTRS)

    Albrecht, Andreas; Turok, Neil

    1989-01-01

    A discussion of the evolution and observable consequences of a network of cosmic strings is given. A simple model for the evolution of the string network is presented, and related to the statistical mechanics of string networks. The model predicts the long string density throughout the history of the universe from a single parameter, which researchers calculate in radiation era simulations. The statistical mechanics arguments indicate a particular thermal form for the spectrum of loops chopped off the network. Detailed numerical simulations of string networks in expanding backgrounds are performed to test the model. Consequences for large scale structure, the microwave and gravity wave backgrounds, nucleosynthesis and gravitational lensing are calculated.

  6. Using the Graded Response Model to Control Spurious Interactions in Moderated Multiple Regression

    ERIC Educational Resources Information Center

    Morse, Brendan J.; Johanson, George A.; Griffeth, Rodger W.

    2012-01-01

    Recent simulation research has demonstrated that using simple raw scores to operationalize a latent construct can result in inflated Type I error rates for the interaction term of a moderated statistical model when the interaction (or lack thereof) is proposed at the latent variable level. Rescaling the scores using an appropriate item response…

  7. A Simple Effect Size Estimator for Single Case Designs Using WinBUGS

    ERIC Educational Resources Information Center

    Rindskopf, David; Shadish, William; Hedges, Larry V.

    2012-01-01

    This conference presentation demonstrates a multilevel model for analyzing single case designs. The model is implemented in the Bayesian program WinBUGS. The authors show how it is possible to estimate a d-statistic like the one in Hedges, Pustejovsky and Shadish (2012) in this program. Results are demonstrated on an example.

  8. Quantifying and Testing Indirect Effects in Simple Mediation Models when the Constituent Paths Are Nonlinear

    ERIC Educational Resources Information Center

    Hayes, Andrew F.; Preacher, Kristopher J.

    2010-01-01

    Most treatments of indirect effects and mediation in the statistical methods literature and the corresponding methods used by behavioral scientists have assumed linear relationships between variables in the causal system. Here we describe and extend a method first introduced by Stolzenberg (1980) for estimating indirect effects in models of…

  9. Helping Students Assess the Relative Importance of Different Intermolecular Interactions

    ERIC Educational Resources Information Center

    Jasien, Paul G.

    2008-01-01

    A semi-quantitative model has been developed to estimate the relative effects of dispersion, dipole-dipole interactions, and H-bonding on the normal boiling points ("T[subscript b]") for a subset of simple organic systems. The model is based upon a statistical analysis using multiple linear regression on a series of straight-chain organic…

  10. Upgrades to the REA method for producing probabilistic climate change projections

    NASA Astrophysics Data System (ADS)

    Xu, Ying; Gao, Xuejie; Giorgi, Filippo

    2010-05-01

    We present an augmented version of the Reliability Ensemble Averaging (REA) method designed to generate probabilistic climate change information from ensembles of climate model simulations. Compared to the original version, the augmented one includes consideration of multiple variables and statistics in the calculation of the performance-based weights. In addition, the model convergence criterion previously employed is removed. The method is applied to the calculation of changes in mean and variability for temperature and precipitation over different sub-regions of East Asia based on the recently completed CMIP3 multi-model ensemble. Comparison of the new and old REA methods, along with the simple averaging procedure, and the use of different combinations of performance metrics shows that at fine sub-regional scales the choice of weighting is relevant. This is mostly because the models show a substantial spread in performance for the simulation of precipitation statistics, a result that supports the use of model weighting as a useful option to account for wide ranges of quality of models. The REA method, and in particular the upgraded one, provides a simple and flexible framework for assessing the uncertainty related to the aggregation of results from ensembles of models in order to produce climate change information at the regional scale. KEY WORDS: REA method, Climate change, CMIP3

  11. A methodology for the design of experiments in computational intelligence with multiple regression models.

    PubMed

    Fernandez-Lozano, Carlos; Gestal, Marcos; Munteanu, Cristian R; Dorado, Julian; Pazos, Alejandro

    2016-01-01

    The design of experiments and the validation of the results achieved with them are vital in any research study. This paper focuses on the use of different Machine Learning approaches for regression tasks in the field of Computational Intelligence and especially on a correct comparison between the different results provided for different methods, as those techniques are complex systems that require further study to be fully understood. A methodology commonly accepted in Computational Intelligence is implemented in an R package called RRegrs. This package includes ten simple and complex regression models to carry out predictive modeling using Machine Learning and well-known regression algorithms. The framework for experimental design presented herein is evaluated and validated against RRegrs. Our results are different for three out of five state-of-the-art simple datasets and it can be stated that the selection of the best model according to our proposal is statistically significant and relevant. It is of relevance to use a statistical approach to indicate whether the differences are statistically significant when using this kind of algorithm. Furthermore, our results with three real complex datasets report different best models than with the previously published methodology. Our final goal is to provide a complete methodology for the use of different steps in order to compare the results obtained in Computational Intelligence problems, as well as in other fields, such as bioinformatics, cheminformatics, etc., given that our proposal is open and modifiable.

  12. A methodology for the design of experiments in computational intelligence with multiple regression models

    PubMed Central

    Fernandez-Lozano, Carlos; Gestal, Marcos; Munteanu, Cristian R.; Dorado, Julian; Pazos, Alejandro

    2016-01-01

    The design of experiments and the validation of the results achieved with them are vital in any research study. This paper focuses on the use of different Machine Learning approaches for regression tasks in the field of Computational Intelligence and especially on a correct comparison between the different results provided for different methods, as those techniques are complex systems that require further study to be fully understood. A methodology commonly accepted in Computational Intelligence is implemented in an R package called RRegrs. This package includes ten simple and complex regression models to carry out predictive modeling using Machine Learning and well-known regression algorithms. The framework for experimental design presented herein is evaluated and validated against RRegrs. Our results are different for three out of five state-of-the-art simple datasets and it can be stated that the selection of the best model according to our proposal is statistically significant and relevant. It is of relevance to use a statistical approach to indicate whether the differences are statistically significant when using this kind of algorithm. Furthermore, our results with three real complex datasets report different best models than with the previously published methodology. Our final goal is to provide a complete methodology for the use of different steps in order to compare the results obtained in Computational Intelligence problems, as well as in other fields, such as bioinformatics, cheminformatics, etc., given that our proposal is open and modifiable. PMID:27920952

  13. Probabilistic Evaluation of Competing Climate Models

    NASA Astrophysics Data System (ADS)

    Braverman, A. J.; Chatterjee, S.; Heyman, M.; Cressie, N.

    2017-12-01

    A standard paradigm for assessing the quality of climate model simulations is to compare what these models produce for past and present time periods, to observations of the past and present. Many of these comparisons are based on simple summary statistics called metrics. Here, we propose an alternative: evaluation of competing climate models through probabilities derived from tests of the hypothesis that climate-model-simulated and observed time sequences share common climate-scale signals. The probabilities are based on the behavior of summary statistics of climate model output and observational data, over ensembles of pseudo-realizations. These are obtained by partitioning the original time sequences into signal and noise components, and using a parametric bootstrap to create pseudo-realizations of the noise sequences. The statistics we choose come from working in the space of decorrelated and dimension-reduced wavelet coefficients. We compare monthly sequences of CMIP5 model output of average global near-surface temperature anomalies to similar sequences obtained from the well-known HadCRUT4 data set, as an illustration.

  14. Two Simple Models for Fracking

    NASA Astrophysics Data System (ADS)

    Norris, Jaren Quinn

    Recent developments in fracking have enabled the recovery of oil and gas from tight shale reservoirs. These developments have also made fracking one of the most controversial environmental issues in the United States. Despite the growing controversy surrounding fracking, there is relatively little publicly available research. This dissertation introduces two simple models for fracking that were developed using techniques from non-linear and statistical physics. The first model assumes that the volume of induced fractures must be equal to the volume of injected fluid. For simplicity, these fractures are assumed to form a spherically symmetric damage region around the borehole. The predicted volumes of water necessary to create a damage region with a given radius are in good agreement with reported values. The second model is a modification of invasion percolation which was previously introduced to model water flooding. The reservoir rock is represented by a regular lattice of local traps that contain oil and/or gas separated by rock barriers. The barriers are assumed to be highly heterogeneous and are assigned random strengths. Fluid is injected from a central site and the weakest rock barrier breaks allowing fluid to flow into the adjacent site. The process repeats with the weakest barrier breaking and fluid flowing to an adjacent site each time step. Extensive numerical simulations were carried out to obtain statistical properties of the growing fracture network. The network was found to be fractal with fractal dimensions differing slightly from the accepted values for traditional percolation. Additionally, the network follows Horton-Strahler and Tokunaga branching statistics which have been used to characterize river networks. As with other percolation models, the growth of the network occurs in bursts. These bursts follow a power-law size distribution similar to observed microseismic events. Reservoir stress anisotropy is incorporated into the model by assigning horizontal bonds weaker strengths on average than vertical bonds. Numerical simulations show that increasing bond strength anisotropy tends to reduce the fractal dimension of the growing fracture network, and decrease the power-law slope of the burst size distribution. Although simple, these two models are useful for making informed decisions about fracking.
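
    The invasion-percolation rule described in this abstract (inject at a central site, then repeatedly break the weakest barrier on the boundary of the invaded region) can be sketched as follows. This is only an illustrative sketch, not the dissertation's code: the square lattice, the uniform random strengths, and the names `invasion_percolation`, `size`, `steps` and `seed` are all assumptions, and stress anisotropy is left out.

```python
import heapq
import random


def invasion_percolation(size=41, steps=200, seed=0):
    """Return the sites invaded, in order, on a size x size square lattice.

    Assumptions (not from the source): barrier strengths are uniform on
    [0, 1), the lattice is square, and a barrier leading to an already
    invaded site is never broken.
    """
    rng = random.Random(seed)
    center = (size // 2, size // 2)
    invaded = {center}
    order = [center]
    frontier = []  # min-heap of (barrier strength, site behind the barrier)

    def push_barriers(site):
        x, y = site
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < size and 0 <= ny < size and (nx, ny) not in invaded:
                # Each barrier is reached once, so drawing its strength here
                # is equivalent to assigning all strengths up front.
                heapq.heappush(frontier, (rng.random(), (nx, ny)))

    push_barriers(center)
    while frontier and len(order) <= steps:
        _, site = heapq.heappop(frontier)  # weakest barrier on the boundary
        if site in invaded:
            continue  # stale entry: the site was reached another way
        invaded.add(site)
        order.append(site)
        push_barriers(site)
    return order
```

    Calling `invasion_percolation(size=41, steps=200)` returns the injection site followed by 200 invaded sites; statistics such as burst sizes would have to be computed from the recorded barrier strengths, which this sketch does not keep.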

  15. On-line estimation of error covariance parameters for atmospheric data assimilation

    NASA Technical Reports Server (NTRS)

    Dee, Dick P.

    1995-01-01

    A simple scheme is presented for on-line estimation of covariance parameters in statistical data assimilation systems. The scheme is based on a maximum-likelihood approach in which estimates are produced on the basis of a single batch of simultaneous observations. Single-sample covariance estimation is reasonable as long as the number of available observations exceeds the number of tunable parameters by two or three orders of magnitude. Not much is known at present about model error associated with actual forecast systems. Our scheme can be used to estimate some important statistical model error parameters such as regionally averaged variances or characteristic correlation length scales. The advantage of the single-sample approach is that it does not rely on any assumptions about the temporal behavior of the covariance parameters: time-dependent parameter estimates can be continuously adjusted on the basis of current observations. This is of practical importance since it is likely to be the case that both model error and observation error strongly depend on the actual state of the atmosphere. The single-sample estimation scheme can be incorporated into any four-dimensional statistical data assimilation system that involves explicit calculation of forecast error covariances, including optimal interpolation (OI) and the simplified Kalman filter (SKF). The computational cost of the scheme is high but not prohibitive; on-line estimation of one or two covariance parameters in each analysis box of an operational boxed-OI system is currently feasible. A number of numerical experiments performed with an adaptive SKF and an adaptive version of OI, using a linear two-dimensional shallow-water model and artificially generated model error, are described. The performance of the nonadaptive versions of these methods turns out to depend rather strongly on correct specification of model error parameters. These parameters are estimated under a variety of conditions, including uniformly distributed model error and time-dependent model error statistics.

  16. Improving UWB-Based Localization in IoT Scenarios with Statistical Models of Distance Error.

    PubMed

    Monica, Stefania; Ferrari, Gianluigi

    2018-05-17

    Interest in the Internet of Things (IoT) is rapidly increasing, as the number of connected devices is exponentially growing. One of the application scenarios envisaged for IoT technologies involves indoor localization and context awareness. In this paper, we focus on a localization approach that relies on a particular type of communication technology, namely Ultra Wide Band (UWB). UWB technology is an attractive choice for indoor localization, owing to its high accuracy. Since localization algorithms typically rely on estimated inter-node distances, the goal of this paper is to evaluate the improvement brought by a simple (linear) statistical model of the distance error. On the basis of an extensive experimental measurement campaign, we propose a general analytical framework, based on a Least Square (LS) method, to derive a novel statistical model for the range estimation error between a pair of UWB nodes. The proposed statistical model is then applied to improve the performance of a few illustrative localization algorithms in various realistic scenarios. The obtained experimental results show that the use of the proposed statistical model improves the accuracy of the considered localization algorithms with a reduction of the localization error up to 66%.
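
    As a rough illustration of how a linear statistical model of the distance error could be fitted by least squares and then used to correct range estimates, here is a minimal sketch. It assumes the error grows linearly with the true distance (error = slope * d + intercept); that form, the function names and the calibration numbers are assumptions for illustration, not the paper's model or data.

```python
def fit_linear_error_model(true_d, measured_d):
    """Ordinary least-squares fit of error = slope * true_d + intercept."""
    n = len(true_d)
    errors = [m - t for t, m in zip(true_d, measured_d)]
    mean_t = sum(true_d) / n
    mean_e = sum(errors) / n
    sxx = sum((t - mean_t) ** 2 for t in true_d)
    sxy = sum((t - mean_t) * (e - mean_e) for t, e in zip(true_d, errors))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_t
    return slope, intercept


def correct_range(measured_d, slope, intercept):
    """Invert measured = true + slope * true + intercept for the true distance."""
    return (measured_d - intercept) / (1.0 + slope)


# Hypothetical calibration set: ranges overestimated by 5% plus 0.2 m.
true_d = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
measured_d = [1.05 * d + 0.2 for d in true_d]
slope, intercept = fit_linear_error_model(true_d, measured_d)
corrected = [correct_range(m, slope, intercept) for m in measured_d]
```

    The corrected ranges would then replace the raw ones as input to whichever localization algorithm is in use.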

  17. Statistics of acoustic emissions and stress drops during granular shearing using a stick-slip fiber bundle model

    NASA Astrophysics Data System (ADS)

    Cohen, D.; Michlmayr, G.; Or, D.

    2012-04-01

    Shearing of dense granular materials appears in many engineering and Earth sciences applications. Under a constant strain rate, the shearing stress at steady state oscillates with slow rises followed by rapid drops that are linked to the build up and failure of force chains. Experiments indicate that these drops display exponential statistics. Measurements of acoustic emissions during shearing indicate that the energy liberated by failure of these force chains has power-law statistics. Representing force chains as fibers, we use a stick-slip fiber bundle model to obtain analytical solutions of the statistical distribution of stress drops and failure energy. In the model, fibers stretch, fail, and regain strength during deformation. Fibers have Weibull-distributed threshold strengths with either quenched or annealed disorder. The shapes of the distributions for drops and energy obtained from the model are similar to those measured during shearing experiments. This simple model may be useful to identify failure events linked to force chain failures. Future generalizations of the model that include different types of fiber failure may also allow identification of different types of granular failures that have distinct statistical acoustic emission signatures.

  18. Interpretation of commonly used statistical regression models.

    PubMed

    Kasza, Jessica; Wolfe, Rory

    2014-01-01

    A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.

  19. A simple model for research interest evolution patterns

    NASA Astrophysics Data System (ADS)

    Jia, Tao; Wang, Dashun; Szymanski, Boleslaw

    Sir Isaac Newton supposedly remarked that in his scientific career he was like "...a boy playing on the sea-shore ...finding a smoother pebble or a prettier shell than ordinary". His remarkable modesty and famous understatement motivate us to seek regularities in how scientists shift their research focus as their careers develop. Indeed, despite intensive investigations on how microscopic factors, such as incentives and risks, would influence a scientist's choice of research agenda, little is known about the macroscopic patterns in the research interest change undertaken by individual scientists throughout their careers. Here we make use of over 14,000 authors' publication records in physics. By quantifying statistical characteristics in the interest evolution, we model scientific research as a random walk, which reproduces patterns in individuals' careers observed empirically. Despite the myriad factors that shape and influence individual choices of research subjects, we identified regularities in this dynamical process that are well captured by a simple statistical model. The results advance our understanding of scientists' behaviors during their careers and open up avenues for future studies in the science of science.

  20. Conditional statistical inference with multistage testing designs.

    PubMed

    Zwitser, Robert J; Maris, Gunter

    2015-03-01

    In this paper it is demonstrated how statistical inference from multistage test designs can be made based on the conditional likelihood. Special attention is given to parameter estimation, as well as the evaluation of model fit. Two reasons are provided why the fit of simple measurement models is expected to be better in adaptive designs, compared to linear designs: more parameters are available for the same number of observations; and undesirable response behavior, like slipping and guessing, might be avoided owing to a better match between item difficulty and examinee proficiency. The results are illustrated with simulated data, as well as with real data.

  1. Millimeter wave attenuation prediction using a piecewise uniform rain rate model

    NASA Technical Reports Server (NTRS)

    Persinger, R. R.; Stutzman, W. L.; Bostian, C. W.; Castle, R. E., Jr.

    1980-01-01

    A piecewise uniform rain rate distribution model is introduced as a quasi-physical model of real rain along earth-space millimeter wave propagation paths. It permits calculation of the total attenuation from specific attenuation in a simple fashion. The model predictions are verified by comparison with direct attenuation measurements for several frequencies, elevation angles, and locations. Also, coupled with the Rice-Holmberg rain rate model, attenuation statistics are predicted from rainfall accumulation data.

  2. Liquid-liquid critical point in a simple analytical model of water.

    PubMed

    Urbic, Tomaz

    2016-10-01

    A statistical model for a simple three-dimensional Mercedes-Benz model of water was used to study phase diagrams. This model on a simple level describes the thermal and volumetric properties of waterlike molecules. A molecule is presented as a soft sphere with four directions in which hydrogen bonds can be formed. Two neighboring waters can interact through a van der Waals interaction or an orientation-dependent hydrogen-bonding interaction. For pure water, we explored properties such as molar volume, density, heat capacity, thermal expansion coefficient, and isothermal compressibility and found that the volumetric and thermal properties follow the same trends with temperature as in real water and are in good general agreement with Monte Carlo simulations. The model also exhibits two critical points, one for the liquid-gas transition and one for the transition between low-density and high-density fluid. Coexistence curves and a Widom line for the maximum and minimum in thermal expansion coefficient divide the phase space of the model into three parts: the first is a gas region, the second a high-density liquid, and the third a low-density liquid.

  3. Liquid-liquid critical point in a simple analytical model of water

    NASA Astrophysics Data System (ADS)

    Urbic, Tomaz

    2016-10-01

    A statistical model for a simple three-dimensional Mercedes-Benz model of water was used to study phase diagrams. This model on a simple level describes the thermal and volumetric properties of waterlike molecules. A molecule is presented as a soft sphere with four directions in which hydrogen bonds can be formed. Two neighboring waters can interact through a van der Waals interaction or an orientation-dependent hydrogen-bonding interaction. For pure water, we explored properties such as molar volume, density, heat capacity, thermal expansion coefficient, and isothermal compressibility and found that the volumetric and thermal properties follow the same trends with temperature as in real water and are in good general agreement with Monte Carlo simulations. The model also exhibits two critical points, one for the liquid-gas transition and one for the transition between low-density and high-density fluid. Coexistence curves and a Widom line for the maximum and minimum in thermal expansion coefficient divide the phase space of the model into three parts: the first is a gas region, the second a high-density liquid, and the third a low-density liquid.

  4. A simple hydrodynamic model of tornado-like vortices

    NASA Astrophysics Data System (ADS)

    Kurgansky, M. V.

    2015-05-01

    Based on similarity arguments, a simple fluid dynamic model of tornado-like vortices is offered that, with account for "vortex breakdown" at a certain height above the ground, relates the maximal azimuthal velocity in the vortex, reachable near the ground surface, to the convective available potential energy (CAPE) stored in the environmental atmosphere under pre-tornado conditions. The relative proportion of the helicity (kinetic energy) destruction (dissipation) in the "vortex breakdown" zone and, accordingly, within the surface boundary layer beneath the vortex is evaluated. These considerations form the basis of the dynamic-statistical analysis of the relationship between the tornado intensity and the CAPE budget in the surrounding atmosphere.

  5. Ignoring the Innocent: Non-combatants in Urban Operations and in Military Models and Simulations

    DTIC Science & Technology

    2006-01-01

    such a model yields is a sufficiency theorem, a single run does not provide any information on the robustness of such theorems. That is, given that...often formally resolvable via inspection, simple differentiation, the implicit function theorem, comparative statistics, and so on. The only way to... Pythagoras, and Bactowars. For each, Grieger discusses model parameters, data collection, terrain, and other features. Grieger also discusses

  6. Prediction of the dollar to the ruble rate. A system-theoretic approach

    NASA Astrophysics Data System (ADS)

    Borodachev, Sergey M.

    2017-07-01

    A simple state-space model of dollar rate formation is proposed, based on changes in oil prices and some mechanisms of money transfer between monetary and stock markets. Predictions from an input-output model and from the state-space model are compared. It is concluded that, with proper use of statistical data (Kalman filter), the second approach provides more adequate predictions of the dollar rate.

  7. A model for two-dimensional bursty turbulence in magnetized plasmas

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Servidio, Sergio; Primavera, Leonardo; Carbone, Vincenzo

    2008-01-15

    The nonlinear dynamics of two-dimensional electrostatic interchange modes in a magnetized plasma is investigated through a simple model that replaces the instability mechanism due to magnetic field curvature by an external source of vorticity and mass. Simulations in a cylindrical domain, with a spatially localized and randomized source at the center of the domain, reveal the eruption of mushroom-shaped bursts that propagate radially and are absorbed by the boundaries. Burst sizes and the interburst waiting times exhibit power-law statistics, which indicates long-range interburst correlations, similar to what has been found in sandpile models for avalanching systems. It is shown from the simulations that the dynamics can be characterized by a Yaglom relation for the third-order mixed moment involving the particle number density as a passive scalar and the ExB drift velocity, and hence that the burst phenomenology can be described within the framework of turbulence theory. Statistical features are qualitatively in agreement with experiments of intermittent transport at the edge of plasma devices, and suggest that essential features such as transport can be described by this simple model of bursty turbulence.

  8. Energy Savings Analysis for Energy Monitoring and Control Systems

    DTIC Science & Technology

    1995-01-01

    for evaluating design and construction a:-0 quality, and for studying the effectiveness of air-tightening AC retrofits. No simple relationship...Energy These models of residential infiltration are based on statistical "Resource Center (1983) include information on air tightening in fits of

  9. The coalescent process in models with selection and recombination.

    PubMed

    Hudson, R R; Kaplan, N L

    1988-11-01

    The statistical properties of the process describing the genealogical history of a random sample of genes at a selectively neutral locus which is linked to a locus at which natural selection operates are investigated. It is found that the equations describing this process are simple modifications of the equations describing the process assuming that the two loci are completely linked. Thus, the statistical properties of the genealogical process for a random sample at a neutral locus linked to a locus with selection follow from the results obtained for the selected locus. Sequence data from the alcohol dehydrogenase (Adh) region of Drosophila melanogaster are examined and compared to predictions based on the theory. It is found that the spatial distribution of nucleotide differences between Fast and Slow alleles of Adh is very similar to the spatial distribution predicted if balancing selection operates to maintain the allozyme variation at the Adh locus. The spatial distribution of nucleotide differences between different Slow alleles of Adh does not match the predictions of this simple model very well.

  10. Bootstrap Methods: A Very Leisurely Look.

    ERIC Educational Resources Information Center

    Hinkle, Dennis E.; Winstead, Wayland H.

    The Bootstrap method, a computer-intensive statistical method of estimation, is illustrated using a simple and efficient Statistical Analysis System (SAS) routine. The utility of the method for generating unknown parameters, including standard errors for simple statistics, regression coefficients, discriminant function coefficients, and factor…
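
    The record above illustrates the method with a SAS routine that is not reproduced here. As a generic sketch of the resampling idea only (written in Python rather than SAS, with hypothetical names `bootstrap_se`, `n_boot` and `seed`), a bootstrap standard error for a simple statistic can be estimated like this:

```python
import random
import statistics


def bootstrap_se(data, stat=statistics.mean, n_boot=2000, seed=0):
    """Estimate the standard error of stat(data) by resampling with replacement."""
    rng = random.Random(seed)
    n = len(data)
    replicates = [
        stat([rng.choice(data) for _ in range(n)])  # one bootstrap sample
        for _ in range(n_boot)
    ]
    # The spread of the replicated statistic approximates its standard error.
    return statistics.stdev(replicates)


sample = list(range(1, 21))  # hypothetical data, for illustration only
se_mean = bootstrap_se(sample)
se_median = bootstrap_se(sample, stat=statistics.median)
```

    For the mean, the result should land close to the textbook value s / sqrt(n); the appeal of the bootstrap is that the same call works for statistics, such as the median, that lack a simple standard-error formula.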

  11. Evaluating bacterial gene-finding HMM structures as probabilistic logic programs.

    PubMed

    Mørk, Søren; Holmes, Ian

    2012-03-01

    Probabilistic logic programming offers a powerful way to describe and evaluate structured statistical models. To investigate the practicality of probabilistic logic programming for structure learning in bioinformatics, we undertook a simplified bacterial gene-finding benchmark in PRISM, a probabilistic dialect of Prolog. We evaluate Hidden Markov Model structures for bacterial protein-coding gene potential, including a simple null model structure, three structures based on existing bacterial gene finders and two novel model structures. We test standard versions as well as ADPH length modeling and three-state versions of the five model structures. The models are all represented as probabilistic logic programs and evaluated using the PRISM machine learning system in terms of statistical information criteria and gene-finding prediction accuracy, in two bacterial genomes. Neither of our implementations of the two currently most used model structures is best performing in terms of statistical information criteria or prediction performance, suggesting that better-fitting models might be achievable. The source code of all PRISM models, data and additional scripts are freely available for download at: http://github.com/somork/codonhmm. Supplementary data are available at Bioinformatics online.

  12. Applying the compound Poisson process model to the reporting of injury-related mortality rates.

    PubMed

    Kegler, Scott R

    2007-02-16

    Injury-related mortality rate estimates are often analyzed under the assumption that case counts follow a Poisson distribution. Certain types of injury incidents occasionally involve multiple fatalities, however, resulting in dependencies between cases that are not reflected in the simple Poisson model and which can affect even basic statistical analyses. This paper explores the compound Poisson process model as an alternative, emphasizing adjustments to some commonly used interval estimators for population-based rates and rate ratios. The adjusted estimators involve relatively simple closed-form computations, which in the absence of multiple-case incidents reduce to familiar estimators based on the simpler Poisson model. Summary data from the National Violent Death Reporting System are referenced in several examples demonstrating application of the proposed methodology.
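
    To see why multiple-fatality incidents matter, recall that if incidents arrive as a Poisson process and incident i causes k_i deaths, the variance of the total death count is estimated by the sum of k_i^2 rather than by the count itself. The sketch below compares the two resulting normal-approximation intervals for a rate. It is an illustration under that assumption, not the paper's exact estimators; the function name and the incident data are hypothetical.

```python
import math


def rate_intervals(incident_sizes, population, z=1.96):
    """Return (rate, simple Poisson CI, compound Poisson CI) for a death rate.

    incident_sizes holds the number of deaths in each incident. The intervals
    are Wald-type (normal approximation), chosen here for simplicity.
    """
    deaths = sum(incident_sizes)
    rate = deaths / population
    # Simple Poisson: Var(deaths) is estimated by the death count.
    se_poisson = math.sqrt(deaths) / population
    # Compound Poisson: Var(deaths) is estimated by the sum of squared sizes.
    se_compound = math.sqrt(sum(k * k for k in incident_sizes)) / population
    poisson_ci = (rate - z * se_poisson, rate + z * se_poisson)
    compound_ci = (rate - z * se_compound, rate + z * se_compound)
    return rate, poisson_ci, compound_ci


# Hypothetical data: 90 single-fatality incidents and 5 double-fatality ones.
rate, poisson_ci, compound_ci = rate_intervals([1] * 90 + [2] * 5, 1_000_000)
```

    When every incident involves exactly one death, the sum of squares equals the count and the two intervals coincide, which matches the reduction to the simpler Poisson model described in the abstract.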

  13. A Simple Simulation Technique for Nonnormal Data with Prespecified Skewness, Kurtosis, and Covariance Matrix.

    PubMed

    Foldnes, Njål; Olsson, Ulf Henning

    2016-01-01

    We present and investigate a simple way to generate nonnormal data using linear combinations of independent generator (IG) variables. The simulated data have prespecified univariate skewness and kurtosis and a given covariance matrix. In contrast to the widely used Vale-Maurelli (VM) transform, the obtained data are shown to have a non-Gaussian copula. We analytically obtain asymptotic robustness conditions for the IG distribution. We show empirically that popular test statistics in covariance analysis tend to reject true models more often under the IG transform than under the VM transform. This implies that overly optimistic evaluations of estimators and fit statistics in covariance structure analysis may be tempered by including the IG transform for nonnormal data generation. We provide an implementation of the IG transform in the R environment.

  14. A statistical model of brittle fracture by transgranular cleavage

    NASA Astrophysics Data System (ADS)

    Lin, Tsann; Evans, A. G.; Ritchie, R. O.

    A model for brittle fracture by transgranular cleavage cracking is presented based on the application of weakest link statistics to the critical microstructural fracture mechanisms. The model permits prediction of the macroscopic fracture toughness, K_Ic, in single phase microstructures containing a known distribution of particles, and defines the critical distance from the crack tip at which the initial cracking event is most probable. The model is developed for unstable fracture ahead of a sharp crack considering both linear elastic and nonlinear elastic ("elastic/plastic") crack tip stress fields. Predictions are evaluated by comparison with experimental results on the low temperature flow and fracture behavior of a low carbon mild steel with a simple ferrite/grain boundary carbide microstructure.

  15. Matrix population models from 20 studies of perennial plant populations

    USGS Publications Warehouse

    Ellis, Martha M.; Williams, Jennifer L.; Lesica, Peter; Bell, Timothy J.; Bierzychudek, Paulette; Bowles, Marlin; Crone, Elizabeth E.; Doak, Daniel F.; Ehrlen, Johan; Ellis-Adam, Albertine; McEachern, Kathryn; Ganesan, Rengaian; Latham, Penelope; Luijten, Sheila; Kaye, Thomas N.; Knight, Tiffany M.; Menges, Eric S.; Morris, William F.; den Nijs, Hans; Oostermeijer, Gerard; Quintana-Ascencio, Pedro F.; Shelly, J. Stephen; Stanley, Amanda; Thorpe, Andrea; Ticktin, Tamara; Valverde, Teresa; Weekley, Carl W.

    2012-01-01

    Demographic transition matrices are one of the most commonly applied population models for both basic and applied ecological research. The relatively simple framework of these models and simple, easily interpretable summary statistics they produce have prompted the wide use of these models across an exceptionally broad range of taxa. Here, we provide annual transition matrices and observed stage structures/population sizes for 20 perennial plant species which have been the focal species for long-term demographic monitoring. These data were assembled as part of the "Testing Matrix Models" working group through the National Center for Ecological Analysis and Synthesis (NCEAS). In sum, these data represent 82 populations with >460 total population-years of data. It is our hope that making these data available will help promote and improve our ability to monitor and understand plant population dynamics.

  16. Matrix population models from 20 studies of perennial plant populations

    USGS Publications Warehouse

    Ellis, Martha M.; Williams, Jennifer L.; Lesica, Peter; Bell, Timothy J.; Bierzychudek, Paulette; Bowles, Marlin; Crone, Elizabeth E.; Doak, Daniel F.; Ehrlen, Johan; Ellis-Adam, Albertine; McEachern, Kathryn; Ganesan, Rengaian; Latham, Penelope; Luijten, Sheila; Kaye, Thomas N.; Knight, Tiffany M.; Menges, Eric S.; Morris, William F.; den Nijs, Hans; Oostermeijer, Gerard; Quintana-Ascencio, Pedro F.; Shelly, J. Stephen; Stanley, Amanda; Thorpe, Andrea; Ticktin, Tamara; Valverde, Teresa; Weekley, Carl W.

    2012-01-01

    Demographic transition matrices are one of the most commonly applied population models for both basic and applied ecological research. The relatively simple framework of these models and simple, easily interpretable summary statistics they produce have prompted the wide use of these models across an exceptionally broad range of taxa. Here, we provide annual transition matrices and observed stage structures/population sizes for 20 perennial plant species which have been the focal species for long-term demographic monitoring. These data were assembled as part of the 'Testing Matrix Models' working group through the National Center for Ecological Analysis and Synthesis (NCEAS). In sum, these data represent 82 populations with >460 total population-years of data. It is our hope that making these data available will help promote and improve our ability to monitor and understand plant population dynamics.

  17. Chaotic oscillations and noise transformations in a simple dissipative system with delayed feedback

    NASA Astrophysics Data System (ADS)

    Zverev, V. V.; Rubinstein, B. Ya.

    1991-04-01

    We analyze the statistical behavior of signals in nonlinear circuits with delayed feedback in the presence of external Markovian noise. For the special class of circuits with intense phase mixing we develop an approach for the computation of the probability distributions and multitime correlation functions based on the random phase approximation. Both Gaussian and Kubo-Andersen models of external noise statistics are analyzed and the existence of the stationary (asymptotic) random process in the long-time limit is shown. We demonstrate that a nonlinear system with chaotic behavior becomes a noise amplifier with specific statistical transformation properties.

  18. Microgravity experiments on vibrated granular gases in a dilute regime: non-classical statistics

    NASA Astrophysics Data System (ADS)

    Leconte, M.; Garrabos, Y.; Falcon, E.; Lecoutre-Chabot, C.; Palencia, F.; Évesque, P.; Beysens, D.

    2006-07-01

    We report on an experimental study of a dilute gas of steel spheres colliding inelastically and excited by a piston performing sinusoidal vibration, in low gravity. Using improved experimental apparatus, here we present some results concerning the collision statistics of particles on a wall of the container. We also propose a simple model where the non-classical statistics obtained from our data are attributed to the boundary condition playing the role of a 'velostat' instead of a thermostat. The significant differences from the kinetic theory of usual gas are related to the inelasticity of collisions.

  19. Score tests for independence in semiparametric competing risks models.

    PubMed

    Saïd, Mériem; Ghazzali, Nadia; Rivest, Louis-Paul

    2009-12-01

    A popular model for competing risks postulates the existence of a latent unobserved failure time for each risk. Assuming that these underlying failure times are independent is attractive since it allows standard statistical tools for right-censored lifetime data to be used in the analysis. This paper proposes simple independence score tests for the validity of this assumption when the individual risks are modeled using semiparametric proportional hazards regressions. It assumes that covariates are available, making the model identifiable. The score tests are derived for alternatives that specify that copulas are responsible for a possible dependency between the competing risks. The test statistics are constructed by adding to the partial likelihoods for the individual risks an explanatory variable for the dependency between the risks. A variance estimator is derived by writing the score function and the Fisher information matrix for the marginal models as stochastic integrals. Pitman efficiencies are used to compare test statistics. A simulation study and a numerical example illustrate the methodology proposed in this paper.

  20. Twice random, once mixed: applying mixed models to simultaneously analyze random effects of language and participants.

    PubMed

    Janssen, Dirk P

    2012-03-01

    Psychologists, psycholinguists, and other researchers using language stimuli have been struggling for more than 30 years with the problem of how to analyze experimental data that contain two crossed random effects (items and participants). The classical analysis of variance does not apply; alternatives have been proposed but have failed to catch on, and a statistically unsatisfactory procedure of using two approximations (known as F(1) and F(2)) has become the standard. A simple and elegant solution using mixed model analysis has been available for 15 years, and recent improvements in statistical software have made mixed models analysis widely available. The aim of this article is to increase the use of mixed models by giving a concise practical introduction and by giving clear directions for undertaking the analysis in the most popular statistical packages. The article also introduces the DJMIXED add-on package for SPSS, which makes entering the models and reporting their results as straightforward as possible.

  1. Ecological statistics of Gestalt laws for the perceptual organization of contours.

    PubMed

    Elder, James H; Goldberg, Richard M

    2002-01-01

    Although numerous studies have measured the strength of visual grouping cues for controlled psychophysical stimuli, little is known about the statistical utility of these various cues for natural images. In this study, we conducted experiments in which human participants trace perceived contours in natural images. These contours are automatically mapped to sequences of discrete tangent elements detected in the image. By examining relational properties between pairs of successive tangents on these traced curves, and between randomly selected pairs of tangents, we are able to estimate the likelihood distributions required to construct an optimal Bayesian model for contour grouping. We employed this novel methodology to investigate the inferential power of three classical Gestalt cues for contour grouping: proximity, good continuation, and luminance similarity. The study yielded a number of important results: (1) these cues, when appropriately defined, are approximately uncorrelated, suggesting a simple factorial model for statistical inference; (2) moderate image-to-image variation of the statistics indicates the utility of general probabilistic models for perceptual organization; (3) these cues differ greatly in their inferential power, proximity being by far the most powerful; and (4) statistical modeling of the proximity cue indicates a scale-invariant power law in close agreement with prior psychophysics.

  2. Applying the multivariate time-rescaling theorem to neural population models

    PubMed Central

    Gerhard, Felipe; Haslinger, Robert; Pipa, Gordon

    2011-01-01

    Statistical models of neural activity are integral to modern neuroscience. Recently, interest has grown in modeling the spiking activity of populations of simultaneously recorded neurons to study the effects of correlations and functional connectivity on neural information processing. However, any statistical model must be validated by an appropriate goodness-of-fit test. Kolmogorov-Smirnov tests based upon the time-rescaling theorem have proven to be useful for evaluating point-process-based statistical models of single-neuron spike trains. Here we discuss the extension of the time-rescaling theorem to the multivariate (neural population) case. We show that even in the presence of strong correlations between spike trains, models which neglect couplings between neurons can be erroneously passed by the univariate time-rescaling test. We present the multivariate version of the time-rescaling theorem, and provide a practical step-by-step procedure for applying it towards testing the sufficiency of neural population models. Using several simple analytically tractable models and also more complex simulated and real data sets, we demonstrate that important features of the population activity can only be detected using the multivariate extension of the test. PMID:21395436
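
    The univariate test that this abstract takes as its starting point can be sketched in a few lines. Under a correct model, the intensity integrated between successive spikes is exponentially distributed with mean 1, so a further transform yields uniform values that a Kolmogorov-Smirnov statistic can check. The sketch below assumes a constant-rate Poisson model and uses hypothetical helper names; it illustrates only the single-neuron case, not the multivariate procedure of the paper.

```python
import math
import random

def simulate_poisson(rate, n_spikes, seed=0):
    """Draw spike times from a homogeneous Poisson process."""
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(n_spikes):
        t += rng.expovariate(rate)
        times.append(t)
    return times

def rescaled_uniforms(spike_times, model_rate):
    """Time-rescale inter-spike intervals under a constant-rate model."""
    prev, z = 0.0, []
    for t in spike_times:
        tau = model_rate * (t - prev)       # integrated model intensity over the interval
        z.append(1.0 - math.exp(-tau))      # uniform on [0, 1] if the model is correct
        prev = t
    return z

def ks_statistic_uniform(z):
    """Kolmogorov-Smirnov distance between the sample and the uniform CDF."""
    z = sorted(z)
    n = len(z)
    return max(max((i + 1) / n - u, u - i / n) for i, u in enumerate(z))

spikes = simulate_poisson(rate=5.0, n_spikes=2000, seed=1)
d_correct = ks_statistic_uniform(rescaled_uniforms(spikes, 5.0))
d_wrong = ks_statistic_uniform(rescaled_uniforms(spikes, 10.0))
print("KS distance, correct rate:", d_correct)
print("KS distance, wrong rate:  ", d_wrong)
```

    A common rule of thumb compares the distance with 1.36 / sqrt(n) for an approximate 95% bound; a misspecified rate produces a distance far above it.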

  3. A simple dynamic subgrid-scale model for LES of particle-laden turbulence

    NASA Astrophysics Data System (ADS)

    Park, George Ilhwan; Bassenne, Maxime; Urzay, Javier; Moin, Parviz

    2017-04-01

    In this study, a dynamic model for large-eddy simulations is proposed in order to describe the motion of small inertial particles in turbulent flows. The model is simple, involves no significant computational overhead, contains no adjustable parameters, and is flexible enough to be deployed in any type of flow solvers and grids, including unstructured setups. The approach is based on the use of elliptic differential filters to model the subgrid-scale velocity. The only model parameter, which is related to the nominal filter width, is determined dynamically by imposing consistency constraints on the estimated subgrid energetics. The performance of the model is tested in large-eddy simulations of homogeneous-isotropic turbulence laden with particles, where improved agreement with direct numerical simulation results is observed in the dispersed-phase statistics, including particle acceleration, local carrier-phase velocity, and preferential-concentration metrics.

  4. Massive optimal data compression and density estimation for scalable, likelihood-free inference in cosmology

    NASA Astrophysics Data System (ADS)

    Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen

    2018-07-01

    Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ~10^4 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.

  5. Probabilistic regional climate projection in Japan using a regression model with CMIP5 multi-model ensemble experiments

    NASA Astrophysics Data System (ADS)

    Ishizaki, N. N.; Dairaku, K.; Ueno, G.

    2016-12-01

    We have developed a statistical downscaling method for estimating probabilistic climate projections using multiple CMIP5 general circulation models (GCMs). A regression model was established so that the combination of weights of GCMs reflects the characteristics of the variation of observations at each grid point. Cross validations were conducted to select GCMs and to evaluate the regression model to avoid multicollinearity. Using a spatially high-resolution observation system, we conducted statistically downscaled probabilistic climate projections with 20-km horizontal grid spacing. Root mean squared errors for monthly mean surface air temperature and precipitation estimated by the regression method were the smallest compared with the results derived from a simple ensemble mean of GCMs and a cumulative distribution function based bias correction method. Projected changes in the mean temperature and precipitation were basically similar to those of the simple ensemble mean of GCMs. Mean precipitation was generally projected to increase, associated with increased temperature and the consequent increase in atmospheric moisture content. Weakening of the winter monsoon may contribute to a precipitation decrease in some areas. A temperature increase in excess of 4 K was expected in most areas of Japan at the end of the 21st century under the RCP8.5 scenario. The estimated probability of monthly precipitation exceeding 300 mm would increase around the Pacific side during the summer and the Japan Sea side during the winter season. This probabilistic climate projection based on the statistical method can be expected to bring useful information to impact studies and risk assessments.

  6. An Easy Tool to Predict Survival in Patients Receiving Radiation Therapy for Painful Bone Metastases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Westhoff, Paulien G., E-mail: p.g.westhoff@umcutrecht.nl; Graeff, Alexander de; Monninkhof, Evelyn M.

    2014-11-15

    Purpose: Patients with bone metastases have a widely varying survival. A reliable estimation of survival is needed for appropriate treatment strategies. Our goal was to assess the value of simple prognostic factors, namely, patient and tumor characteristics, Karnofsky performance status (KPS), and patient-reported scores of pain and quality of life, to predict survival in patients with painful bone metastases. Methods and Materials: In the Dutch Bone Metastasis Study, 1157 patients were treated with radiation therapy for painful bone metastases. At randomization, physicians determined the KPS; patients rated general health on a visual analogue scale (VAS-gh), valuation of life on a verbal rating scale (VRS-vl) and pain intensity. To assess the predictive value of the variables, we used multivariate Cox proportional hazard analyses and C-statistics for discriminative value. Of the final model, calibration was assessed. External validation was performed on a dataset of 934 patients who were treated with radiation therapy for vertebral metastases. Results: Patients had mainly breast (39%), prostate (23%), or lung cancer (25%). After a maximum of 142 weeks' follow-up, 74% of patients had died. The best predictive model included sex, primary tumor, visceral metastases, KPS, VAS-gh, and VRS-vl (C-statistic = 0.72, 95% CI = 0.70-0.74). A reduced model, with only KPS and primary tumor, showed comparable discriminative capacity (C-statistic = 0.71, 95% CI = 0.69-0.72). External validation showed a C-statistic of 0.72 (95% CI = 0.70-0.73). Calibration of the derivation and the validation dataset showed underestimation of survival. Conclusion: In predicting survival in patients with painful bone metastases, KPS combined with primary tumor was comparable to a more complex model. Considering the amount of variables in complex models and the additional burden on patients, the simple model is preferred for daily use. In addition, a risk table for survival is provided.

  7. An Econometric Model of External Labor Supply to the Establishment Within a Confined Geographic Market.

    ERIC Educational Resources Information Center

    Hines, Robert James

    The study, conducted in the Buffalo, New York standard metropolitan statistical area, was undertaken to formulate and test a simple model of labor supply for a local labor market. The principal variables to be examined to determine the external supply function of labor to the establishment are variants of the rate of change of the entry wage and…

  8. Quantifying Confidence in Model Predictions for Hypersonic Aircraft Structures

    DTIC Science & Technology

    2015-03-01

    of isolating calibrations of models in the network, segmented and simultaneous calibration are compared using the Kullback-Leibler ...value of θ. While not all test-statistics are as simple as measuring goodness or badness of fit, their directional interpretations tend to remain...data quite well, qualitatively. Quantitative goodness-of-fit tests are problematic because they assume a true empirical CDF is being tested or

  9. Introducing Multisensor Satellite Radiance-Based Evaluation for Regional Earth System Modeling

    NASA Technical Reports Server (NTRS)

    Matsui, T.; Santanello, J.; Shi, J. J.; Tao, W.-K.; Wu, D.; Peters-Lidard, C.; Kemp, E.; Chin, M.; Starr, D.; Sekiguchi, M.

    2014-01-01

    Earth System modeling has become more complex, and its evaluation using satellite data has also become more difficult due to model and data diversity. Therefore, the fundamental methodology of using satellite direct measurements with instrumental simulators should be addressed especially for modeling community members lacking a solid background of radiative transfer and scattering theory. This manuscript introduces principles of multisatellite, multisensor radiance-based evaluation methods for a fully coupled regional Earth System model: NASA-Unified Weather Research and Forecasting (NU-WRF) model. We use a NU-WRF case study simulation over West Africa as an example of evaluating aerosol-cloud-precipitation-land processes with various satellite observations. NU-WRF-simulated geophysical parameters are converted to the satellite-observable raw radiance and backscatter under nearly consistent physics assumptions via the multisensor satellite simulator, the Goddard Satellite Data Simulator Unit. We present varied examples of simple yet robust methods that characterize forecast errors and model physics biases through the spatial and statistical interpretation of various satellite raw signals: infrared brightness temperature (Tb) for surface skin temperature and cloud top temperature, microwave Tb for precipitation ice and surface flooding, and radar and lidar backscatter for aerosol-cloud profiling simultaneously. Because raw satellite signals integrate many sources of geophysical information, we demonstrate user-defined thresholds and a simple statistical process to facilitate evaluations, including the infrared-microwave-based cloud types and lidar/radar-based profile classifications.

  10. A common base method for analysis of qPCR data and the application of simple blocking in qPCR experiments.

    PubMed

    Ganger, Michael T; Dietz, Geoffrey D; Ewing, Sarah J

    2017-12-01

    qPCR has established itself as the technique of choice for the quantification of gene expression. Procedures for conducting qPCR have received significant attention; however, more rigorous approaches to the statistical analysis of qPCR data are needed. Here we develop a mathematical model, termed the Common Base Method, for analysis of qPCR data based on threshold cycle values (Cq) and efficiencies of reactions (E). The Common Base Method keeps all calculations in the logscale as long as possible by working with log10(E) · Cq, which we call the efficiency-weighted Cq value; subsequent statistical analyses are then applied in the logscale. We show how efficiency-weighted Cq values may be analyzed using a simple paired or unpaired experimental design and develop blocking methods to help reduce unexplained variation. The Common Base Method has several advantages. It allows for the incorporation of well-specific efficiencies and multiple reference genes. The method does not necessitate the pairing of samples that must be performed using traditional analysis methods in order to calculate relative expression ratios. Our method is also simple enough to be implemented in any spreadsheet or statistical software without additional scripts or proprietary components.
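
    The efficiency-weighted value defined in this abstract, log10(E) multiplied by Cq, is easy to compute directly. The sketch below is an illustration rather than the authors' implementation: it assumes E is the per-cycle amplification factor (2 for perfect doubling), uses made-up Cq values, and combines the weighted values with a sign convention chosen here so that the result equals the log10 of the familiar efficiency-corrected expression ratio.

```python
import math

def efficiency_weighted_cq(efficiency, cq):
    """Return log10(E) * Cq, the efficiency-weighted Cq value."""
    return math.log10(efficiency) * cq

def log10_relative_expression(target_ctrl, target_trt, ref_ctrl, ref_trt):
    """Log10 fold change of a target gene, normalised to a reference gene.

    Each argument is an (efficiency, Cq) pair for one well. All arithmetic
    stays in the log scale; only the caller converts back with 10 ** value.
    """
    d_target = efficiency_weighted_cq(*target_ctrl) - efficiency_weighted_cq(*target_trt)
    d_ref = efficiency_weighted_cq(*ref_ctrl) - efficiency_weighted_cq(*ref_trt)
    return d_target - d_ref

# Hypothetical wells: the target crosses threshold 3 cycles earlier after
# treatment, the reference gene is unchanged, and both reactions double per cycle.
log_ratio = log10_relative_expression(
    target_ctrl=(2.0, 25.0), target_trt=(2.0, 22.0),
    ref_ctrl=(2.0, 18.0), ref_trt=(2.0, 18.0),
)
print("log10 fold change:", log_ratio)
print("fold change:", 10 ** log_ratio)   # close to 8, i.e. 2 ** 3
```

    Because the inputs are per-well pairs, well-specific efficiencies enter naturally, and means or t-tests can be applied to the log-scale values before any back-transformation.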

  11. Mechanics and statistics of the worm-like chain

    NASA Astrophysics Data System (ADS)

    Marantan, Andrew; Mahadevan, L.

    2018-02-01

    The worm-like chain model is a simple continuum model for the statistical mechanics of a flexible polymer subject to an external force. We offer a tutorial introduction to it using three approaches. First, we use a mesoscopic view, treating a long polymer (in two dimensions) as though it were made of many groups of correlated links or "clinks," allowing us to calculate its average extension as a function of the external force via scaling arguments. We then provide a standard statistical mechanics approach, obtaining the average extension by two different means: the equipartition theorem and the partition function. Finally, we work in a probabilistic framework, taking advantage of the Gaussian properties of the chain in the large-force limit to improve upon the previous calculations of the average extension.

  12. Simple stochastic model for El Niño with westerly wind bursts

    PubMed Central

    Thual, Sulian; Majda, Andrew J.; Chen, Nan; Stechmann, Samuel N.

    2016-01-01

    Atmospheric wind bursts in the tropics play a key role in the dynamics of the El Niño Southern Oscillation (ENSO). A simple modeling framework is proposed that summarizes this relationship and captures major features of the observational record while remaining physically consistent and amenable to detailed analysis. Within this simple framework, wind burst activity evolves according to a stochastic two-state Markov switching–diffusion process that depends on the strength of the western Pacific warm pool, and is coupled to simple ocean–atmosphere processes that are otherwise deterministic, stable, and linear. A simple model with this parameterization and no additional nonlinearities reproduces a realistic ENSO cycle with intermittent El Niño and La Niña events of varying intensity and strength as well as realistic buildup and shutdown of wind burst activity in the western Pacific. The wind burst activity has a direct causal effect on the ENSO variability: in particular, it intermittently triggers regular El Niño or La Niña events, super El Niño events, or no events at all, which enables the model to capture observed ENSO statistics such as the probability density function and power spectrum of eastern Pacific sea surface temperatures. The present framework provides further theoretical and practical insight on the relationship between wind burst activity and the ENSO. PMID:27573821

  13. On the Stability of Jump-Linear Systems Driven by Finite-State Machines with Markovian Inputs

    NASA Technical Reports Server (NTRS)

    Patilkulkarni, Sudarshan; Herencia-Zapana, Heber; Gray, W. Steven; Gonzalez, Oscar R.

    2004-01-01

    This paper presents two mean-square stability tests for a jump-linear system driven by a finite-state machine with a first-order Markovian input process. The first test is based on conventional Markov jump-linear theory and avoids the use of any higher-order statistics. The second test is developed directly using the higher-order statistics of the machine's output process. The two approaches are illustrated with a simple model for a recoverable computer control system.

  14. THE WATER BALANCE OF THE SUSQUEHANNA RIVER BASIN AND ITS RESPONSE TO CLIMATE CHANGE. (R824995)

    EPA Science Inventory

    Abstract

    Historical precipitation, temperature and streamflow data for the Susquehanna River Basin (SRB) are analyzed with the objective of developing simple statistical and water balance models of streamflow at the watershed's outlet. Annual streamflow is highly corre...

  15. Using multitype branching processes to quantify statistics of disease outbreaks in zoonotic epidemics

    USDA-ARS's Scientific Manuscript database

    Despite the enormous relevance of zoonotic infections to world-wide public health, and despite much effort in modeling individual zoonoses, a fundamental understanding of the disease dynamics and the nature of outbreaks emanating from such a complex system is still lacking. We introduce a simple sto...

  16. A new statistical method for transfer coefficient calculations in the framework of the general multiple-compartment model of transport for radionuclides in biological systems.

    PubMed

    Garcia, F; Arruda-Neto, J D; Manso, M V; Helene, O M; Vanin, V R; Rodriguez, O; Mesa, J; Likhachev, V P; Filho, J W; Deppman, A; Perez, G; Guzman, F; de Camargo, S P

    1999-10-01

    A new and simple statistical procedure (STATFLUX) for the calculation of transfer coefficients of radionuclide transport to animals and plants is proposed. The method is based on the general multiple-compartment model, which uses a system of linear equations involving geometrical volume considerations. By using experimentally available curves of radionuclide concentrations versus time, for each animal compartment (organs), flow parameters were estimated by employing a least-squares procedure, whose consistency is tested. Some numerical results are presented in order to compare the STATFLUX transfer coefficients with those from other works and experimental data.

  17. Statistical limitations in functional neuroimaging. I. Non-inferential methods and statistical models.

    PubMed Central

    Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P

    1999-01-01

    Functional neuroimaging (FNI) provides experimental access to the intact living brain making it possible to study higher cognitive functions in humans. In this review and in a companion paper in this issue, we discuss some common methods used to analyse FNI data. The emphasis in both papers is on assumptions and limitations of the methods reviewed. There are several methods available to analyse FNI data indicating that none is optimal for all purposes. In order to make optimal use of the methods available it is important to know the limits of applicability. For the interpretation of FNI results it is also important to take into account the assumptions, approximations and inherent limitations of the methods used. This paper gives a brief overview over some non-inferential descriptive methods and common statistical models used in FNI. Issues relating to the complex problem of model selection are discussed. In general, proper model selection is a necessary prerequisite for the validity of the subsequent statistical inference. The non-inferential section describes methods that, combined with inspection of parameter estimates and other simple measures, can aid in the process of model selection and verification of assumptions. The section on statistical models covers approaches to global normalization and some aspects of univariate, multivariate, and Bayesian models. Finally, approaches to functional connectivity and effective connectivity are discussed. In the companion paper we review issues related to signal detection and statistical inference. PMID:10466149

  18. Magnetorotational dynamo chimeras. The missing link to turbulent accretion disk dynamo models?

    NASA Astrophysics Data System (ADS)

    Riols, A.; Rincon, F.; Cossu, C.; Lesur, G.; Ogilvie, G. I.; Longaretti, P.-Y.

    2017-02-01

    In Keplerian accretion disks, turbulence and magnetic fields may be jointly excited through a subcritical dynamo mechanism involving magnetorotational instability (MRI). This dynamo may notably contribute to explaining the time-variability of various accreting systems, as high-resolution simulations of MRI dynamo turbulence exhibit statistical self-organization into large-scale cyclic dynamics. However, understanding the physics underlying these statistical states and assessing their exact astrophysical relevance is theoretically challenging. The study of simple periodic nonlinear MRI dynamo solutions has recently proven useful in this respect, and has highlighted the role of turbulent magnetic diffusion in the seeming impossibility of a dynamo at low magnetic Prandtl number (Pm), a common regime in disks. Arguably though, these simple laminar structures may not be fully representative of the complex, statistically self-organized states expected in astrophysical regimes. Here, we aim at closing this seeming discrepancy by reporting the numerical discovery of exactly periodic, yet semi-statistical "chimeral MRI dynamo states" which are the organized outcome of a succession of MRI-unstable, non-axisymmetric dynamical stages of different forms and amplitudes. Interestingly, these states, while reminiscent of the statistical complexity of turbulent simulations, involve the same physical principles as simpler laminar cycles, and their analysis further confirms the theory that subcritical turbulent magnetic diffusion impedes the sustainment of an MRI dynamo at low Pm. Overall, chimera dynamo cycles therefore offer an unprecedented dual physical and statistical perspective on dynamos in rotating shear flows, which may prove useful in devising more accurate, yet intuitive mean-field models of time-dependent turbulent disk dynamos. Movies associated to Fig. 1 are available at http://www.aanda.org

  19. In defence of model-based inference in phylogeography

    PubMed Central

    Beaumont, Mark A.; Nielsen, Rasmus; Robert, Christian; Hey, Jody; Gaggiotti, Oscar; Knowles, Lacey; Estoup, Arnaud; Panchal, Mahesh; Corander, Jukka; Hickerson, Mike; Sisson, Scott A.; Fagundes, Nelson; Chikhi, Lounès; Beerli, Peter; Vitalis, Renaud; Cornuet, Jean-Marie; Huelsenbeck, John; Foll, Matthieu; Yang, Ziheng; Rousset, Francois; Balding, David; Excoffier, Laurent

    2017-01-01

    Recent papers have promoted the view that model-based methods in general, and those based on Approximate Bayesian Computation (ABC) in particular, are flawed in a number of ways, and are therefore inappropriate for the analysis of phylogeographic data. These papers further argue that Nested Clade Phylogeographic Analysis (NCPA) offers the best approach in statistical phylogeography. In order to remove the confusion and misconceptions introduced by these papers, we justify and explain the reasoning behind model-based inference. We argue that ABC is a statistically valid approach, alongside other computational statistical techniques that have been successfully used to infer parameters and compare models in population genetics. We also examine the NCPA method and highlight numerous deficiencies, either when used with single or multiple loci. We further show that the ages of clades are carelessly used to infer ages of demographic events, that these ages are estimated under a simple model of panmixia and population stationarity but are then used under different and unspecified models to test hypotheses, a usage that invalidates these testing procedures. We conclude by encouraging researchers to study and use model-based inference in population genetics. PMID:29284924

  20. A simple white noise analysis of neuronal light responses.

    PubMed

    Chichilnisky, E J

    2001-05-01

    A white noise technique is presented for estimating the response properties of spiking visual system neurons. The technique is simple, robust, efficient and well suited to simultaneous recordings from multiple neurons. It provides a complete and easily interpretable model of light responses even for neurons that display a common form of response nonlinearity that precludes classical linear systems analysis. A theoretical justification of the technique is presented that relies only on elementary linear algebra and statistics. Implementation is described with examples. The technique and the underlying model of neural responses are validated using recordings from retinal ganglion cells, and in principle are applicable to other neurons. Advantages and disadvantages of the technique relative to classical approaches are discussed.
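
    One widely used form of white-noise analysis estimates a neuron's linear filter as the spike-triggered average (STA) of a Gaussian stimulus, with a static nonlinearity applied afterwards. The abstract does not spell out the estimator, so the sketch below is an assumption-laden illustration: it simulates a hypothetical linear-nonlinear Poisson neuron with an exponential nonlinearity and a made-up temporal kernel, then checks that the STA recovers the kernel's shape. It assumes NumPy; all names and parameter values are illustrative.

```python
import numpy as np

def simulate_ln_neuron(kernel, n_bins, seed=0):
    """Simulate spike counts from a linear-nonlinear Poisson model."""
    rng = np.random.default_rng(seed)
    stimulus = rng.standard_normal(n_bins)            # Gaussian white noise
    # drive[t] = sum_j kernel[j] * stimulus[t - j]
    drive = np.convolve(stimulus, kernel)[:n_bins]
    rate = np.exp(drive - 1.0)                        # static exponential nonlinearity
    spikes = rng.poisson(rate)                        # spike count per time bin
    return stimulus, spikes

def spike_triggered_average(stimulus, spikes, n_lags):
    """Average stimulus preceding each spike; sta[j] is the value j bins before."""
    n_bins = len(stimulus)
    counts = spikes[n_lags - 1:]                      # bins with a full stimulus history
    sta = np.empty(n_lags)
    for j in range(n_lags):
        sta[j] = counts @ stimulus[n_lags - 1 - j : n_bins - j]
    return sta / counts.sum()

# Hypothetical biphasic temporal kernel, normalised to unit length.
lags = np.arange(10)
kernel = np.exp(-lags / 2.0) - 0.5 * np.exp(-lags / 4.0)
kernel /= np.linalg.norm(kernel)

stimulus, spikes = simulate_ln_neuron(kernel, n_bins=50_000, seed=0)
sta = spike_triggered_average(stimulus, spikes, n_lags=len(kernel))
print("correlation between STA and true kernel:", np.corrcoef(sta, kernel)[0, 1])
```

    With a Gaussian stimulus the STA is proportional to the true filter regardless of the nonlinearity's exact form, which is why the estimate remains usable for neurons that defeat classical linear systems analysis.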

  1. Directional Statistics for Polarization Observations of Individual Pulses from Radio Pulsars

    NASA Astrophysics Data System (ADS)

    McKinnon, M. M.

    2010-10-01

    Radio polarimetry is a three-dimensional statistical problem. The three-dimensional aspect of the problem arises from the Stokes parameters Q, U, and V, which completely describe the polarization of electromagnetic radiation and conceptually define the orientation of a polarization vector in the Poincaré sphere. The statistical aspect of the problem arises from the random fluctuations in the source-intrinsic polarization and the instrumental noise. A simple model for the polarization of pulsar radio emission has been used to derive the three-dimensional statistics of radio polarimetry. The model is based upon the proposition that the observed polarization is due to the incoherent superposition of two, highly polarized, orthogonal modes. The directional statistics derived from the model follow the Bingham-Mardia and Fisher family of distributions. The model assumptions are supported by the qualitative agreement between the statistics derived from it and those measured with polarization observations of the individual pulses from pulsars. The orthogonal modes are thought to be the natural modes of radio wave propagation in the pulsar magnetosphere. The intensities of the modes become statistically independent when generalized Faraday rotation (GFR) in the magnetosphere causes the difference in their phases to be large. A stochastic version of GFR occurs when fluctuations in the phase difference are also large, and may be responsible for the more complicated polarization patterns observed in pulsar radio emission.

  2. Statistics for X-chromosome associations.

    PubMed

    Özbek, Umut; Lin, Hui-Min; Lin, Yan; Weeks, Daniel E; Chen, Wei; Shaffer, John R; Purcell, Shaun M; Feingold, Eleanor

    2018-06-13

    In a genome-wide association study (GWAS), association between genotype and phenotype at autosomal loci is generally tested by regression models. However, X-chromosome data are often excluded from published analyses of autosomes because of the difference between males and females in number of X chromosomes. Failure to analyze X-chromosome data at all is obviously less than ideal, and can lead to missed discoveries. Even when X-chromosome data are included, they are often analyzed with suboptimal statistics. Several mathematically sensible statistics for X-chromosome association have been proposed. The optimality of these statistics, however, is based on very specific simple genetic models. In addition, while previous simulation studies of these statistics have been informative, they have focused on single-marker tests and have not considered the types of error that occur even under the null hypothesis when the entire X chromosome is scanned. In this study, we comprehensively tested several X-chromosome association statistics using simulation studies that include the entire chromosome. We also considered a wide range of trait models for sex differences and phenotypic effects of X inactivation. We found that models that do not incorporate a sex effect can have large type I error in some cases. We also found that many of the best statistics perform well even when there are modest deviations, such as trait variance differences between the sexes or small sex differences in allele frequencies, from assumptions. © 2018 WILEY PERIODICALS, INC.

  3. [Comparison of simple pooling and bivariate model used in meta-analyses of diagnostic test accuracy published in Chinese journals].

    PubMed

    Huang, Yuan-sheng; Yang, Zhi-rong; Zhan, Si-yan

    2015-06-18

    To investigate the use of simple pooling and the bivariate model in meta-analyses of diagnostic test accuracy (DTA) published in Chinese journals (January to November, 2014), compare the differences of results from these two models, and explore the impact of between-study variability of sensitivity and specificity on the differences. DTA meta-analyses were searched through the Chinese Biomedical Literature Database (January to November, 2014). Details of models and data for the fourfold table were extracted. Descriptive analysis was conducted to investigate the prevalence of the use of the simple pooling method and the bivariate model in the included literature. Data were re-analyzed with each of the two models. Differences in the results were examined by the Wilcoxon signed rank test. How the differences in results were affected by between-study variability of sensitivity and specificity, expressed by I², was explored. In total, 55 systematic reviews, containing 58 DTA meta-analyses, were included and 25 DTA meta-analyses were eligible for re-analysis. Simple pooling was used in 50 (90.9%) systematic reviews and the bivariate model in 1 (1.8%). The remaining 4 (7.3%) articles used other models pooling sensitivity and specificity or pooled neither of them. Of the reviews simply pooling sensitivity and specificity, 41 (82.0%) were at risk of wrongly using Meta-disc software. The differences in medians of sensitivity and specificity between the two models were both 0.011 (P<0.001, P=0.031 respectively). Greater differences could be found as I² of sensitivity or specificity became larger, especially when I²>75%. Most DTA meta-analyses published in Chinese journals (January to November, 2014) combine the sensitivity and specificity by simple pooling. Meta-disc software can pool the sensitivity and specificity only through a fixed-effect model, but a high proportion of authors think it can implement a random-effect model. Simple pooling tends to underestimate the results compared with the bivariate model. The greater the between-study variance, the more likely simple pooling is to deviate substantially. It is necessary to increase the knowledge level of statistical methods and software for meta-analyses of DTA data.
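
    As a rough illustration of what "simple pooling" usually means, the sketch below sums the fourfold-table counts across studies as if they came from a single study and computes sensitivity and specificity from the totals. The two example tables are hypothetical, not data from the review, and the bivariate model (which requires a random-effects fit) is deliberately not attempted here.

```python
def simple_pool(studies):
    """Pool sensitivity and specificity by summing 2x2 counts across studies.

    studies: list of (tp, fp, fn, tn) tuples, one fourfold table per study.
    Returns (pooled_sensitivity, pooled_specificity).
    """
    tp = sum(s[0] for s in studies)
    fp = sum(s[1] for s in studies)
    fn = sum(s[2] for s in studies)
    tn = sum(s[3] for s in studies)
    return tp / (tp + fn), tn / (tn + fp)


def per_study(studies):
    """Sensitivity and specificity of each study, to make heterogeneity visible."""
    return [(tp / (tp + fn), tn / (tn + fp)) for tp, fp, fn, tn in studies]


# Hypothetical fourfold tables (tp, fp, fn, tn); invented for illustration.
example = [(90, 10, 10, 90), (40, 5, 60, 95)]
pooled_sens, pooled_spec = simple_pool(example)
print(per_study(example))
print(pooled_sens, pooled_spec)
```

    Simple pooling ignores between-study variability entirely: the two hypothetical studies have very different sensitivities, yet the pooled value carries no trace of that spread.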

  4. Modeling Cross-Situational Word–Referent Learning: Prior Questions

    PubMed Central

    Yu, Chen; Smith, Linda B.

    2013-01-01

    Both adults and young children possess powerful statistical computation capabilities—they can infer the referent of a word from highly ambiguous contexts involving many words and many referents by aggregating cross-situational statistical information across contexts. This ability has been explained by models of hypothesis testing and by models of associative learning. This article describes a series of simulation studies and analyses designed to understand the different learning mechanisms posited by the 2 classes of models and their relation to each other. Variants of a hypothesis-testing model and a simple or dumb associative mechanism were examined under different specifications of information selection, computation, and decision. Critically, these 3 components of the models interact in complex ways. The models illustrate a fundamental tradeoff between amount of data input and powerful computations: With the selection of more information, dumb associative models can mimic the powerful learning that is accomplished by hypothesis-testing models with fewer data. However, because of the interactions among the component parts of the models, the associative model can mimic various hypothesis-testing models, producing the same learning patterns but through different internal components. The simulations argue for the importance of a compositional approach to human statistical learning: the experimental decomposition of the processes that contribute to statistical learning in human learners and models with the internal components that can be evaluated independently and together. PMID:22229490

  5. Monte Carlo investigation of transient acoustic fields in partially or completely bounded medium. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Thanedar, B. D.

    1972-01-01

    A simple repetitive calculation was used to investigate what happens to the field in terms of the signal paths of disturbances originating from the energy source. The computation allowed the field to be reconstructed as a function of space and time on a statistical basis. The suggested Monte Carlo method is in response to the need for a numerical method to supplement analytical methods of solution which are only valid when the boundaries have simple shapes, rather than for a medium that is bounded. For the analysis, a suitable model was created from which was developed an algorithm for the estimation of acoustic pressure variations in the region under investigation. The validity of the technique was demonstrated by analysis of simple physical models with the aid of a digital computer. The Monte Carlo method is applicable to a medium which is homogeneous and is enclosed by either rectangular or curved boundaries.

  6. A novel risk score model for prediction of contrast-induced nephropathy after emergent percutaneous coronary intervention.

    PubMed

    Lin, Kai-Yang; Zheng, Wei-Ping; Bei, Wei-Jie; Chen, Shi-Qun; Islam, Sheikh Mohammed Shariful; Liu, Yong; Xue, Lin; Tan, Ning; Chen, Ji-Yan

    2017-03-01

    A few studies have developed simple risk models for predicting CIN with poor prognosis after emergent PCI. The study aimed to develop and validate a novel tool for predicting the risk of contrast-induced nephropathy (CIN) in patients undergoing emergent percutaneous coronary intervention (PCI). 692 consecutive patients undergoing emergent PCI between January 2010 and December 2013 were randomly (2:1) assigned to a development dataset (n=461) and a validation dataset (n=231). Multivariate logistic regression was applied to identify independent predictors of CIN and to establish a CIN prediction model, whose prognostic accuracy was assessed using the c-statistic for discrimination and the Hosmer-Lemeshow test for calibration. The overall incidence of CIN was 55 (7.9%). A total of 11 variables were analyzed, including age >75 years, baseline serum creatinine (SCr) >1.5 mg/dl, hypotension and the use of intra-aortic balloon pump (IABP), which were identified to enter the risk score model (Chen). The incidence of CIN was 32 (6.9%) in the development dataset (low risk (score=0), 1.0%; moderate risk (score: 1-2), 13.4%; high risk (score≥3), 90.0%). Compared to the classical Mehran's and ACEF CIN risk score models, the risk score (Chen) across the subgroup of the study population exhibited similar discrimination and predictive ability on CIN (c-statistic: 0.828, 0.776, 0.853, respectively) and on in-hospital, 2-year and 3-year mortality (c-statistic: 0.738, 0.750, 0.845, respectively) in the validation population. Our data showed that this simple risk model exhibited good discrimination and predictive ability on CIN, similar to Mehran's and ACEF score, and even on long-term mortality after emergent PCI. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  7. Statistical methodologies for the control of dynamic remapping

    NASA Technical Reports Server (NTRS)

    Saltz, J. H.; Nicol, D. M.

    1986-01-01

    Following an initial mapping of a problem onto a multiprocessor machine or computer network, system performance often deteriorates with time. In order to maintain high performance, it may be necessary to remap the problem. The decision to remap must take into account measurements of performance deterioration, the cost of remapping, and the estimated benefits achieved by remapping. We examine the tradeoff between the costs and the benefits of remapping two qualitatively different kinds of problems. One problem assumes that performance deteriorates gradually, the other assumes that performance deteriorates suddenly. We consider a variety of policies for governing when to remap. In order to evaluate these policies, statistical models of problem behaviors are developed. Simulation results are presented which compare simple policies with computationally expensive optimal decision policies; these results demonstrate that for each problem type, the proposed simple policies are effective and robust.

  8. Simple Statistical Model to Quantify Maximum Expected EMC in Spacecraft and Avionics Boxes

    NASA Technical Reports Server (NTRS)

    Trout, Dawn H.; Bremner, Paul

    2014-01-01

    This study shows cumulative distribution function (CDF) comparisons of composite fairing electromagnetic field data obtained by computational electromagnetic 3D full wave modeling and laboratory testing. Test and model data correlation is shown. In addition, this presentation shows application of the power balance and extension of this method to predict the variance and maximum expected mean of the E-field data. This is valuable for large scale evaluations of transmission inside cavities.

  9. Review of Statistical Methods for Analysing Healthcare Resources and Costs

    PubMed Central

    Mihaylova, Borislava; Briggs, Andrew; O'Hagan, Anthony; Thompson, Simon G

    2011-01-01

    We review statistical methods for analysing healthcare resource use and costs, their ability to address skewness, excess zeros, multimodality and heavy right tails, and their ease for general use. We aim to provide guidance on analysing resource use and costs focusing on randomised trials, although methods often have wider applicability. Twelve broad categories of methods were identified: (I) methods based on the normal distribution, (II) methods following transformation of data, (III) single-distribution generalized linear models (GLMs), (IV) parametric models based on skewed distributions outside the GLM family, (V) models based on mixtures of parametric distributions, (VI) two (or multi)-part and Tobit models, (VII) survival methods, (VIII) non-parametric methods, (IX) methods based on truncation or trimming of data, (X) data components models, (XI) methods based on averaging across models, and (XII) Markov chain methods. Based on this review, our recommendations are that, first, simple methods are preferred in large samples where the near-normality of sample means is assured. Second, in somewhat smaller samples, relatively simple methods, able to deal with one or two of above data characteristics, may be preferable but checking sensitivity to assumptions is necessary. Finally, some more complex methods hold promise, but are relatively untried; their implementation requires substantial expertise and they are not currently recommended for wider applied work. Copyright © 2010 John Wiley & Sons, Ltd. PMID:20799344

  10. Strategies for Reduced-Order Models in Uncertainty Quantification of Complex Turbulent Dynamical Systems

    NASA Astrophysics Data System (ADS)

    Qi, Di

    Turbulent dynamical systems are ubiquitous in science and engineering. Uncertainty quantification (UQ) in turbulent dynamical systems is a grand challenge where the goal is to obtain statistical estimates for key physical quantities. In the development of a proper UQ scheme for systems characterized by both a high-dimensional phase space and a large number of instabilities, significant model errors compared with the true natural signal are always unavoidable due to both the imperfect understanding of the underlying physical processes and the limited computational resources available. One central issue in contemporary research is the development of a systematic methodology for reduced order models that can recover the crucial features both with model fidelity in statistical equilibrium and with model sensitivity in response to perturbations. In the first part, we discuss a general mathematical framework to construct statistically accurate reduced-order models that have skill in capturing the statistical variability in the principal directions of a general class of complex systems with quadratic nonlinearity. A systematic hierarchy of simple statistical closure schemes, which are built through new global statistical energy conservation principles combined with statistical equilibrium fidelity, are designed and tested for UQ of these problems. Second, the capacity of imperfect low-order stochastic approximations to model extreme events in a passive scalar field advected by turbulent flows is investigated. The effects in complicated flow systems are considered including strong nonlinear and non-Gaussian interactions, and much simpler and cheaper imperfect models with model error are constructed to capture the crucial statistical features in the stationary tracer field. Several mathematical ideas are introduced to improve the prediction skill of the imperfect reduced-order models. Most importantly, empirical information theory and statistical linear response theory are applied in the training phase for calibrating model errors to achieve optimal imperfect model parameters; and total statistical energy dynamics are introduced to improve the model sensitivity in the prediction phase especially when strong external perturbations are exerted. The validity of reduced-order models for predicting statistical responses and intermittency is demonstrated on a series of instructive models with increasing complexity, including the stochastic triad model, the Lorenz '96 model, and models for barotropic and baroclinic turbulence. The skillful low-order modeling methods developed here should also be useful for other applications such as efficient algorithms for data assimilation.

  11. A complete sample of double-lobed radio quasars for VLBI tests of source models - Definition and statistics

    NASA Technical Reports Server (NTRS)

    Hough, D. H.; Readhead, A. C. S.

    1989-01-01

    A complete, flux-density-limited sample of double-lobed radio quasars is defined, with nuclei bright enough to be mapped with the Mark III VLBI system. It is shown that the statistics of linear size, nuclear strength, and curvature are consistent with the assumption of random source orientations and simple relativistic beaming in the nuclei. However, these statistics are also consistent with the effects of interaction between the beams and the surrounding medium. The distribution of jet velocities in the nuclei, as measured with VLBI, will provide a powerful test of physical theories of extragalactic radio sources.

  12. Ballistic and diffusive dynamics in a two-dimensional ideal gas of macroscopic chaotic Faraday waves.

    PubMed

    Welch, Kyle J; Hastings-Hauss, Isaac; Parthasarathy, Raghuveer; Corwin, Eric I

    2014-04-01

    We have constructed a macroscopic driven system of chaotic Faraday waves whose statistical mechanics, we find, are surprisingly simple, mimicking those of a thermal gas. We use real-time tracking of a single floating probe, energy equipartition, and the Stokes-Einstein relation to define and measure a pseudotemperature and diffusion constant and then self-consistently determine a coefficient of viscous friction for a test particle in this pseudothermal gas. Because of its simplicity, this system can serve as a model for direct experimental investigation of nonequilibrium statistical mechanics, much as the ideal gas epitomizes equilibrium statistical mechanics.

  13. Oscillatory dynamics of investment and capacity utilization

    NASA Astrophysics Data System (ADS)

    Greenblatt, R. E.

    2017-01-01

    Capitalist economic systems display a wide variety of oscillatory phenomena whose underlying causes are often not well understood. In this paper, I consider a very simple model of the reciprocal interaction between investment, capacity utilization, and their time derivatives. The model, which gives rise to periodic oscillations, predicts qualitatively the phase relations between these variables. These predictions are observed to be consistent in a statistical sense with econometric data from the US economy.

  14. Probability distributions of molecular observables computed from Markov models. II. Uncertainties in observables and their time-evolution

    NASA Astrophysics Data System (ADS)

    Chodera, John D.; Noé, Frank

    2010-09-01

    Discrete-state Markov (or master equation) models provide a useful simplified representation for characterizing the long-time statistical evolution of biomolecules in a manner that allows direct comparison with experiments as well as the elucidation of mechanistic pathways for an inherently stochastic process. A vital part of meaningful comparison with experiment is the characterization of the statistical uncertainty in the predicted experimental measurement, which may take the form of an equilibrium measurement of some spectroscopic signal, the time-evolution of this signal following a perturbation, or the observation of some statistic (such as the correlation function) of the equilibrium dynamics of a single molecule. Without meaningful error bars (which arise from both approximation and statistical error), there is no way to determine whether the deviations between model and experiment are statistically meaningful. Previous work has demonstrated that a Bayesian method that enforces microscopic reversibility can be used to characterize the statistical component of correlated uncertainties in state-to-state transition probabilities (and functions thereof) for a model inferred from molecular simulation data. Here, we extend this approach to include the uncertainty in observables that are functions of molecular conformation (such as surrogate spectroscopic signals) characterizing each state, permitting the full statistical uncertainty in computed spectroscopic experiments to be assessed. We test the approach in a simple model system to demonstrate that the computed uncertainties provide a useful indicator of statistical variation, and then apply it to the computation of the fluorescence autocorrelation function measured for a dye-labeled peptide previously studied by both experiment and simulation.

  15. Landscape movements of Anopheles gambiae malaria vector mosquitoes in rural Gambia.

    PubMed

    Thomas, Christopher J; Cross, Dónall E; Bøgh, Claus

    2013-01-01

    For malaria control in Africa it is crucial to characterise the dispersal of its most efficient vector, Anopheles gambiae, in order to target interventions and assess their impact spatially. Our study is, we believe, the first to present a statistical model of dispersal probability against distance from breeding habitat to human settlements for this important disease vector. We undertook post-hoc analyses of mosquito catches made in The Gambia to derive statistical dispersal functions for An. gambiae sensu lato collected in 48 villages at varying distances to alluvial larval habitat along the River Gambia. The proportion dispersing declined exponentially with distance, and we estimated that 90% of movements were within 1.7 km. Although a 'heavy-tailed' distribution is considered biologically more plausible due to active dispersal by mosquitoes seeking blood meals, there was no statistical basis for choosing it over a negative exponential distribution. Using a simple random walk model with daily survival and movements previously recorded in Burkina Faso, we were able to reproduce the dispersal probabilities observed in The Gambia. Our results provide an important quantification of the probability of An. gambiae s.l. dispersal in a rural African setting typical of many parts of the continent. However, dispersal will be landscape specific and in order to generalise to other spatial configurations of habitat and hosts it will be necessary to produce tractable models of mosquito movements for operational use. We show that simple random walk models have potential. Consequently, there is a pressing need for new empirical studies of An. gambiae survival and movements in different settings to drive this development.
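
    The reported figure (90% of movements within 1.7 km) can be turned into a rate parameter if one assumes, as a simplification, that dispersal distance follows a negative exponential distribution with cumulative form P(D ≤ d) = 1 − exp(−λd). The paper's exact parameterization may differ, and the function names below are illustrative only.

```python
import math


def rate_from_quantile(distance_km, proportion):
    """Rate lambda (per km) such that P(D <= distance_km) = proportion."""
    return -math.log(1.0 - proportion) / distance_km


def proportion_within(distance_km, rate):
    """Proportion of movements ending within distance_km under the exponential model."""
    return 1.0 - math.exp(-rate * distance_km)


# Calibrate to the figure quoted in the abstract: 90% within 1.7 km.
rate = rate_from_quantile(1.7, 0.90)
mean_distance_km = 1.0 / rate  # mean of an exponential distribution

print(rate, mean_distance_km)
print(proportion_within(0.5, rate), proportion_within(1.0, rate))
```

    Under this assumption the rate equals ln(10)/1.7 per km, so the implied mean dispersal distance is 1.7/ln(10) km.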

  16. A critique of Rasch residual fit statistics.

    PubMed

    Karabatsos, G

    2000-01-01

    In test analysis involving the Rasch model, a large degree of importance is placed on the "objective" measurement of individual abilities and item difficulties. The degree to which the objectivity properties are attained, of course, depends on the degree to which the data fit the Rasch model. It is therefore important to utilize fit statistics that accurately and reliably detect the person-item response inconsistencies that threaten the measurement objectivity of persons and items. Given this argument, it is somewhat surprising that there is far more emphasis placed on the objective measurement of persons and items than there is on the measurement quality of Rasch fit statistics. This paper provides a critical analysis of the residual fit statistics of the Rasch model, arguably the most often used fit statistics, in an effort to illustrate that the task of Rasch fit analysis is not as simple and straightforward as it appears to be. The faulty statistical properties of the residual fit statistics do not allow either a convenient or a straightforward approach to Rasch fit analysis. For instance, given a residual fit statistic, the use of a single minimum critical value for misfit diagnosis across different testing situations, where the situations vary in sample and test properties, leads to both the overdetection and underdetection of misfit. To improve this situation, it is argued that psychometricians need to implement residual-free Rasch fit statistics that are based on the number of Guttman response errors, or use indices that are statistically optimal in detecting measurement disturbances.

  17. Modeling epidemics on adaptively evolving networks: A data-mining perspective.

    PubMed

    Kattis, Assimakis A; Holiday, Alexander; Stoica, Ana-Andreea; Kevrekidis, Ioannis G

    2016-01-01

    The exploration of epidemic dynamics on dynamically evolving ("adaptive") networks poses nontrivial challenges to the modeler, such as the determination of a small number of informative statistics of the detailed network state (that is, a few "good observables") that usefully summarize the overall (macroscopic, systems-level) behavior. Obtaining reduced, small size accurate models in terms of these few statistical observables--that is, trying to coarse-grain the full network epidemic model to a small but useful macroscopic one--is even more daunting. Here we describe a data-based approach to solving the first challenge: the detection of a few informative collective observables of the detailed epidemic dynamics. This is accomplished through Diffusion Maps (DMAPS), a recently developed data-mining technique. We illustrate the approach through simulations of a simple mathematical model of epidemics on a network: a model known to exhibit complex temporal dynamics. We discuss potential extensions of the approach, as well as possible shortcomings.

  18. The non-equilibrium statistical mechanics of a simple geophysical fluid dynamics model

    NASA Astrophysics Data System (ADS)

    Verkley, Wim; Severijns, Camiel

    2014-05-01

    Lorenz [1] has devised a dynamical system that has proved to be very useful as a benchmark system in geophysical fluid dynamics. The system in its simplest form consists of a periodic array of variables that can be associated with an atmospheric field on a latitude circle. The system is driven by a constant forcing, is damped by linear friction and has a simple advection term that causes the model to behave chaotically if the forcing is large enough. Our aim is to predict the statistics of Lorenz' model on the basis of a given average value of its total energy - obtained from a numerical integration - and the assumption of statistical stationarity. Our method is the principle of maximum entropy [2] which in this case reads: the information entropy of the system's probability density function shall be maximal under the constraints of normalization, a given value of the average total energy and statistical stationarity. Statistical stationarity is incorporated approximately by using 'stationarity constraints', i.e., by requiring that the average first and possibly higher-order time-derivatives of the energy are zero in the maximization of entropy. The analysis [3] reveals that, if the first stationarity constraint is used, the resulting probability density function rather accurately reproduces the statistics of the individual variables. If the second stationarity constraint is used as well, the correlations between the variables are also reproduced quite adequately. The method can be generalized straightforwardly and holds the promise of a viable non-equilibrium statistical mechanics of the forced-dissipative systems of geophysical fluid dynamics. [1] E.N. Lorenz, 1996: Predictability - A problem partly solved, in Proc. Seminar on Predictability (ECMWF, Reading, Berkshire, UK), Vol. 1, pp. 1-18. [2] E.T. Jaynes, 2003: Probability Theory - The Logic of Science (Cambridge University Press, Cambridge). [3] W.T.M. Verkley and C.A. Severijns, 2014: The maximum entropy principle applied to a dynamical system proposed by Lorenz, Eur. Phys. J. B, 87:7, http://dx.doi.org/10.1140/epjb/e2013-40681-2 (open access).
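
    The system described (periodic array, constant forcing, linear damping, quadratic advection) matches the widely used Lorenz '96 equations, dx_i/dt = (x_{i+1} − x_{i−2}) x_{i−1} − x_i + F. The sketch below integrates them with a fourth-order Runge-Kutta step and accumulates the time-averaged energy E = ½ Σ x_i². The energy definition, step size, system size and forcing value are assumptions chosen for illustration, not settings taken from the paper.

```python
def tendency(x, forcing):
    """Right-hand side of the Lorenz '96 system with periodic indexing."""
    n = len(x)
    return [
        (x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
        for i in range(n)
    ]


def rk4_step(x, forcing, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(state, slope, scale):
        return [s + scale * k for s, k in zip(state, slope)]

    k1 = tendency(x, forcing)
    k2 = tendency(shifted(x, k1, dt / 2), forcing)
    k3 = tendency(shifted(x, k2, dt / 2), forcing)
    k4 = tendency(shifted(x, k3, dt), forcing)
    return [
        xi + dt / 6 * (a + 2 * b + 2 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


def mean_energy(x0, forcing, dt, n_steps):
    """Time-averaged energy 0.5 * sum(x_i^2) along a numerical trajectory."""
    x = list(x0)
    total = 0.0
    for _ in range(n_steps):
        x = rk4_step(x, forcing, dt)
        total += 0.5 * sum(v * v for v in x)
    return total / n_steps


if __name__ == "__main__":
    n, forcing = 36, 8.0          # illustrative values only
    x0 = [forcing] * n
    x0[0] += 0.01                 # small perturbation off the fixed point
    print(mean_energy(x0, forcing, dt=0.01, n_steps=5000))
```

    An average energy obtained this way is the kind of quantity the abstract uses as a constraint in the entropy maximization. Note that the advection term alone conserves the energy, so forcing and damping are the only sources and sinks.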

  19. A Simple Illustration for the Need of Multiple Comparison Procedures

    ERIC Educational Resources Information Center

    Carter, Rickey E.

    2010-01-01

    Statistical adjustments to accommodate multiple comparisons are routinely covered in introductory statistical courses. The fundamental rationale for such adjustments, however, may not be readily understood. This article presents a simple illustration to help remedy this.
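
    The abstract does not reproduce the article's own illustration. A standard way to see the rationale, assuming the tests are independent, is that the chance of at least one false positive among m tests at level α is 1 − (1 − α)^m. The sketch below computes this family-wise error rate alongside the Bonferroni-adjusted per-test level; the function names are illustrative.

```python
def familywise_error_rate(alpha, m):
    """P(at least one false positive) for m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(alpha, m):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / m


for m in (1, 5, 10, 20):
    unadjusted = familywise_error_rate(0.05, m)
    adjusted = familywise_error_rate(bonferroni_alpha(0.05, m), m)
    print(m, round(unadjusted, 4), round(adjusted, 4))
```

    With ten independent tests at α = 0.05, the unadjusted family-wise error rate is already about 0.40, whereas the Bonferroni-adjusted rate stays below 0.05.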

  20. Prediction of Solution Properties of Flexible-Chain Polymers: A Computer Simulation Undergraduate Experiment

    ERIC Educational Resources Information Center

    de la Torre, Jose Garcia; Cifre, Jose G. Hernandez; Martinez, M. Carmen Lopez

    2008-01-01

    This paper describes a computational exercise at undergraduate level that demonstrates the employment of Monte Carlo simulation to study the conformational statistics of flexible polymer chains, and to predict solution properties. Three simple chain models, including excluded volume interactions, have been implemented in a public-domain computer…

  1. A Lattice Boltzmann Method for Turbomachinery Simulations

    NASA Technical Reports Server (NTRS)

    Hsu, A. T.; Lopez, I.

    2003-01-01

    The Lattice Boltzmann (LB) method is a relatively new method for flow simulations. The starting point of the LB method is statistical mechanics and the Boltzmann equation. The LB method tries to set up its model at the molecular scale and simulate the flow at the macroscopic scale. The LB method has been applied mostly to incompressible flows and simple geometries.

  2. Estimating annual bole biomass production using uncertainty analysis

    Treesearch

    Travis J. Woolley; Mark E. Harmon; Kari B. O' Connell

    2007-01-01

    Two common sampling methodologies coupled with a simple statistical model were evaluated to determine the accuracy and precision of annual bole biomass production (BBP) and inter-annual variability estimates using this type of approach. We performed an uncertainty analysis using Monte Carlo methods in conjunction with radial growth core data from trees in three Douglas...

  3. Statistical Aspects of Point Count Sampling

    Treesearch

    Richard J. Barker; John R. Sauer

    1995-01-01

    The dominant feature of point counts is that they do not census birds, but instead provide incomplete counts of individuals present within a survey plot. Considering a simple model for point count sampling, we demonstrate that use of these incomplete counts can bias estimators and testing procedures, leading to inappropriate conclusions. A large portion of the...

  4. Methods for Generating Complex Networks with Selected Structural Properties for Simulations: A Review and Tutorial for Neuroscientists

    PubMed Central

    Prettejohn, Brenton J.; Berryman, Matthew J.; McDonnell, Mark D.

    2011-01-01

    Many simulations of networks in computational neuroscience assume completely homogenous random networks of the Erdös–Rényi type, or regular networks, despite it being recognized for some time that anatomical brain networks are more complex in their connectivity and can, for example, exhibit the “scale-free” and “small-world” properties. We review the most well known algorithms for constructing networks with given non-homogeneous statistical properties and provide simple pseudo-code for reproducing such networks in software simulations. We also review some useful mathematical results and approximations associated with the statistics that describe these network models, including degree distribution, average path length, and clustering coefficient. We demonstrate how such results can be used as partial verification and validation of implementations. Finally, we discuss a sometimes overlooked modeling choice that can be crucially important for the properties of simulated networks: that of network directedness. The most well known network algorithms produce undirected networks, and we emphasize this point by highlighting how simple adaptations can instead produce directed networks. PMID:21441986
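
    The review's own pseudo-code is not reproduced in this record. As a minimal sketch of the directedness point, the function below generates an Erdős–Rényi random network either undirected (each unordered pair is tested once and mirrored) or directed (each ordered pair is tested independently). The function name, the adjacency-matrix representation and the `seed` parameter are assumptions made for illustration.

```python
import random


def erdos_renyi(n, p, directed=False, seed=None):
    """Return an n x n 0/1 adjacency matrix with connection probability p.

    Undirected: each unordered pair {i, j} is connected with probability p
    and the matrix is symmetric. Directed: each ordered pair (i, j), i != j,
    is connected independently with probability p. Self-connections are excluded.
    """
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if not directed and j < i:
                continue  # pair already handled when the roles were swapped
            if rng.random() < p:
                adj[i][j] = 1
                if not directed:
                    adj[j][i] = 1
    return adj


def mean_out_degree(adj):
    """Average number of outgoing connections per node."""
    return sum(sum(row) for row in adj) / len(adj)


undirected = erdos_renyi(200, 0.1, directed=False, seed=1)
directed = erdos_renyi(200, 0.1, directed=True, seed=1)
print(mean_out_degree(undirected), mean_out_degree(directed))
```

    Both variants have an expected mean degree of p(n − 1), which gives a quick partial check of an implementation of the kind the abstract recommends; only the directed variant allows a connection from i to j without the reverse connection.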

  5. MyPMFs: a simple tool for creating statistical potentials to assess protein structural models.

    PubMed

    Postic, Guillaume; Hamelryck, Thomas; Chomilier, Jacques; Stratmann, Dirk

    2018-05-29

    Evaluating the model quality of protein structures that evolve in environments with particular physicochemical properties requires scoring functions that are adapted to their specific residue compositions and/or structural characteristics. Thus, computational methods developed for structures from the cytosol cannot work properly on membrane or secreted proteins. Here, we present MyPMFs, an easy-to-use tool that allows users to train statistical potentials of mean force (PMFs) on the protein structures of their choice, with all parameters being adjustable. We demonstrate its use by creating an accurate statistical potential for transmembrane protein domains. We also show its usefulness to study the influence of the physical environment on residue interactions within protein structures. Our open-source software is freely available for download at https://github.com/bibip-impmc/mypmfs. Copyright © 2018. Published by Elsevier B.V.

  6. Quantifying predictability in a model with statistical features of the atmosphere

    PubMed Central

    Kleeman, Richard; Majda, Andrew J.; Timofeyev, Ilya

    2002-01-01

    The Galerkin truncated inviscid Burgers equation has recently been shown by the authors to be a simple model with many degrees of freedom, with many statistical properties similar to those occurring in dynamical systems relevant to the atmosphere. These properties include long time-correlated, large-scale modes of low frequency variability and short time-correlated “weather modes” at smaller scales. The correlation scaling in the model extends over several decades and may be explained by a simple theory. Here a thorough analysis of the nature of predictability in the idealized system is developed by using a theoretical framework developed by R.K. This analysis is based on a relative entropy functional that has been shown elsewhere by one of the authors to measure the utility of statistical predictions precisely. The analysis is facilitated by the fact that most relevant probability distributions are approximately Gaussian if the initial conditions are assumed to be so. Rather surprisingly this holds for both the equilibrium (climatological) and nonequilibrium (prediction) distributions. We find that in most cases the absolute difference in the first moments of these two distributions (the “signal” component) is the main determinant of predictive utility variations. Contrary to conventional belief in the ensemble prediction area, the dispersion of prediction ensembles is generally of secondary importance in accounting for variations in utility associated with different initial conditions. This conclusion has potentially important implications for practical weather prediction, where traditionally most attention has focused on dispersion and its variability. PMID:12429863

  7. Permutation glass.

    PubMed

    Williams, Mobolaji

    2018-01-01

    The field of disordered systems in statistical physics provides many simple models in which the competing influences of thermal and nonthermal disorder lead to new phases and nontrivial thermal behavior of order parameters. In this paper, we add a model to the subject by considering a disordered system where the state space consists of various orderings of a list. As in spin glasses, the disorder of such "permutation glasses" arises from a parameter in the Hamiltonian being drawn from a distribution of possible values, thus allowing nominally "incorrect orderings" to have lower energies than "correct orderings" in the space of permutations. We analyze a Gaussian, uniform, and symmetric Bernoulli distribution of energy costs, and, by employing Jensen's inequality, derive a simple condition requiring the permutation glass to always transition to the correctly ordered state at a temperature lower than that of the nondisordered system, provided that this correctly ordered state is accessible. We in turn find that in order for the correctly ordered state to be accessible, the probability that an incorrectly ordered component is energetically favored must be less than the inverse of the number of components in the system. We show that all of these results are consistent with a replica symmetric ansatz of the system. We conclude by arguing that there is no distinct permutation glass phase for the simplest model considered here and by discussing how to extend the analysis to more complex Hamiltonians capable of novel phase behavior and replica symmetry breaking. Finally, we outline an apparent correspondence between the presented system and a discrete-energy-level fermion gas. In all, the investigation introduces a class of exactly soluble models into statistical mechanics and provides a fertile ground to investigate statistical models of disorder.

  8. Application of statistical mechanical methods to the modeling of social networks

    NASA Astrophysics Data System (ADS)

    Strathman, Anthony Robert

    With the recent availability of large-scale social data sets, social networks have become open to quantitative analysis via the methods of statistical physics. We examine the statistical properties of a real large-scale social network, generated from cellular phone call-trace logs. We find this network, like many other social networks, to be assortative (r = 0.31) and clustered (i.e., strongly transitive, C = 0.21). We measure fluctuation scaling to identify the presence of internal structure in the network and find that structural inhomogeneity effectively disappears at the scale of a few hundred nodes, though there is no sharp cutoff. We introduce an agent-based model of social behavior, designed to model the formation and dissolution of social ties. The model is a modified Metropolis algorithm containing agents operating under the basic sociological constraints of reciprocity, communication need and transitivity. The model introduces the concept of a social temperature. We go on to show that this simple model reproduces the global statistical network features (incl. assortativity, connected fraction, mean degree, clustering, and mean shortest path length) of the real network data and undergoes two phase transitions, one from a "gas" to a "liquid" state and the second from a liquid to a glassy state, as a function of this social temperature.

  9. Predicting the Ability of Marine Mammal Populations to Compensate for Behavioral Disturbances

    DTIC Science & Technology

    2015-09-30

    approaches, including simple theoretical models as well as statistical analysis of data rich conditions. Building on models developed for PCoD [2,3], we...conditions is population trajectory most likely to be affected (the central aim of PCoD). For the revised model presented here, we include a population...averaged condition individuals (here used as a proxy for individual health as defined in PCoD), and E is the quality of the environment in which the

  10. Reliability Analysis of the Gradual Degradation of Semiconductor Devices.

    DTIC Science & Technology

    1983-07-20

    under the heading of linear models or linear statistical models. 3,4 We have not used this material in this report. Assuming catastrophic failure when...assuming a catastrophic model. In this treatment we first modify our system loss formula and then proceed to the actual analysis. II. ANALYSIS OF...Failure Time 1 Ti Ti 2 T2 T2 n Tn n and are easily analyzed by simple linear regression. Since we have assumed a log normal/Arrhenius activation

  11. Advanced statistics: linear regression, part II: multiple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
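
    The dependence of regression coefficients on the choice of predictors, stressed in the abstract above, can be shown with a small sketch. The data are synthetic and noise-free (y = 1 + 2*x1 - 3*x2 with correlated predictors), and the helper `fit_ols` is a hypothetical name for a plain normal-equations solver; none of this comes from the article itself.

```python
def fit_ols(X, y):
    """Least-squares coefficients [intercept, b1, ..., bk] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]            # prepend the intercept column
    k = len(rows[0])
    # Augmented system (A^T A | A^T y).
    M = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(rows, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

# Synthetic predictors that are positively correlated with each other.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1.0 + 2.0 * a - 3.0 * b for a, b in zip(x1, x2)]

multiple = fit_ols(list(zip(x1, x2)), y)    # both predictors
simple = fit_ols([(a,) for a in x1], y)     # x1 only, x2 omitted

print("multiple regression:", multiple)
print("simple regression on x1:", simple)
```

    With x2 omitted, the fitted slope on x1 turns negative even though its coefficient in the full model is +2, which is one way a univariate fit can be mathematically correct yet misleading.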

  12. Counts-in-cylinders in the Sloan Digital Sky Survey with Comparisons to N-body Simulations

    NASA Astrophysics Data System (ADS)

    Berrier, Heather D.; Barton, Elizabeth J.; Berrier, Joel C.; Bullock, James S.; Zentner, Andrew R.; Wechsler, Risa H.

    2011-01-01

    Environmental statistics provide a necessary means of comparing the properties of galaxies in different environments, and a vital test of models of galaxy formation within the prevailing hierarchical cosmological model. We explore counts-in-cylinders, a common statistic defined as the number of companions of a particular galaxy found within a given projected radius and redshift interval. Galaxy distributions with the same two-point correlation functions do not necessarily have the same companion count distributions. We use this statistic to examine the environments of galaxies in the Sloan Digital Sky Survey Data Release 4 (SDSS DR4). We also make preliminary comparisons to four models for the spatial distributions of galaxies, based on N-body simulations and data from SDSS DR4, to study the utility of the counts-in-cylinders statistic. There is a very large scatter between the number of companions a galaxy has and the mass of its parent dark matter halo and the halo occupation, limiting the utility of this statistic for certain kinds of environmental studies. We also show that prevalent empirical models of galaxy clustering, that match observed two- and three-point clustering statistics well, fail to reproduce some aspects of the observed distribution of counts-in-cylinders on 1, 3, and 6 h⁻¹ Mpc scales. All models that we explore underpredict the fraction of galaxies with few or no companions in 3 and 6 h⁻¹ Mpc cylinders. Roughly 7% of galaxies in the real universe are significantly more isolated within a 6 h⁻¹ Mpc cylinder than the galaxies in any of the models we use. Simple phenomenological models that map galaxies to dark matter halos fail to reproduce high-order clustering statistics in low-density environments.
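
    The counts-in-cylinders statistic defined in the abstract above can be sketched in a few lines. The version below makes simplifying assumptions that are not taken from the paper: flat-sky Cartesian positions, a line-of-sight velocity difference standing in for the redshift interval, and made-up toy galaxies.

```python
import math

def counts_in_cylinders(galaxies, radius, dv):
    """For each galaxy (x, y, v), count the other galaxies whose projected
    separation is <= radius and whose velocity difference is <= dv."""
    counts = []
    for i, (xi, yi, vi) in enumerate(galaxies):
        n = 0
        for j, (xj, yj, vj) in enumerate(galaxies):
            if i == j:
                continue
            close_on_sky = math.hypot(xi - xj, yi - yj) <= radius
            close_in_velocity = abs(vi - vj) <= dv
            if close_on_sky and close_in_velocity:
                n += 1
        counts.append(n)
    return counts

# Toy catalogue: (x, y) in arbitrary length units, v in km/s. Values are hypothetical.
toy = [(0.0, 0.0, 7000.0), (0.5, 0.0, 7200.0), (0.0, 0.8, 9000.0), (5.0, 5.0, 7100.0)]
print(counts_in_cylinders(toy, radius=1.0, dv=1000.0))   # prints [1, 1, 0, 0]
```

    The third galaxy is close to the first two on the sky but falls outside the velocity window, so it counts as isolated; this is the sense in which the cylinder, unlike a purely projected aperture, uses line-of-sight information.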

  13. Atmospheric Tracer Inverse Modeling Using Markov Chain Monte Carlo (MCMC)

    NASA Astrophysics Data System (ADS)

    Kasibhatla, P.

    2004-12-01

    In recent years, there has been an increasing emphasis on the use of Bayesian statistical estimation techniques to characterize the temporal and spatial variability of atmospheric trace gas sources and sinks. The applications have been varied in terms of the particular species of interest, as well as in terms of the spatial and temporal resolution of the estimated fluxes. However, one common characteristic has been the use of relatively simple statistical models for describing the measurement and chemical transport model error statistics and prior source statistics. For example, multivariate normal probability distribution functions (pdfs) are commonly used to model these quantities and inverse source estimates are derived for fixed values of pdf parameters. While the advantage of this approach is that closed form analytical solutions for the a posteriori pdfs of interest are available, it is worth exploring Bayesian analysis approaches which allow for a more general treatment of error and prior source statistics. Here, we present an application of the Markov Chain Monte Carlo (MCMC) methodology to an atmospheric tracer inversion problem to demonstrate how more general statistical models for errors can be incorporated into the analysis in a relatively straightforward manner. The MCMC approach to Bayesian analysis, which has found wide application in a variety of fields, is a statistical simulation approach that involves computing moments of interest of the a posteriori pdf by efficiently sampling this pdf. The specific inverse problem that we focus on is the annual mean CO2 source/sink estimation problem considered by the TransCom3 project. TransCom3 was a collaborative effort involving various modeling groups and followed a common modeling and analysis protocol. As such, this problem provides a convenient case study to demonstrate the applicability of the MCMC methodology to atmospheric tracer source/sink estimation problems.
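
    To make the sampling idea in the abstract above concrete, here is a minimal random-walk Metropolis sketch for a one-parameter toy inversion. Everything in it is an assumption for illustration (the sensitivities, observations, error levels, and function names); it is not the TransCom3 setup. The Gaussian case is chosen only because its posterior mean has a closed form to check against.

```python
import math
import random

# Toy inversion (all values hypothetical): y_k = h_k * s + noise, one unknown source s.
H = [1.0, 0.5, 2.0]          # transport-model sensitivities
OBS = [2.1, 0.9, 4.2]        # observed concentrations
SIGMA_OBS = 0.5              # observation/model error standard deviation
PRIOR_MEAN, PRIOR_SD = 0.0, 10.0

def log_posterior(s):
    """Unnormalised log posterior: Gaussian prior plus Gaussian likelihood."""
    log_prior = -0.5 * ((s - PRIOR_MEAN) / PRIOR_SD) ** 2
    log_like = sum(-0.5 * ((y - h * s) / SIGMA_OBS) ** 2 for y, h in zip(OBS, H))
    return log_prior + log_like

def metropolis(log_p, start, step, n_samples, seed=0):
    """Random-walk Metropolis sampler for a scalar parameter."""
    rng = random.Random(seed)
    x, lp = start, log_p(start)
    chain = []
    for _ in range(n_samples):
        cand = x + rng.gauss(0.0, step)
        lp_cand = log_p(cand)
        # Accept with probability min(1, p(cand) / p(x)).
        if rng.random() < math.exp(min(0.0, lp_cand - lp)):
            x, lp = cand, lp_cand
        chain.append(x)
    return chain

chain = metropolis(log_posterior, start=0.0, step=0.5, n_samples=20000)
kept = chain[2000:]                       # discard burn-in
mcmc_mean = sum(kept) / len(kept)

# Closed-form posterior mean for the Gaussian case, for comparison.
precision = 1.0 / PRIOR_SD ** 2 + sum(h * h for h in H) / SIGMA_OBS ** 2
exact_mean = (PRIOR_MEAN / PRIOR_SD ** 2
              + sum(h * y for h, y in zip(H, OBS)) / SIGMA_OBS ** 2) / precision
print(mcmc_mean, exact_mean)
```

    Only `log_posterior` encodes the error model, so a non-Gaussian likelihood or prior could be swapped in without touching the sampler, at the cost of losing the closed-form check.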

  14. Statistical representation of multiphase flow

    NASA Astrophysics Data System (ADS)

    Subramaniam

    2000-11-01

    The relationship between two common statistical representations of multiphase flow, namely, the single-point Eulerian statistical representation of two-phase flow (D. A. Drew, Ann. Rev. Fluid Mech. (15), 1983), and the Lagrangian statistical representation of a spray using the droplet distribution function (F. A. Williams, Phys. Fluids 1 (6), 1958) is established for spherical dispersed-phase elements. This relationship is based on recent work which relates the droplet distribution function to single-droplet pdfs starting from a Liouville description of a spray (Subramaniam, Phys. Fluids 10 (12), 2000). The Eulerian representation, which is based on a random-field model of the flow, is shown to contain different statistical information from the Lagrangian representation, which is based on a point-process model. The two descriptions are shown to be simply related for spherical, monodisperse elements in statistically homogeneous two-phase flow, whereas such a simple relationship is precluded by the inclusion of polydispersity and statistical inhomogeneity. The common origin of these two representations is traced to a more fundamental statistical representation of a multiphase flow, whose concepts derive from a theory for dense sprays recently proposed by Edwards (Atomization and Sprays 10 (3-5), 2000). The issue of what constitutes a minimally complete statistical representation of a multiphase flow is resolved.

  15. A Complementary Note to 'A Lag-1 Smoother Approach to System-Error Estimation': The Intrinsic Limitations of Residual Diagnostics

    NASA Technical Reports Server (NTRS)

    Todling, Ricardo

    2015-01-01

    Recently, this author studied an approach to the estimation of system error based on combining observation residuals derived from a sequential filter and fixed lag-1 smoother. While extending the methodology to a variational formulation, experimenting with simple models and making sure consistency was found between the sequential and variational formulations, the limitations of the residual-based approach came clearly to the surface. This note uses the sequential assimilation application to simple nonlinear dynamics to highlight the issue. Only when some of the underlying error statistics are assumed known is it possible to estimate the unknown component. In general, when considerable uncertainties exist in the underlying statistics as a whole, attempts to obtain separate estimates of the various error covariances are bound to lead to misrepresentation of errors. The conclusions are particularly relevant to present-day attempts to estimate observation-error correlations from observation residual statistics. A brief illustration of the issue is also provided by comparing estimates of error correlations derived from a quasi-operational assimilation system and a corresponding Observing System Simulation Experiments framework.

  16. Causality

    NASA Astrophysics Data System (ADS)

    Pearl, Judea

    2000-03-01

    Written by one of the pre-eminent researchers in the field, this book provides a comprehensive exposition of modern analysis of causation. It shows how causality has grown from a nebulous concept into a mathematical theory with significant applications in the fields of statistics, artificial intelligence, philosophy, cognitive science, and the health and social sciences. Pearl presents a unified account of the probabilistic, manipulative, counterfactual and structural approaches to causation, and devises simple mathematical tools for analyzing the relationships between causal connections, statistical associations, actions and observations. The book will open the way for including causal analysis in the standard curriculum of statistics, artificial intelligence, business, epidemiology, social science and economics. Students in these areas will find natural models, simple identification procedures, and precise mathematical definitions of causal concepts that traditional texts have tended to evade or make unduly complicated. This book will be of interest to professionals and students in a wide variety of fields. Anyone who wishes to elucidate meaningful relationships from data, predict effects of actions and policies, assess explanations of reported events, or form theories of causal understanding and causal speech will find this book stimulating and invaluable.

  17. Statistics of Optical Coherence Tomography Data From Human Retina

    PubMed Central

    de Juan, Joaquín; Ferrone, Claudia; Giannini, Daniela; Huang, David; Koch, Giorgio; Russo, Valentina; Tan, Ou; Bruni, Carlo

    2010-01-01

    Optical coherence tomography (OCT) has recently become one of the primary methods for noninvasive probing of the human retina. The pseudoimage formed by OCT (the so-called B-scan) varies probabilistically across pixels due to complexities in the measurement technique. Hence, sensitive automatic procedures of diagnosis using OCT may exploit statistical analysis of the spatial distribution of reflectance. In this paper, we perform a statistical study of retinal OCT data. We find that the stretched exponential probability density function can model well the distribution of intensities in OCT pseudoimages. Moreover, we show a small but significant correlation between neighboring pixels when measuring OCT intensities with pixels of about 5 µm. We then develop a simple joint probability model for the OCT data consistent with known retinal features. This model fits well the stretched exponential distribution of intensities and their spatial correlation. In normal retinas, fit parameters of this model are relatively constant along retinal layers, but vary across layers. However, in retinas with diabetic retinopathy, large spikes of parameter modulation interrupt the constancy within layers, exactly where pathologies are visible. We argue that these results give hope for improvement in statistical pathology-detection methods even when the disease is in its early stages. PMID:20304733

  18. Universality classes of fluctuation dynamics in hierarchical complex systems

    NASA Astrophysics Data System (ADS)

    Macêdo, A. M. S.; González, Iván R. Roa; Salazar, D. S. P.; Vasconcelos, G. L.

    2017-03-01

    A unified approach is proposed to describe the statistics of the short-time dynamics of multiscale complex systems. The probability density function of the relevant time series (signal) is represented as a statistical superposition of a large time-scale distribution weighted by the distribution of certain internal variables that characterize the slowly changing background. The dynamics of the background is formulated as a hierarchical stochastic model whose form is derived from simple physical constraints, which in turn restrict the dynamics to only two possible classes. The probability distributions of both the signal and the background have simple representations in terms of Meijer G functions. The two universality classes for the background dynamics manifest themselves in the signal distribution as two types of tails: power law and stretched exponential, respectively. A detailed analysis of empirical data from classical turbulence and financial markets shows excellent agreement with the theory.

  19. Is There a Critical Distance for Fickian Transport? - a Statistical Approach to Sub-Fickian Transport Modelling in Porous Media

    NASA Astrophysics Data System (ADS)

    Most, S.; Nowak, W.; Bijeljic, B.

    2014-12-01

    Transport processes in porous media are frequently simulated as particle movement. This process can be formulated as a stochastic process of particle position increments. At the pore scale, the geometry and micro-heterogeneities prohibit the commonly made assumption of independent and normally distributed increments to represent dispersion. Many recent particle methods seek to loosen this assumption. Recent experimental data suggest that we have not yet reached the end of the need to generalize, because particle increments show statistical dependency beyond linear correlation and over many time steps. The goal of this work is to better understand the validity regions of commonly made assumptions. We investigate after what transport distances we can observe: (1) a statistical dependence between increments, which can be modelled as an order-k Markov process, that boils down to order 1. This would be the Markovian distance for the process, where the validity of yet-unexplored non-Gaussian-but-Markovian random walks would start; (2) a bivariate statistical dependence that simplifies to a multi-Gaussian dependence based on simple linear correlation (validity of correlated PTRW); (3) complete absence of statistical dependence (validity of classical PTRW/CTRW). The approach is to derive a statistical model for pore-scale transport from a powerful experimental data set via copula analysis. The model is formulated as a non-Gaussian, mutually dependent Markov process of higher order, which allows us to investigate the validity ranges of simpler models.

  20. Entropic Repulsion Between Fluctuating Surfaces

    NASA Astrophysics Data System (ADS)

    Janke, W.

    The statistical mechanics of fluctuating surfaces plays an important role in a variety of physical systems, ranging from biological membranes to world sheets of strings in theories of fundamental interactions. In many applications it is a good approximation to assume that the surfaces possess no tension. Their statistical properties are then governed by curvature energies only, which allow for gigantic out-of-plane undulations. These fluctuations are the “entropic” origin of long-range repulsive forces in layered surface systems. Theoretical estimates of these forces for simple model surfaces are surveyed and compared with recent Monte Carlo simulations.

  1. Damage and strength of composite materials: Trends, predictions, and challenges

    NASA Technical Reports Server (NTRS)

    Obrien, T. Kevin

    1994-01-01

    Research on damage mechanisms and ultimate strength of composite materials relevant to scaling issues will be addressed in this viewgraph presentation. The use of fracture mechanics and Weibull statistics to predict scaling effects for the onset of isolated damage mechanisms will be highlighted. The ability of simple fracture mechanics models to predict trends that are useful in parametric or preliminary design studies will be reviewed. The limitations of these simple models for complex loading conditions will also be noted. The difficulty in developing generic criteria for the growth of these mechanisms needed in progressive damage models to predict strength will be addressed. A specific example for a problem where failure is a direct consequence of progressive delamination will be explored. A damage threshold/fail-safety concept for addressing composite damage tolerance will be discussed.

  2. Statistics of Shared Components in Complex Component Systems

    NASA Astrophysics Data System (ADS)

    Mazzolini, Andrea; Gherardi, Marco; Caselle, Michele; Cosentino Lagomarsino, Marco; Osella, Matteo

    2018-04-01

    Many complex systems are modular. Such systems can be represented as "component systems," i.e., sets of elementary components, such as LEGO bricks in LEGO sets. The bricks found in a LEGO set reflect a target architecture, which can be built following a set-specific list of instructions. In other component systems, instead, the underlying functional design and constraints are not obvious a priori, and their detection is often a challenge of both scientific and practical importance, requiring a clear understanding of component statistics. Importantly, some quantitative invariants appear to be common to many component systems, most notably a common broad distribution of component abundances, which often resembles the well-known Zipf's law. Such "laws" affect in a general and nontrivial way the component statistics, potentially hindering the identification of system-specific functional constraints or generative processes. Here, we specifically focus on the statistics of shared components, i.e., the distribution of the number of components shared by different system realizations, such as the common bricks found in different LEGO sets. To account for the effects of component heterogeneity, we consider a simple null model, which builds system realizations by random draws from a universe of possible components. Under general assumptions on abundance heterogeneity, we provide analytical estimates of component occurrence, which quantify exhaustively the statistics of shared components. Surprisingly, this simple null model can positively explain important features of empirical component-occurrence distributions obtained from large-scale data on bacterial genomes, LEGO sets, and book chapters. Specific architectural features and functional constraints can be detected from occurrence patterns as deviations from these null predictions, as we show for the illustrative case of the "core" genome in bacteria.
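
    A null model of the kind described in the abstract above can be sketched as follows. The sketch assumes, for illustration only, a Zipf-like abundance law with exponent 1, realizations built by independent draws with replacement, and arbitrary sizes; under those assumptions the chance that a component with probability p occurs in a realization of M draws is 1 - (1 - p)^M, and the expected number of components shared by two independent realizations is the sum of the squared occurrence probabilities.

```python
import random

def zipf_probs(n_components, exponent=1.0):
    """Zipf-like abundance probabilities over component ranks 1..n_components."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_components + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def draw_realization(probs, size, rng):
    """One system realization: the set of distinct components among `size` draws."""
    return set(rng.choices(range(len(probs)), weights=probs, k=size))

rng = random.Random(42)
N_COMPONENTS, SIZE, N_REALIZATIONS = 500, 100, 2000   # hypothetical sizes
probs = zipf_probs(N_COMPONENTS)
realizations = [draw_realization(probs, SIZE, rng) for _ in range(N_REALIZATIONS)]

# Occurrence: fraction of realizations that contain each component.
occurrence_sim = [sum(i in r for r in realizations) / N_REALIZATIONS
                  for i in range(N_COMPONENTS)]
occurrence_theory = [1.0 - (1.0 - p) ** SIZE for p in probs]

# Shared components between independent pairs of realizations.
pairs = list(zip(realizations[0::2], realizations[1::2]))
shared_sim = sum(len(a & b) for a, b in pairs) / len(pairs)
shared_theory = sum(o * o for o in occurrence_theory)

print(shared_sim, shared_theory)
```

    Deviations of empirical occurrence patterns from predictions of this type are what the abstract proposes to read as signatures of system-specific constraints.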

  3. Does solar activity affect human happiness?

    NASA Astrophysics Data System (ADS)

    Kristoufek, Ladislav

    2018-03-01

    We investigate the direct influence of solar activity (represented by sunspot numbers) on human happiness (represented by the Twitter-based Happiness Index). We construct four models controlling for various statistical and dynamic effects of the analyzed series. The final model gives promising results. First, there is a statistically significant negative influence of solar activity on happiness which holds even after controlling for the other factors. Second, the final model, which is still rather simple, explains around 75% of the variance of the Happiness Index. Third, our control variables contribute significantly as well: happiness is higher on days with no sunspots, happiness is strongly persistent, there are strong intra-week cycles and happiness peaks during holidays. Our results strongly contribute to the topical literature and they provide evidence of unique utility of the online data.

  4. Vehicle track segmentation using higher order random fields

    DOE PAGES

    Quach, Tu -Thach

    2017-01-09

    Here, we present an approach to segment vehicle tracks in coherent change detection images, a product of combining two synthetic aperture radar images taken at different times. The approach uses multiscale higher order random field models to capture track statistics, such as curvatures and their parallel nature, that are not currently utilized in existing methods. These statistics are encoded as 3-by-3 patterns at different scales. The model can complete disconnected tracks often caused by sensor noise and various environmental effects. Coupling the model with a simple classifier, our approach is effective at segmenting salient tracks. We improve the F-measure on a standard vehicle track data set to 0.963, up from 0.897 obtained by the current state-of-the-art method.

  5. Endogenous time-varying risk aversion and asset returns.

    PubMed

    Berardi, Michele

    2016-01-01

    Stylized facts about statistical properties for short horizon returns in financial markets have been identified in the literature, but a satisfactory understanding for their manifestation is yet to be achieved. In this work, we show that a simple asset pricing model with representative agent is able to generate time series of returns that replicate such stylized facts if the risk aversion coefficient is allowed to change endogenously over time in response to unexpected excess returns under evolutionary forces. The same model, under constant risk aversion, would instead generate returns that are essentially Gaussian. We conclude that an endogenous time-varying risk aversion represents a very parsimonious way to make the model match real data on key statistical properties, and therefore deserves careful consideration from economists and practitioners alike.

  6. Vehicle track segmentation using higher order random fields

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Quach, Tu -Thach

    Here, we present an approach to segment vehicle tracks in coherent change detection images, a product of combining two synthetic aperture radar images taken at different times. The approach uses multiscale higher order random field models to capture track statistics, such as curvatures and their parallel nature, that are not currently utilized in existing methods. These statistics are encoded as 3-by-3 patterns at different scales. The model can complete disconnected tracks often caused by sensor noise and various environmental effects. Coupling the model with a simple classifier, our approach is effective at segmenting salient tracks. We improve the F-measure on a standard vehicle track data set to 0.963, up from 0.897 obtained by the current state-of-the-art method.

  7. Results of the Sea Ice Model Intercomparison Project: Evaluation of sea ice rheology schemes for use in climate simulations

    NASA Astrophysics Data System (ADS)

    Kreyscher, Martin; Harder, Markus; Lemke, Peter; Flato, Gregory M.

    2000-05-01

    A hierarchy of sea ice rheologies is evaluated on the basis of a comprehensive set of observational data. The investigations are part of the Sea Ice Model Intercomparison Project (SIMIP). Four different sea ice rheology schemes are compared: a viscous-plastic rheology, a cavitating-fluid model, a compressible Newtonian fluid, and a simple free drift approach with velocity correction. The same grid, land boundaries, and forcing fields are applied to all models. As verification data, there are (1) ice thickness data from upward looking sonars (ULS), (2) ice concentration data from the passive microwave radiometers SMMR and SSM/I, (3) daily buoy drift data obtained by the International Arctic Buoy Program (IABP), and (4) satellite-derived ice drift fields based on the 85 GHz channel of SSM/I. All models are optimized individually with respect to mean drift speed and daily drift speed statistics. The impact of ice strength on the ice cover is best revealed by the spatial pattern of ice thickness, ice drift on different timescales, daily drift speed statistics, and the drift velocities in Fram Strait. Overall, the viscous-plastic rheology yields the most realistic simulation. In contrast, the results of the very simple free-drift model with velocity correction clearly show large errors in simulated ice drift as well as in ice thicknesses and ice export through Fram Strait compared to observation. The compressible Newtonian fluid cannot prevent excessive ice thickness buildup in the central Arctic and overestimates the internal forces in Fram Strait. Because of the lack of shear strength, the cavitating-fluid model shows marked differences to the statistics of observed ice drift and the observed spatial pattern of ice thickness. Comparison of required computer resources demonstrates that the additional cost for the viscous-plastic sea ice rheology is minor compared with the atmospheric and oceanic model components in global climate simulations.

  8. The accuracy of matrix population model projections for coniferous trees in the Sierra Nevada, California

    USGS Publications Warehouse

    van Mantgem, P.J.; Stephenson, N.L.

    2005-01-01

    1 We assess the use of simple, size-based matrix population models for projecting population trends for six coniferous tree species in the Sierra Nevada, California. We used demographic data from 16 673 trees in 15 permanent plots to create 17 separate time-invariant, density-independent population projection models, and determined differences between trends projected from initial surveys with a 5-year interval and observed data during two subsequent 5-year time steps. 2 We detected departures from the assumptions of the matrix modelling approach in terms of strong growth autocorrelations. We also found evidence of observation errors for measurements of tree growth and, to a more limited degree, recruitment. Loglinear analysis provided evidence of significant temporal variation in demographic rates for only two of the 17 populations. 3 Total population sizes were strongly predicted by model projections, although population dynamics were dominated by carryover from the previous 5-year time step (i.e. there were few cases of recruitment or death). Fractional changes to overall population sizes were less well predicted. Compared with a null model and a simple demographic model lacking size structure, matrix model projections were better able to predict total population sizes, although the differences were not statistically significant. Matrix model projections were also able to predict short-term rates of survival, growth and recruitment. Mortality frequencies were not well predicted. 4 Our results suggest that simple size-structured models can accurately project future short-term changes for some tree populations. However, not all populations were well predicted and these simple models would probably become more inaccurate over longer projection intervals. The predictive ability of these models would also be limited by disturbance or other events that destabilize demographic rates. © 2005 British Ecological Society.
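
    A time-invariant, density-independent, size-structured projection of the kind assessed above amounts to repeated multiplication of a size-class vector by a fixed matrix, n(t+1) = A n(t). The sketch below uses three size classes with invented per-step transition rates and invented initial counts; none of the numbers come from the study.

```python
def project(A, n, steps=1):
    """Apply n(t+1) = A n(t) for the given number of time steps."""
    for _ in range(steps):
        n = [sum(a * x for a, x in zip(row, n)) for row in A]
    return n

def growth_rate(A, iters=500):
    """Asymptotic growth rate (dominant eigenvalue) by power iteration."""
    n = [1.0] * len(A)
    lam = 1.0
    for _ in range(iters):
        m = project(A, n)
        lam = sum(m) / sum(n)
        total = sum(m)
        n = [x / total for x in m]       # renormalise to avoid under/overflow
    return lam

# Hypothetical rates per 5-year step for small, medium and large trees.
A = [
    [0.90, 0.00, 0.05],   # small: remain small; recruits produced by large trees
    [0.06, 0.93, 0.00],   # medium: growth from small; remain medium
    [0.00, 0.04, 0.96],   # large: growth from medium; remain large
]
n0 = [500.0, 300.0, 200.0]

n1 = project(A, n0)              # one 5-year step
n2 = project(A, n0, steps=2)     # two 5-year steps
print(n1, sum(n1), sum(n2), growth_rate(A))
```

    Because the diagonal (carryover) entries dominate, most of the projected total comes from trees that simply stay in their class, which mirrors the abstract's remark that short-term totals are dominated by carryover.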

  9. Guidelines and Procedures for Computing Time-Series Suspended-Sediment Concentrations and Loads from In-Stream Turbidity-Sensor and Streamflow Data

    USGS Publications Warehouse

    Rasmussen, Patrick P.; Gray, John R.; Glysson, G. Douglas; Ziegler, Andrew C.

    2009-01-01

    In-stream continuous turbidity and streamflow data, calibrated with measured suspended-sediment concentration data, can be used to compute a time series of suspended-sediment concentration and load at a stream site. Development of a simple linear (ordinary least squares) regression model for computing suspended-sediment concentrations from instantaneous turbidity data is the first step in the computation process. If the model standard percentage error (MSPE) of the simple linear regression model meets a minimum criterion, this model should be used to compute a time series of suspended-sediment concentrations. Otherwise, a multiple linear regression model using paired instantaneous turbidity and streamflow data is developed and compared to the simple regression model. If the inclusion of the streamflow variable proves to be statistically significant and the uncertainty associated with the multiple regression model results in an improvement over that for the simple linear model, the turbidity-streamflow multiple linear regression model should be used to compute a suspended-sediment concentration time series. The computed concentration time series is subsequently used with its paired streamflow time series to compute suspended-sediment loads by standard U.S. Geological Survey techniques. Once an acceptable regression model is developed, it can be used to compute suspended-sediment concentration beyond the period of record used in model development with proper ongoing collection and analysis of calibration samples. Regression models to compute suspended-sediment concentrations are generally site specific and should never be considered static, but they represent a set period in a continually dynamic system in which additional data will help verify any change in sediment load, type, and source.
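
    The model-selection step described above (simple turbidity regression first, then a turbidity-streamflow multiple regression if needed) might be sketched as follows. This is a minimal illustration on synthetic data: the variable names, the coefficients used to generate the data, and the use of residual standard error as the comparison statistic are assumptions, not the report's MSPE criterion or its actual procedure.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept.

    Returns the coefficient vector and the residual standard error.
    """
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    rse = np.sqrt(resid @ resid / dof)
    return beta, rse

# Synthetic calibration samples (hypothetical units and values).
rng = np.random.default_rng(0)
n = 200
turbidity = rng.uniform(5, 200, n)
streamflow = rng.uniform(10, 100, n)
ssc = 2.0 + 1.5 * turbidity + 0.3 * streamflow + rng.normal(0, 1.0, n)

# Step 1: simple linear regression on turbidity alone.
beta_simple, rse_simple = fit_ols(turbidity, ssc)

# Step 2: multiple linear regression on turbidity and streamflow.
beta_multi, rse_multi = fit_ols(np.column_stack([turbidity, streamflow]), ssc)

print(rse_simple, rse_multi)
```

    In this synthetic case streamflow genuinely contributes to the response, so the multiple regression shows the smaller residual error; with real calibration data the comparison could go either way.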

  10. Evaluating statistical consistency in the ocean model component of the Community Earth System Model (pyCECT v2.0)

    NASA Astrophysics Data System (ADS)

    Baker, Allison H.; Hu, Yong; Hammerling, Dorit M.; Tseng, Yu-heng; Xu, Haiying; Huang, Xiaomeng; Bryan, Frank O.; Yang, Guangwen

    2016-07-01

    The Parallel Ocean Program (POP), the ocean model component of the Community Earth System Model (CESM), is widely used in climate research. Most current work in CESM-POP focuses on improving the model's efficiency or accuracy, such as improving numerical methods, advancing parameterization, porting to new architectures, or increasing parallelism. Since ocean dynamics are chaotic in nature, achieving bit-for-bit (BFB) identical results in ocean solutions cannot be guaranteed for even tiny code modifications, and determining whether modifications are admissible (i.e., statistically consistent with the original results) is non-trivial. In recent work, an ensemble-based statistical approach was shown to work well for software verification (i.e., quality assurance) on atmospheric model data. The general idea of the ensemble-based statistical consistency testing is to use a qualitative measurement of the variability of the ensemble of simulations as a metric with which to compare future simulations and make a determination of statistical distinguishability. The capability to determine consistency without BFB results boosts model confidence and provides the flexibility needed, for example, for more aggressive code optimizations and the use of heterogeneous execution environments. Since ocean and atmosphere models have differing characteristics in terms of dynamics, spatial variability, and timescales, we present a new statistical method to evaluate ocean model simulation data that requires the evaluation of ensemble means and deviations in a spatial manner. In particular, the statistical distribution from an ensemble of CESM-POP simulations is used to determine the standard score of any new model solution at each grid point. Then the percentage of points that have scores greater than a specified threshold indicates whether the new model simulation is statistically distinguishable from the ensemble simulations. Both ensemble size and composition are important. Our experiments indicate that the new POP ensemble consistency test (POP-ECT) tool is capable of distinguishing cases that should be statistically consistent with the ensemble and those that should not, as well as providing a simple, objective and systematic way to detect errors in CESM-POP due to the hardware or software stack, positively contributing to quality assurance for the CESM-POP code.
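
    The grid-point test described in this abstract (a standard score per grid point, then the fraction of points above a threshold) could look roughly like the sketch below. It is not the pyCECT implementation: the function name, the threshold of 3.0, the array shapes, and the synthetic fields are all assumptions made for illustration.

```python
import numpy as np

def fraction_outside(ensemble, new_run, z_threshold=3.0):
    """Fraction of grid points where the new run's standard score exceeds the threshold.

    ensemble: array of shape (members, ny, nx)
    new_run:  array of shape (ny, nx)
    """
    mean = ensemble.mean(axis=0)
    std = ensemble.std(axis=0, ddof=1)  # sample standard deviation per grid point
    z = np.abs(new_run - mean) / std    # assumes std > 0 at every grid point
    return float(np.mean(z > z_threshold))

# Synthetic stand-in for an ensemble of ocean fields (hypothetical sizes).
rng = np.random.default_rng(1)
ensemble = rng.normal(0.0, 1.0, size=(40, 20, 30))

consistent_run = rng.normal(0.0, 1.0, size=(20, 30))  # same distribution as the ensemble
perturbed_run = consistent_run + 5.0                  # systematically shifted field

frac_consistent = fraction_outside(ensemble, consistent_run)
frac_perturbed = fraction_outside(ensemble, perturbed_run)
print(frac_consistent, frac_perturbed)
```

    A pass/fail decision would then compare this fraction with an allowed tolerance; choosing that tolerance, and handling land-masked or zero-variance points, is left out of the sketch.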

  11. Statistical distributions of avalanche size and waiting times in an inter-sandpile cascade model

    NASA Astrophysics Data System (ADS)

    Batac, Rene; Longjas, Anthony; Monterola, Christopher

    2012-02-01

    Sandpile-based models have successfully shed light on key features of nonlinear relaxational processes in nature, particularly the occurrence of fat-tailed magnitude distributions and exponential return times, from simple local stress redistributions. In this work, we extend the existing sandpile paradigm into an inter-sandpile cascade, wherein the avalanches emanating from a uniformly-driven sandpile (first layer) are used to trigger the next (second layer), and so on, in a successive fashion. Statistical characterizations reveal that avalanche size distributions evolve from a power law $p(S) \approx S^{-1.3}$ for the first layer to gamma distributions $p(S) \approx S^{\alpha} \exp(-S/S_0)$ for layers far away from the uniformly driven sandpile. The resulting avalanche size statistics is found to be associated with the corresponding waiting time distribution, as explained in an accompanying analytic formulation. Interestingly, both the numerical and analytic models show good agreement with actual inventories of non-uniformly driven events in nature.

  12. Langevin modelling of high-frequency Hang-Seng index data

    NASA Astrophysics Data System (ADS)

    Tang, Lei-Han

    2003-06-01

    Accurate statistical characterization of financial time series, such as compound stock indices, foreign currency exchange rates, etc., is fundamental to investment risk management, pricing of derivative products and financial decision making. Traditionally, such data were analyzed and modeled from a purely statistical point of view, with little concern for the specifics of financial markets. Increasingly, however, attention has been paid to the underlying economic forces and the collective behavior of investors. Here we summarize a novel approach to the statistical modeling of a major stock index (the Hang Seng index). Based on mathematical results previously derived in the fluid turbulence literature, we show that a Langevin equation with a variable noise amplitude correctly reproduces the ubiquitous fat tails in the probability distribution of intra-day price moves. The form of the Langevin equation suggests that, despite the extremely complex nature of financial concerns and investment strategies at the individual's level, there exist simple universal rules governing the high-frequency price move in a stock market.

  13. SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.

    PubMed

    Chu, Annie; Cui, Jenny; Dinov, Ivo D

    2009-03-01

    The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most updated information and newly added models.

  14. A Simple Effect Size Estimator for Single Case Designs Using WinBUGS

    ERIC Educational Resources Information Center

    Rindskopf, David; Shadish, William; Hedges, Larry

    2012-01-01

    Data from single case designs (SCDs) have traditionally been analyzed by visual inspection rather than statistical models. As a consequence, effect sizes have been of little interest. Lately, some effect-size estimators have been proposed, but most are either (i) nonparametric, and/or (ii) based on an analogy incompatible with effect sizes from…

  15. Keeping Things Simple: Why the Human Development Index Should Not Diverge from Its Equal Weights Assumption

    ERIC Educational Resources Information Center

    Stapleton, Lee M.; Garrod, Guy D.

    2007-01-01

    Using a range of statistical criteria rooted in Information Theory we show that there is little justification for relaxing the equal weights assumption underlying the United Nation's Human Development Index (HDI) even if the true HDI diverges significantly from this assumption. Put differently, the additional model complexity that unequal weights…

  16. Benefits of statistical molecular design, covariance analysis, and reference models in QSAR: a case study on acetylcholinesterase

    NASA Astrophysics Data System (ADS)

    Andersson, C. David; Hillgren, J. Mikael; Lindgren, Cecilia; Qian, Weixing; Akfur, Christine; Berg, Lotta; Ekström, Fredrik; Linusson, Anna

    2015-03-01

    Scientific disciplines such as medicinal- and environmental chemistry, pharmacology, and toxicology deal with the questions related to the effects small organic compounds exert on biological targets and the compounds' physicochemical properties responsible for these effects. A common strategy in this endeavor is to establish structure-activity relationships (SARs). The aim of this work was to illustrate benefits of performing a statistical molecular design (SMD) and proper statistical analysis of the molecules' properties before SAR and quantitative structure-activity relationship (QSAR) analysis. Our SMD followed by synthesis yielded a set of inhibitors of the enzyme acetylcholinesterase (AChE) that had very few inherent dependencies between the substructures in the molecules. If such dependencies exist, they cause severe errors in SAR interpretation and predictions by QSAR-models, and leave a set of molecules less suitable for future decision-making. In our study, SAR- and QSAR models could show which molecular sub-structures and physicochemical features were advantageous for the AChE inhibition. Finally, the QSAR model was used for the prediction of the inhibition of AChE by an external prediction set of molecules. The accuracy of these predictions was asserted by statistical significance tests and by comparisons to simple but relevant reference models.

  17. Health belief model and reasoned action theory in predicting water saving behaviors in Yazd, Iran.

    PubMed

    Morowatisharifabad, Mohammad Ali; Momayyezi, Mahdieh; Ghaneian, Mohammad Taghi

    2012-01-01

    People's behaviors and intentions about healthy behaviors depend on their beliefs, values, and knowledge about the issue. Various models of health education are used in determining predictors of different healthy behaviors, but their efficacy in cultural behaviors, such as water saving behaviors, has not been studied. The study was conducted to explain water saving behaviors in Yazd, Iran on the basis of Health Belief Model and Reasoned Action Theory. The cross-sectional study used random cluster sampling to recruit 200 heads of households to collect the data. The survey questionnaire was tested for its content validity and reliability. Analysis of data included descriptive statistics, simple correlation, and hierarchical multiple regression. Simple correlations between water saving behaviors and Reasoned Action Theory and Health Belief Model constructs were statistically significant. Health Belief Model and Reasoned Action Theory constructs explained 20.80% and 8.40% of the variances in water saving behaviors, respectively. Perceived barriers were the strongest predictor. Additionally, there was a statistically positive correlation between water saving behaviors and intention. In designing interventions aimed at water waste prevention, barriers of water saving behaviors should be addressed first, followed by people's attitude towards water saving. Health Belief Model constructs, with the exception of perceived severity and benefits, are more powerful than Reasoned Action Theory in predicting water saving behavior and may be used as a framework for educational interventions aimed at improving water saving behaviors.

  18. Health Belief Model and Reasoned Action Theory in Predicting Water Saving Behaviors in Yazd, Iran

    PubMed Central

    Morowatisharifabad, Mohammad Ali; Momayyezi, Mahdieh; Ghaneian, Mohammad Taghi

    2012-01-01

    Background: People's behaviors and intentions about healthy behaviors depend on their beliefs, values, and knowledge about the issue. Various models of health education are used in determining predictors of different healthy behaviors, but their efficacy in cultural behaviors, such as water saving behaviors, has not been studied. The study was conducted to explain water saving behaviors in Yazd, Iran on the basis of Health Belief Model and Reasoned Action Theory. Methods: The cross-sectional study used random cluster sampling to recruit 200 heads of households to collect the data. The survey questionnaire was tested for its content validity and reliability. Analysis of data included descriptive statistics, simple correlation, and hierarchical multiple regression. Results: Simple correlations between water saving behaviors and Reasoned Action Theory and Health Belief Model constructs were statistically significant. Health Belief Model and Reasoned Action Theory constructs explained 20.80% and 8.40% of the variances in water saving behaviors, respectively. Perceived barriers were the strongest predictor. Additionally, there was a statistically positive correlation between water saving behaviors and intention. Conclusion: In designing interventions aimed at water waste prevention, barriers of water saving behaviors should be addressed first, followed by people's attitude towards water saving. Health Belief Model constructs, with the exception of perceived severity and benefits, are more powerful than Reasoned Action Theory in predicting water saving behavior and may be used as a framework for educational interventions aimed at improving water saving behaviors. PMID:24688927

  19. Differential gene expression detection and sample classification using penalized linear regression models.

    PubMed

    Wu, Baolin

    2006-02-15

    Differential gene expression detection and sample classification using microarray data have received much research interest recently. Owing to the large number of genes p and small number of samples n (p > n), microarray data analysis poses big challenges for statistical analysis. An obvious problem owing to the 'large p small n' is over-fitting. Just by chance, we are likely to find some non-differentially expressed genes that can classify the samples very well. The idea of shrinkage is to regularize the model parameters to reduce the effects of noise and produce reliable inferences. Shrinkage has been successfully applied in the microarray data analysis. The SAM statistics proposed by Tusher et al. and the 'nearest shrunken centroid' proposed by Tibshirani et al. are ad hoc shrinkage methods. Both methods are simple, intuitive and prove to be useful in empirical studies. Recently Wu proposed the penalized t/F-statistics with shrinkage by formally using the L1-penalized linear regression models for two-class microarray data, showing good performance. In this paper we systematically discussed the use of penalized regression models for analyzing microarray data. We generalize the two-class penalized t/F-statistics proposed by Wu to multi-class microarray data. We formally derive the ad hoc shrunken centroid used by Tibshirani et al. using the L1-penalized regression models. And we show that the penalized linear regression models provide a rigorous and unified statistical framework for sample classification and differential gene expression detection.
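
    To make the idea of shrinkage concrete: for an L1 penalty with an orthonormal design, the penalized least-squares estimate is the soft-thresholded unpenalized estimate. The sketch below shows only that generic soft-thresholding operator on hypothetical per-gene statistics; it is not the penalized t/F-statistic of the paper, and the values of `d` and `lam` are made up for illustration.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink each value toward zero by lam, setting small values exactly to zero."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Hypothetical standardized differences for four genes.
d = np.array([2.5, -0.3, 0.8, -1.7])
lam = 1.0  # hypothetical penalty level

shrunk = soft_threshold(d, lam)
selected = np.flatnonzero(shrunk)  # indices of genes that survive the shrinkage
print(shrunk, selected)
```

    Genes whose statistic falls below the penalty level drop out entirely, which is how this kind of shrinkage limits over-fitting when p is much larger than n.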

  20. Molecular vibrational energy flow

    NASA Astrophysics Data System (ADS)

    Gruebele, M.; Bigwood, R.

    This article reviews some recent work in molecular vibrational energy flow (IVR), with emphasis on our own computational and experimental studies. We consider the problem in various representations, and use these to develop a family of simple models which combine specific molecular properties (e.g. size, vibrational frequencies) with statistical properties of the potential energy surface and wavefunctions. This marriage of molecular detail and statistical simplification captures trends of IVR mechanisms and survival probabilities beyond the abilities of purely statistical models or the computational limitations of full ab initio approaches. Of particular interest is IVR in the intermediate time regime, where heavy-atom skeletal modes take over the IVR process from hydrogenic motions even upon X-H bond excitation. Experiments and calculations on prototype heavy-atom systems show that intermediate time IVR differs in many aspects from the early stages of hydrogenic mode IVR. As a result, IVR can be coherently frozen, with potential applications to selective chemistry.

  1. A computational visual saliency model based on statistics and machine learning.

    PubMed

    Lin, Ru-Je; Lin, Wei-Song

    2014-08-01

    Identifying the type of stimuli that attracts human visual attention has been an appealing topic for scientists for many years. In particular, marking the salient regions in images is useful for both psychologists and many computer vision applications. In this paper, we propose a computational approach for producing saliency maps using statistics and machine learning methods. Based on four assumptions, three properties (Feature-Prior, Position-Prior, and Feature-Distribution) can be derived and combined by a simple intersection operation to obtain a saliency map. These properties are implemented by a similarity computation, support vector regression (SVR) technique, statistical analysis of training samples, and information theory using low-level features. This technique is able to learn the preferences of human visual behavior while simultaneously considering feature uniqueness. Experimental results show that our approach performs better in predicting human visual attention regions than 12 other models in two test databases. © 2014 ARVO.

  2. The use of analysis of variance procedures in biological studies

    USGS Publications Warehouse

    Williams, B.K.

    1987-01-01

    The analysis of variance (ANOVA) is widely used in biological studies, yet there remains considerable confusion among researchers about the interpretation of hypotheses being tested. Ambiguities arise when statistical designs are unbalanced, and in particular when not all combinations of design factors are represented in the data. This paper clarifies the relationship among hypothesis testing, statistical modelling and computing procedures in ANOVA for unbalanced data. A simple two-factor fixed effects design is used to illustrate three common parametrizations for ANOVA models, and some associations among these parametrizations are developed. Biologically meaningful hypotheses for main effects and interactions are given in terms of each parametrization, and procedures for testing the hypotheses are described. The standard statistical computing procedures in ANOVA are given along with their corresponding hypotheses. Throughout the development unbalanced designs are assumed and attention is given to problems that arise with missing cells.

  3. Why the Long Face? The Mechanics of Mandibular Symphysis Proportions in Crocodiles

    PubMed Central

    Walmsley, Christopher W.; Smits, Peter D.; Quayle, Michelle R.; McCurry, Matthew R.; Richards, Heather S.; Oldfield, Christopher C.; Wroe, Stephen; Clausen, Phillip D.; McHenry, Colin R.

    2013-01-01

    Background: Crocodilians exhibit a spectrum of rostral shape from long snouted (longirostrine), through to short snouted (brevirostrine) morphologies. The proportional length of the mandibular symphysis correlates consistently with rostral shape, forming as much as 50% of the mandible’s length in longirostrine forms, but 10% in brevirostrine crocodilians. Here we analyse the structural consequences of an elongate mandibular symphysis in relation to feeding behaviours. Methods/Principal Findings: Simple beam and high resolution Finite Element (FE) models of seven species of crocodile were analysed under loads simulating biting, shaking and twisting. Using beam theory, we statistically compared multiple hypotheses of which morphological variables should control the biomechanical response. Brevi- and mesorostrine morphologies were found to consistently outperform longirostrine types when subject to equivalent biting, shaking and twisting loads. The best predictors of performance for biting and twisting loads in FE models were overall length and symphyseal length respectively; for shaking loads symphyseal length and a multivariate measurement of shape (PC1, which is strongly but not exclusively correlated with symphyseal length) were equally good predictors. Linear measurements were better predictors than multivariate measurements of shape in biting and twisting loads. For both biting and shaking loads but not for twisting, simple beam models agree with best performance predictors in FE models. Conclusions/Significance: Combining beam and FE modelling allows a priori hypotheses about the importance of morphological traits on biomechanics to be statistically tested. Short mandibular symphyses perform well under loads used for feeding upon large prey, but elongate symphyses incur high strains under equivalent loads, underlining the structural constraints to prey size in the longirostrine morphotype. The biomechanics of the crocodilian mandible are largely consistent with beam theory and can be predicted from simple morphological measurements, suggesting that crocodilians are a useful model for investigating the palaeobiomechanics of other aquatic tetrapods. PMID:23342027

  4. Statistical Approaches for Spatiotemporal Prediction of Low Flows

    NASA Astrophysics Data System (ADS)

    Fangmann, A.; Haberlandt, U.

    2017-12-01

    An adequate assessment of regional climate change impacts on streamflow requires the integration of various sources of information and modeling approaches. This study proposes simple statistical tools for inclusion into model ensembles, which are fast and straightforward in their application, yet able to yield accurate streamflow predictions in time and space. Target variables for all approaches are annual low flow indices derived from a data set of 51 records of average daily discharge for northwestern Germany. The models require input of climatic data in the form of meteorological drought indices, derived from observed daily climatic variables, averaged over the streamflow gauges' catchment areas. Four different modeling approaches are analyzed. The basis for all of them is multiple linear regression models that estimate low flows as a function of a set of meteorological indices and/or physiographic and climatic catchment descriptors. For the first method, individual regression models are fitted at each station, predicting annual low flow values from a set of annual meteorological indices, which are subsequently regionalized using a set of catchment characteristics. The second method combines temporal and spatial prediction within a single panel data regression model, allowing estimation of annual low flow values from input of both annual meteorological indices and catchment descriptors. The third and fourth methods represent non-stationary low flow frequency analyses and require fitting of regional distribution functions. Method three is subject to a spatiotemporal prediction of an index value, method four to estimation of L-moments that adapt the regional frequency distribution to the at-site conditions. The results show that method two outperforms successive prediction in time and space. Method three also shows a high performance in the near future period, but since it relies on a stationary distribution, its application for prediction of far future changes may be problematic. Spatiotemporal prediction of L-moments appeared highly uncertain for higher-order moments resulting in unrealistic future low flow values. All in all, the results promote an inclusion of simple statistical methods in climate change impact assessment.

  5. Allele-sharing models: LOD scores and accurate linkage tests.

    PubMed

    Kong, A; Cox, N J

    1997-11-01

    Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested.

  6. Allele-sharing models: LOD scores and accurate linkage tests.

    PubMed Central

    Kong, A; Cox, N J

    1997-01-01

    Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested. PMID:9345087

  7. A statistical method to estimate low-energy hadronic cross sections

    NASA Astrophysics Data System (ADS)

    Balassa, Gábor; Kovács, Péter; Wolf, György

    2018-02-01

    In this article we propose a model based on the Statistical Bootstrap approach to estimate the cross sections of different hadronic reactions up to a few GeV in c.m.s. energy. The method is based on the idea that when two particles collide, a so-called fireball is formed, which after a short time period decays statistically into a specific final state. To calculate the probabilities we use a phase space description extended with quark combinatorial factors and the possibility of more than one fireball formation. In a few simple cases the probability of a specific final state can be calculated analytically, where we show that the model is able to reproduce the ratios of the considered cross sections. We also show that the model is able to describe proton-antiproton annihilation at rest. In the latter case we used a numerical method to calculate the more complicated final state probabilities. Additionally, we examined the formation of strange and charmed mesons as well, where we used existing data to fit the relevant model parameters.

  8. Eutrophication risk assessment in coastal embayments using simple statistical models.

    PubMed

    Arhonditsis, G; Eleftheriadou, M; Karydis, M; Tsirtsis, G

    2003-09-01

    A statistical methodology is proposed for assessing the risk of eutrophication in marine coastal embayments. The procedure followed was the development of regression models relating the levels of chlorophyll a (Chl) with the concentration of the limiting nutrient--usually nitrogen--and the renewal rate of the systems. The method was applied in the Gulf of Gera, Island of Lesvos, Aegean Sea and a surrogate for renewal rate was created using the Canberra metric as a measure of the resemblance between the Gulf and the oligotrophic waters of the open sea in terms of their physical, chemical and biological properties. The Chl-total dissolved nitrogen-renewal rate regression model was the most significant, accounting for 60% of the variation observed in Chl. Predicted distributions of Chl for various combinations of the independent variables, based on Bayesian analysis of the models, enabled comparison of the outcomes of specific scenarios of interest as well as further analysis of the system dynamics. The present statistical approach can be used as a methodological tool for testing the resilience of coastal ecosystems under alternative managerial schemes and levels of exogenous nutrient loading.
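
    The Canberra metric mentioned above is the sum over variables of |x_i - y_i| / (|x_i| + |y_i|). The sketch below computes it for two hypothetical water-quality profiles; the variable names and numbers are invented for illustration, and treating 0/0 terms as zero is a convention assumed here rather than something stated in the abstract.

```python
import numpy as np

def canberra(x, y):
    """Canberra distance: sum of |x_i - y_i| / (|x_i| + |y_i|), with 0/0 terms counted as 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    num = np.abs(x - y)
    den = np.abs(x) + np.abs(y)
    terms = np.divide(num, den, out=np.zeros_like(num), where=den != 0)
    return float(terms.sum())

# Hypothetical profiles (e.g. nutrient, chlorophyll, transparency measurements).
gulf = [1.0, 2.0, 3.0]
open_sea = [1.0, 4.0, 0.0]

distance = canberra(gulf, open_sea)
print(distance)
```

    Because each term is scaled by the magnitude of the two values, variables measured in different units contribute on a comparable footing, which is presumably why such a metric suits a mixed set of physical, chemical and biological properties.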

  9. The epistemological status of general circulation models

    NASA Astrophysics Data System (ADS)

    Loehle, Craig

    2018-03-01

    Forecasts of both likely anthropogenic effects on climate and consequent effects on nature and society are based on large, complex software tools called general circulation models (GCMs). Forecasts generated by GCMs have been used extensively in policy decisions related to climate change. However, the relation between underlying physical theories and results produced by GCMs is unclear. In the case of GCMs, many discretizations and approximations are made, and simulating Earth system processes is far from simple and currently leads to some results with unknown energy balance implications. Statistical testing of GCM forecasts for degree of agreement with data would facilitate assessment of fitness for use. If model results need to be put on an anomaly basis due to model bias, then both visual and quantitative measures of model fit depend strongly on the reference period used for normalization, making testing problematic. Epistemology is here applied to problems of statistical inference during testing, the relationship between the underlying physics and the models, the epistemic meaning of ensemble statistics, problems of spatial and temporal scale, the existence or not of an unforced null for climate fluctuations, the meaning of existing uncertainty estimates, and other issues. Rigorous reasoning entails carefully quantifying levels of uncertainty.

  10. Sandpile-based model for capturing magnitude distributions and spatiotemporal clustering and separation in regional earthquakes

    NASA Astrophysics Data System (ADS)

    Batac, Rene C.; Paguirigan, Antonino A., Jr.; Tarun, Anjali B.; Longjas, Anthony G.

    2017-04-01

    We propose a cellular automata model for earthquake occurrences patterned after the sandpile model of self-organized criticality (SOC). By incorporating a single parameter describing the probability to target the most susceptible site, the model successfully reproduces the statistical signatures of seismicity. The energy distributions closely follow power-law probability density functions (PDFs) with a scaling exponent of around -1.6, consistent with the expectations of the Gutenberg-Richter (GR) law, for a wide range of the targeted triggering probability values. Additionally, for targeted triggering probabilities within the range 0.004-0.007, we observe spatiotemporal distributions that show bimodal behavior, which is not observed previously for the original sandpile. For this critical range of values for the probability, model statistics show remarkable comparison with long-period empirical data from earthquakes from different seismogenic regions. The proposed model has key advantages, the foremost of which is the fact that it simultaneously captures the energy, space, and time statistics of earthquakes by just introducing a single parameter, while introducing minimal parameters in the simple rules of the sandpile. We believe that the critical targeting probability parameterizes the memory that is inherently present in earthquake-generating regions.

  11. Quantum Optics Models of EIT Noise and Power Broadening

    NASA Astrophysics Data System (ADS)

    Snider, Chad; Crescimanno, Michael; O'Leary, Shannon

    2011-04-01

    When two coherent beams of light interact with an atom they tend to drive the atom to a non-absorbing state through a process called Electromagnetically Induced Transparency (EIT). If the light's frequency dithers, the atom's state stochastically moves in and out of this non-absorbing state. We describe a simple quantum optics model of this process that captures the essential experimentally observed statistical features of this EIT noise, with a particular emphasis on understanding power broadening.

  12. Anthropogenic heat flux: advisable spatial resolutions when input data are scarce

    NASA Astrophysics Data System (ADS)

    Gabey, A. M.; Grimmond, C. S. B.; Capel-Timms, I.

    2018-02-01

    Anthropogenic heat flux (QF) may be significant in cities, especially under low solar irradiance and at night. It is of interest to many practitioners including meteorologists, city planners and climatologists. QF estimates at fine temporal and spatial resolution can be derived from models that use varying amounts of empirical data. This study compares simple and detailed models in a European megacity (London) at 500 m spatial resolution. The simple model (LQF) uses spatially resolved population data and national energy statistics. The detailed model (GQF) additionally uses local energy, road network and workday population data. The Fractions Skill Score (FSS) and bias are used to rate the skill with which the simple model reproduces the spatial patterns and magnitudes of QF, and its sub-components, from the detailed model. LQF skill was consistently good across 90% of the city, away from the centre and major roads. The remaining 10% contained elevated emissions and "hot spots" representing 30-40% of the total city-wide energy. This structure was lost because it requires workday population, spatially resolved building energy consumption and/or road network data. Daily total building and traffic energy consumption estimates from national data were within ± 40% of local values. Progressively coarser spatial resolutions to 5 km improved skill for total QF, but important features (hot spots, transport network) were lost at all resolutions when residential population controlled spatial variations. The results demonstrate that simple QF models should be applied with conservative spatial resolution in cities that, like London, exhibit time-varying energy use patterns.

  13. Psychophysics of time perception and intertemporal choice models

    NASA Astrophysics Data System (ADS)

    Takahashi, Taiki; Oono, Hidemi; Radford, Mark H. B.

    2008-03-01

    Intertemporal choice and psychophysics of time perception have been attracting attention in econophysics and neuroeconomics. Several models have been proposed for intertemporal choice: exponential discounting, general hyperbolic discounting (exponential discounting with logarithmic time perception of the Weber-Fechner law, a q-exponential discount model based on Tsallis's statistics), simple hyperbolic discounting, and Stevens' power law-exponential discounting (exponential discounting with Stevens' power time perception). In order to examine the fitness of the models for behavioral data, we estimated the parameters and AICc (Akaike Information Criterion with small sample correction) of the intertemporal choice models by assessing the points of subjective equality (indifference points) at seven delays. Our results have shown that the orders of the goodness-of-fit for both group and individual data were [Weber-Fechner discounting (general hyperbola) > Stevens' power law discounting > Simple hyperbolic discounting > Exponential discounting], indicating that human time perception in intertemporal choice may follow the Weber-Fechner law. Indications of the results for neuropsychopharmacological treatments of addiction and biophysical processing underlying temporal discounting and time perception are discussed.
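
    The discount functions and the AICc named above can be written out as a short sketch. The function names and parameter symbols (k for the discount rate, q for the Tsallis parameter) are assumptions made for illustration, and the q-exponential form shown is the standard one rather than something confirmed by the abstract.

```python
import math


def exponential_discount(delay, k):
    """Exponential discounting: exp(-k * delay)."""
    return math.exp(-k * delay)


def simple_hyperbolic_discount(delay, k):
    """Simple hyperbolic discounting: 1 / (1 + k * delay)."""
    return 1.0 / (1.0 + k * delay)


def q_exponential_discount(delay, k, q):
    """q-exponential (general hyperbolic) discounting.

    q = 0 gives the simple hyperbola; q = 1 is treated as the exponential limit.
    """
    if q == 1.0:
        return math.exp(-k * delay)
    return (1.0 + (1.0 - q) * k * delay) ** (-1.0 / (1.0 - q))


def aicc(log_likelihood, n_params, n_obs):
    """AIC with small-sample correction; requires n_obs > n_params + 1."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc needs n_obs > n_params + 1")
    aic = 2.0 * n_params - 2.0 * log_likelihood
    return aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


# Hypothetical values: seven indifference points, two fitted parameters
print(aicc(log_likelihood=-10.0, n_params=2, n_obs=7))
```

    With only seven delays per fit, the correction term is large relative to the plain AIC penalty, which is presumably why the small-sample form was chosen.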

  14. Acute Diarrheal Syndromic Surveillance

    PubMed Central

    Kam, H.J.; Choi, S.; Cho, J.P.; Min, Y.G.; Park, R.W.

    2010-01-01

    Objective: In an effort to identify and characterize the environmental factors that affect the number of patients with acute diarrheal (AD) syndrome, we developed and tested two regional surveillance models including holiday and weather information in addition to visitor records, at emergency medical facilities in the Seoul metropolitan area of Korea. Methods: With 1,328,686 emergency department visitor records from the National Emergency Department Information system (NEDIS) and the holiday and weather information, two seasonal ARIMA models were constructed: (1) The simple model (only with total patient number), (2) the environmental factor-added model. The stationary R-squared was utilized as an in-sample model goodness-of-fit statistic for the constructed models, and the cumulative mean of the Mean Absolute Percentage Error (MAPE) was used to measure post-sample forecast accuracy over the next 1 month. Results: The (1,0,1)(0,1,1)7 ARIMA model resulted in an adequate model fit for the daily number of AD patient visits over 12 months for both cases. Among various features, the total number of patient visits was selected as a commonly influential independent variable. Additionally, for the environmental factor-added model, holidays and daily precipitation were selected as features that statistically significantly affected model fitting. Stationary R-squared values were changed in a range of 0.651-0.828 (simple), and 0.805-0.844 (environmental factor-added) with p<0.05. In terms of prediction, the MAPE values changed within 0.090-0.120 and 0.089-0.114, respectively. Conclusion: The environmental factor-added model yielded better MAPE values. Holiday and weather information appear to be crucial for the construction of an accurate syndromic surveillance model for AD, in addition to the visitor and assessment records. PMID:23616829
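
    The forecast-accuracy measure used above, MAPE, can be sketched in a few lines. The function name and the sample counts are hypothetical; the result is returned as a fraction (not multiplied by 100), which appears to match the 0.089-0.120 range quoted in the abstract.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, returned as a fraction.

    Raises ZeroDivisionError if any actual value is zero.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)


# Hypothetical daily visit counts and forecasts
observed = [100, 200]
predicted = [90, 220]
print(mape(observed, predicted))
```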

  15. Impact of statistical learning methods on the predictive power of multivariate normal tissue complication probability models.

    PubMed

    Xu, Cheng-Jian; van der Schaaf, Arjen; Schilstra, Cornelis; Langendijk, Johannes A; van't Veld, Aart A

    2012-03-15

    To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended. Copyright © 2012 Elsevier Inc. All rights reserved.

  16. Symmetrical treatment of "Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition", for major depressive disorders.

    PubMed

    Sawamura, Jitsuki; Morishita, Shigeru; Ishigooka, Jun

    2016-01-01

    We previously presented a group theoretical model that describes psychiatric patient states or clinical data in a graded vector-like format based on modulo groups. Meanwhile, the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5, the current version), is frequently used for diagnosis in daily psychiatric treatments and biological research. The diagnostic criteria of DSM-5 contain simple binominal items relating to the presence or absence of specific symptoms. In spite of its simple form, the practical structure of the DSM-5 system is not sufficiently systemized for data to be treated in a more rationally sophisticated way. To view the disease states in terms of symmetry in the manner of abstract algebra is considered important for the future systematization of clinical medicine. We provide a simple idea for the practical treatment of the psychiatric diagnosis/score of DSM-5 using depressive symptoms in line with our previously proposed method. An expression is given employing modulo-2 and -7 arithmetic (in particular, additive group theory) for Criterion A of a 'major depressive episode' that must be met for the diagnosis of 'major depressive disorder' in DSM-5. For this purpose, the novel concept of an imaginary value 0 that can be recognized as an explicit 0 or implicit 0 was introduced to compose the model. The zeros allow the incorporation or deletion of an item between any other symptoms if they are ordered appropriately. Optionally, a vector-like expression can be used to rate/select only specific items when modifying the criterion/scale. Simple examples are illustrated concretely. Further development of the proposed method for the criteria/scale of a disease is expected to raise the level of formalism of clinical medicine to that of other fields of natural science.

  17. Statistical complexity without explicit reference to underlying probabilities

    NASA Astrophysics Data System (ADS)

    Pennini, F.; Plastino, A.

    2018-06-01

    We show that extremely simple systems of a not too large number of particles can be simultaneously thermally stable and complex. To such an end, we extend the statistical complexity's notion to simple configurations of non-interacting particles, without appeal to probabilities, and discuss configurational properties.

  18. Assessing differential gene expression with small sample sizes in oligonucleotide arrays using a mean-variance model.

    PubMed

    Hu, Jianhua; Wright, Fred A

    2007-03-01

    The identification of the genes that are differentially expressed in two-sample microarray experiments remains a difficult problem when the number of arrays is very small. We discuss the implications of using ordinary t-statistics and examine other commonly used variants. For oligonucleotide arrays with multiple probes per gene, we introduce a simple model relating the mean and variance of expression, possibly with gene-specific random effects. Parameter estimates from the model have natural shrinkage properties that guard against inappropriately small variance estimates, and the model is used to obtain a differential expression statistic. A limiting value to the positive false discovery rate (pFDR) for ordinary t-tests provides motivation for our use of the data structure to improve variance estimates. Our approach performs well compared to other proposed approaches in terms of the false discovery rate.

  19. Dynamics of non-Markovian exclusion processes

    NASA Astrophysics Data System (ADS)

    Khoromskaia, Diana; Harris, Rosemary J.; Grosskinsky, Stefan

    2014-12-01

    Driven diffusive systems are often used as simple discrete models of collective transport phenomena in physics, biology or social sciences. Restricting attention to one-dimensional geometries, the asymmetric simple exclusion process (ASEP) plays a paradigmatic role to describe noise-activated driven motion of entities subject to an excluded volume interaction and many variants have been studied in recent years. While in the standard ASEP the noise is Poissonian and the process is therefore Markovian, in many applications the statistics of the activating noise has a non-standard distribution with possible memory effects resulting from internal degrees of freedom or external sources. This leads to temporal correlations and can significantly affect the shape of the current-density relation as has been studied recently for a number of scenarios. In this paper we report a general framework to derive the fundamental diagram of ASEPs driven by non-Poissonian noise by using effectively only two simple quantities, viz., the mean residual lifetime of the jump distribution and a suitably defined temporal correlation length. We corroborate our results by detailed numerical studies for various noise statistics under periodic boundary conditions and discuss how our approach can be applied to more general driven diffusive systems.

  20. Spectral likelihood expansions for Bayesian inference

    NASA Astrophysics Data System (ADS)

    Nagel, Joseph B.; Sudret, Bruno

    2016-03-01

    A spectral approach to Bayesian inference is presented. It pursues the emulation of the posterior probability density. The starting point is a series expansion of the likelihood function in terms of orthogonal polynomials. From this spectral likelihood expansion all statistical quantities of interest can be calculated semi-analytically. The posterior is formally represented as the product of a reference density and a linear combination of polynomial basis functions. Both the model evidence and the posterior moments are related to the expansion coefficients. This formulation avoids Markov chain Monte Carlo simulation and allows one to make use of linear least squares instead. The pros and cons of spectral Bayesian inference are discussed and demonstrated on the basis of simple applications from classical statistics and inverse modeling.

  1. Stationarity: Wanted dead or alive?

    USGS Publications Warehouse

    Lins, Larry F.; Cohn, Timothy A.

    2011-01-01

    Aligning engineering practice with natural process behavior would appear, on its face, to be a prudent and reasonable course of action. However, if we do not understand the long-term characteristics of hydroclimatic processes, how does one find the prudent and reasonable course needed for water management? We consider this question in light of three aspects of existing and unresolved issues affecting hydroclimatic variability and statistical inference: Hurst-Kolmogorov phenomena; the complications long-term persistence introduces with respect to statistical understanding; and the dependence of process understanding on arbitrary sampling choices. These problems are not easily addressed. In such circumstances, humility may be more important than physics; a simple model with well-understood flaws may be preferable to a sophisticated model whose correspondence to reality is uncertain.

  2. Self-organization, the cascade model, and natural hazards.

    PubMed

    Turcotte, Donald L; Malamud, Bruce D; Guzzetti, Fausto; Reichenbach, Paola

    2002-02-19

    We consider the frequency-size statistics of two natural hazards, forest fires and landslides. Both appear to satisfy power-law (fractal) distributions to a good approximation under a wide variety of conditions. Two simple cellular-automata models have been proposed as analogs for this observed behavior, the forest fire model for forest fires and the sand pile model for landslides. The behavior of these models can be understood in terms of a self-similar inverse cascade. For the forest fire model the cascade consists of the coalescence of clusters of trees; for the sand pile model the cascade consists of the coalescence of metastable regions.

  3. Self-organization, the cascade model, and natural hazards

    PubMed Central

    Turcotte, Donald L.; Malamud, Bruce D.; Guzzetti, Fausto; Reichenbach, Paola

    2002-01-01

    We consider the frequency-size statistics of two natural hazards, forest fires and landslides. Both appear to satisfy power-law (fractal) distributions to a good approximation under a wide variety of conditions. Two simple cellular-automata models have been proposed as analogs for this observed behavior, the forest fire model for forest fires and the sand pile model for landslides. The behavior of these models can be understood in terms of a self-similar inverse cascade. For the forest fire model the cascade consists of the coalescence of clusters of trees; for the sand pile model the cascade consists of the coalescence of metastable regions. PMID:11875206

  4. Energy-balance climate models

    NASA Technical Reports Server (NTRS)

    North, G. R.; Cahalan, R. F.; Coakley, J. A., Jr.

    1980-01-01

    An introductory survey of the global energy balance climate models is presented with an emphasis on analytical results. A sequence of increasingly complicated models involving ice cap and radiative feedback processes are solved and the solutions and parameter sensitivities are studied. The model parameterizations are examined critically in light of many current uncertainties. A simple seasonal model is used to study the effects of changes in orbital elements on the temperature field. A linear stability theorem and a complete nonlinear stability analysis for the models are developed. Analytical solutions are also obtained for the linearized models driven by stochastic forcing elements. In this context the relation between natural fluctuation statistics and climate sensitivity is stressed.

  5. Energy balance climate models

    NASA Technical Reports Server (NTRS)

    North, G. R.; Cahalan, R. F.; Coakley, J. A., Jr.

    1981-01-01

    An introductory survey of the global energy balance climate models is presented with an emphasis on analytical results. A sequence of increasingly complicated models involving ice cap and radiative feedback processes are solved, and the solutions and parameter sensitivities are studied. The model parameterizations are examined critically in light of many current uncertainties. A simple seasonal model is used to study the effects of changes in orbital elements on the temperature field. A linear stability theorem and a complete nonlinear stability analysis for the models are developed. Analytical solutions are also obtained for the linearized models driven by stochastic forcing elements. In this context the relation between natural fluctuation statistics and climate sensitivity is stressed.

  6. CDP++.Italian: Modelling Sublexical and Supralexical Inconsistency in a Shallow Orthography

    PubMed Central

    Perry, Conrad; Ziegler, Johannes C.; Zorzi, Marco

    2014-01-01

    Most models of reading aloud have been constructed to explain data in relatively complex orthographies like English and French. Here, we created an Italian version of the Connectionist Dual Process Model of Reading Aloud (CDP++) to examine the extent to which the model could predict data in a language which has relatively simple orthography-phonology relationships but is relatively complex at a suprasegmental (word stress) level. We show that the model exhibits good quantitative performance and accounts for key phenomena observed in naming studies, including some apparently contradictory findings. These effects include stress regularity and stress consistency, both of which have been especially important in studies of word recognition and reading aloud in Italian. Overall, the results of the model compare favourably to an alternative connectionist model that can learn non-linear spelling-to-sound mappings. This suggests that CDP++ is currently the leading computational model of reading aloud in Italian, and that its simple linear learning mechanism adequately captures the statistical regularities of the spelling-to-sound mapping both at the segmental and supra-segmental levels. PMID:24740261

  7. Logistic Regression with Multiple Random Effects: A Simulation Study of Estimation Methods and Statistical Packages.

    PubMed

    Kim, Yoonsang; Choi, Young-Ku; Emery, Sherry

    2013-08-01

    Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods' performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages, SAS GLIMMIX Laplace and SuperMix Gaussian quadrature, perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes.

  8. Logistic Regression with Multiple Random Effects: A Simulation Study of Estimation Methods and Statistical Packages

    PubMed Central

    Kim, Yoonsang; Emery, Sherry

    2013-01-01

    Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods’ performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages—SAS GLIMMIX Laplace and SuperMix Gaussian quadrature—perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes. PMID:24288415

  9. Statistical fluctuations of an ocean surface inferred from shoes and ships

    NASA Astrophysics Data System (ADS)

    Lerche, Ian; Maubeuge, Frédéric

    1995-12-01

    This paper shows that it is possible to roughly estimate some ocean properties using simple time-dependent statistical models of ocean fluctuations. Based on a real incident, the loss by a vessel of a Nike shoes container in the North Pacific Ocean, a statistical model was tested on data sets consisting of the Nike shoes found by beachcombers a few months later. This statistical treatment of the shoes' motion allows one to infer velocity trends of the Pacific Ocean, together with their fluctuation strengths. The idea is to suppose that there is a mean bulk flow speed that can depend on location on the ocean surface and time. The fluctuations of the surface flow speed are then treated as statistically random. The distribution of shoes is described in space and time using Markov probability processes related to the mean and fluctuating ocean properties. The aim of the exercise is to provide some of the properties of the Pacific Ocean that are otherwise calculated using a sophisticated numerical model, OSCURS, where numerous data are needed. Relevant quantities are sharply estimated, which can be useful to (1) constrain output results from OSCURS computations, and (2) elucidate the behavior patterns of ocean flow characteristics on long time scales.

  10. Improving cerebellar segmentation with statistical fusion

    NASA Astrophysics Data System (ADS)

    Plassard, Andrew J.; Yang, Zhen; Prince, Jerry L.; Claassen, Daniel O.; Landman, Bennett A.

    2016-03-01

    The cerebellum is a somatotopically organized central component of the central nervous system well known to be involved with motor coordination and increasingly recognized roles in cognition and planning. Recent work in multiatlas labeling has created methods that offer the potential for fully automated 3-D parcellation of the cerebellar lobules and vermis (which are organizationally equivalent to cortical gray matter areas). This work explores the trade-offs of using different statistical fusion techniques and post hoc optimizations in two datasets with distinct imaging protocols. We offer a novel fusion technique by extending the ideas of the Selective and Iterative Method for Performance Level Estimation (SIMPLE) to a patch-based performance model. We demonstrate the effectiveness of our algorithm, Non-Local SIMPLE, for segmentation of a mixed population of healthy subjects and patients with severe cerebellar anatomy. Under the first imaging protocol, we show that Non-Local SIMPLE outperforms previous gold-standard segmentation techniques. In the second imaging protocol, we show that Non-Local SIMPLE outperforms previous gold standard techniques but is outperformed by a non-locally weighted vote with the deeper population of atlases available. This work advances the state of the art in open source cerebellar segmentation algorithms and offers the opportunity for routinely including cerebellar segmentation in magnetic resonance imaging studies that acquire whole brain T1-weighted volumes with approximately 1 mm isotropic resolution.

  11. Development of a funding, cost, and spending model for satellite projects

    NASA Technical Reports Server (NTRS)

    Johnson, Jesse P.

    1989-01-01

    The need for a predictive budget/funding model is obvious. The current models used by the Resource Analysis Office (RAO) are used to predict the total costs of satellite projects. An effort to extend the modeling capabilities from total budget analysis to total budget and budget outlays over time analysis was conducted. A statistical based and data driven methodology was used to derive and develop the model. The budget data for the last 18 GSFC-sponsored satellite projects were analyzed and used to build a funding model which would describe the historical spending patterns. This raw data consisted of dollars spent in that specific year and their 1989 dollar equivalent. This data was converted to the standard format used by the RAO group and placed in a database. A simple statistical analysis was performed to calculate the gross statistics associated with project length and project cost and the conditional statistics on project length and project cost. The modeling approach used is derived from the theory of embedded statistics which states that properly analyzed data will produce the underlying generating function. The process of funding large scale projects over extended periods of time is described by Life Cycle Cost Models (LCCM). The data was analyzed to find a model in the generic form of a LCCM. The model developed is based on a Weibull function whose parameters are found by both nonlinear optimization and nonlinear regression. In order to use this model it is necessary to transform the problem from a dollar/time space to a percentage of total budget/time space. This transformation is equivalent to moving to a probability space. By using the basic rules of probability, the validity of both the optimization and the regression steps is ensured. This statistically significant model is then integrated and inverted. The resulting output represents a project schedule which relates the amount of money spent to the percentage of project completion.
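
    The idea of a Weibull spending curve in percentage-of-budget versus time space, and its inversion, can be illustrated with a small sketch. The abstract does not say whether the Weibull function enters as a density or as a cumulative curve, so the cumulative form used here is an assumption, as are the function names and the shape and scale values.

```python
import math


def weibull_cumulative_fraction(t, shape, scale):
    """Fraction of total budget spent by normalized time t (Weibull CDF form)."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def time_at_fraction(fraction, shape, scale):
    """Invert the curve: normalized time at which a given fraction is spent."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return scale * (-math.log(1.0 - fraction)) ** (1.0 / shape)


# Hypothetical parameters; real values would come from fitting project data
shape, scale = 2.0, 0.6
halfway = time_at_fraction(0.5, shape, scale)
print(halfway, weibull_cumulative_fraction(halfway, shape, scale))
```

    Working in fractions of total budget keeps the curve between 0 and 1, which is what lets it be treated like a probability distribution.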

  12. Beyond δ: Tailoring marked statistics to reveal modified gravity

    NASA Astrophysics Data System (ADS)

    Valogiannis, Georgios; Bean, Rachel

    2018-01-01

    Models which attempt to explain the accelerated expansion of the universe through large-scale modifications to General Relativity (GR), must satisfy the stringent experimental constraints of GR in the solar system. Viable candidates invoke a “screening” mechanism, that dynamically suppresses deviations in high density environments, making their overall detection challenging even for ambitious future large-scale structure surveys. We present methods to efficiently simulate the non-linear properties of such theories, and consider how a series of statistics that reweight the density field to accentuate deviations from GR can be applied to enhance the overall signal-to-noise ratio in differentiating the models from GR. Our results demonstrate that the cosmic density field can yield additional, invaluable cosmological information, beyond the simple density power spectrum, that will enable surveys to more confidently discriminate between modified gravity models and ΛCDM.

  13. Simplified estimation of age-specific reference intervals for skewed data.

    PubMed

    Wright, E M; Royston, P

    1997-12-30

    Age-specific reference intervals are commonly used in medical screening and clinical practice, where interest lies in the detection of extreme values. Many different statistical approaches have been published on this topic. The advantages of a parametric method are that it necessarily produces smooth centile curves, the entire density is estimated and an explicit formula is available for the centiles. The method proposed here is a simplified version of a recent approach proposed by Royston and Wright. Basic transformations of the data and multiple regression techniques are combined to model the mean, standard deviation and skewness. Using these simple tools, which are implemented in almost all statistical computer packages, age-specific reference intervals may be obtained. The scope of the method is illustrated by fitting models to several real data sets and assessing each model using goodness-of-fit techniques.

  14. Statistical models for causation: what inferential leverage do they provide?

    PubMed

    Freedman, David A

    2006-12-01

    Experiments offer more reliable evidence on causation than observational studies, which is not to gainsay the contribution to knowledge from observation. Experiments should be analyzed as experiments, not as observational studies. A simple comparison of rates might be just the right tool, with little value added by "sophisticated" models. This article discusses current models for causation, as applied to experimental and observational data. The intention-to-treat principle and the effect of treatment on the treated will also be discussed. Flaws in per-protocol and treatment-received estimates will be demonstrated.

  15. Model validation of simple-graph representations of metabolism

    PubMed Central

    Holme, Petter

    2009-01-01

    The large-scale properties of chemical reaction systems, such as metabolism, can be studied with graph-based methods. To do this, one needs to reduce the information, lists of chemical reactions, available in databases. Even for the simplest type of graph representation, this reduction can be done in several ways. We investigate different simple network representations by testing how well they encode information about one biologically important network structure—network modularity (the propensity for edges to be clustered into dense groups that are sparsely connected between each other). To achieve this goal, we design a model of reaction systems where network modularity can be controlled and measure how well the reduction to simple graphs captures the modular structure of the model reaction system. We find that the network types that best capture the modular structure of the reaction system are substrate–product networks (where substrates are linked to products of a reaction) and substance networks (with edges between all substances participating in a reaction). Furthermore, we argue that the proposed model for reaction systems with tunable clustering is a general framework for studies of how reaction systems are affected by modularity. To this end, we investigate statistical properties of the model and find, among other things, that it recreates correlations between degree and mass of the molecules. PMID:19158012

  16. Regression Models for Identifying Noise Sources in Magnetic Resonance Images

    PubMed Central

    Zhu, Hongtu; Li, Yimei; Ibrahim, Joseph G.; Shi, Xiaoyan; An, Hongyu; Chen, Yashen; Gao, Wei; Lin, Weili; Rowe, Daniel B.; Peterson, Bradley S.

    2009-01-01

    Stochastic noise, susceptibility artifacts, magnetic field and radiofrequency inhomogeneities, and other noise components in magnetic resonance images (MRIs) can introduce serious bias into any measurements made with those images. We formally introduce three regression models including a Rician regression model and two associated normal models to characterize stochastic noise in various magnetic resonance imaging modalities, including diffusion-weighted imaging (DWI) and functional MRI (fMRI). Estimation algorithms are introduced to maximize the likelihood function of the three regression models. We also develop a diagnostic procedure for systematically exploring MR images to identify noise components other than simple stochastic noise, and to detect discrepancies between the fitted regression models and MRI data. The diagnostic procedure includes goodness-of-fit statistics, measures of influence, and tools for graphical display. The goodness-of-fit statistics can assess the key assumptions of the three regression models, whereas measures of influence can isolate outliers caused by certain noise components, including motion artifacts. The tools for graphical display permit graphical visualization of the values for the goodness-of-fit statistic and influence measures. Finally, we conduct simulation studies to evaluate performance of these methods, and we analyze a real dataset to illustrate how our diagnostic procedure localizes subtle image artifacts by detecting intravoxel variability that is not captured by the regression models. PMID:19890478

  17. Complex dynamics and empirical evidence (Invited Paper)

    NASA Astrophysics Data System (ADS)

    Delli Gatti, Domenico; Gaffeo, Edoardo; Giulioni, Gianfranco; Gallegati, Mauro; Kirman, Alan; Palestrini, Antonio; Russo, Alberto

    2005-05-01

    Standard macroeconomics, based on a reductionist approach centered on the representative agent, is badly equipped to explain the empirical evidence where heterogeneity and industrial dynamics are the rule. In this paper we show that a simple agent-based model of heterogeneous financially fragile agents is able to replicate a large number of scaling type stylized facts with a remarkable degree of statistical precision.

  18. Information-Decay Pursuit of Dynamic Parameters in Student Models

    DTIC Science & Technology

    1994-04-01

    simple worked-through example). Commercially available computer programs for structuring and using Bayesian inference include ERGO ( Noetic Systems...Tukey, J.W. (1977). Data analysis and Regression: A second course in statistics. Reading, MA: Addison-Wesley. Noetic Systems, Inc. (1991). ERGO...Naval Academy Division of Educational Studies Annapolis MD 21402-5002 Elmory Univerity Dr Janice Gifford 210 Fiabburne Bldg University of

  19. Estimating maize production in Kenya using NDVI: Some statistical considerations

    USGS Publications Warehouse

    Lewis, J.E.; Rowland, James; Nadeau, A.

    1998-01-01

    A regression model approach using a normalized difference vegetation index (NDVI) has the potential for estimating crop production in East Africa. However, before production estimation can become a reality, the underlying model assumptions and statistical nature of the sample data (NDVI and crop production) must be examined rigorously. Annual maize production statistics from 1982-90 for 36 agricultural districts within Kenya were used as the dependent variable; median area NDVI (independent variable) values from each agricultural district and year were extracted from the annual maximum NDVI data set. The input data and the statistical association of NDVI with maize production for Kenya were tested systematically for the following items: (1) homogeneity of the data when pooling the sample, (2) gross data errors and influence points, (3) serial (time) correlation, (4) spatial autocorrelation and (5) stability of the regression coefficients. The results of using a simple regression model with NDVI as the only independent variable are encouraging (r 0.75, p 0.05) and illustrate that NDVI can be a responsive indicator of maize production, especially in areas of high NDVI spatial variability, which coincide with areas of production variability in Kenya.
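
    The simple regression described in this record, with NDVI as the only independent variable, can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the function name and the sample values are hypothetical, and none of the diagnostic checks listed in the abstract (pooling homogeneity, influence points, serial and spatial correlation) are performed.

```python
import numpy as np


def fit_simple_regression(x, y):
    """Ordinary least-squares fit of y = a + b * x.

    Returns the intercept a, the slope b and the Pearson correlation r.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # polyfit returns coefficients from highest degree down: slope first.
    b, a = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return a, b, r


# Hypothetical district-level values, invented for illustration only.
ndvi = [0.30, 0.35, 0.42, 0.50, 0.55, 0.61]
production = [1.1, 1.4, 1.9, 2.3, 2.4, 3.0]

a, b, r = fit_simple_regression(ndvi, production)
print(f"intercept={a:.3f} slope={b:.3f} r={r:.3f}")
```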

  20. Asymptotic formulae for likelihood-based tests of new physics

    NASA Astrophysics Data System (ADS)

    Cowan, Glen; Cranmer, Kyle; Gross, Eilam; Vitells, Ofer

    2011-02-01

    We describe likelihood-based statistical tests for use in high energy physics for the discovery of new phenomena and for construction of confidence intervals on model parameters. We focus on the properties of the test procedures that allow one to account for systematic uncertainties. Explicit formulae for the asymptotic distributions of test statistics are derived using results of Wilks and Wald. We motivate and justify the use of a representative data set, called the "Asimov data set", which provides a simple method to obtain the median experimental sensitivity of a search or measurement as well as fluctuations about this expectation.
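
    As an illustration of how an Asimov data set gives a median sensitivity, consider a single-bin counting experiment with expected signal $s$ and a known expected background $b$. This setting and its symbols are assumptions introduced here and are not stated in the abstract. Setting the observed count to its expectation $n = s + b$ in the discovery test statistic gives the commonly quoted approximation for the median discovery significance:

```latex
\[
  q_{0,\mathrm{A}} = 2\left[(s+b)\ln\!\left(1+\frac{s}{b}\right) - s\right],
  \qquad
  Z_{\mathrm{A}} = \sqrt{q_{0,\mathrm{A}}}
  = \sqrt{2\left[(s+b)\ln\!\left(1+\frac{s}{b}\right) - s\right]} .
\]
```

    For $s \ll b$, expanding the logarithm to second order reduces this to the familiar $Z_{\mathrm{A}} \approx s/\sqrt{b}$.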

  1. Elucidating the effects of adsorbent flexibility on fluid adsorption using simple models and flat-histogram sampling methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shen, Vincent K., E-mail: vincent.shen@nist.gov; Siderius, Daniel W.

    2014-06-28

    Using flat-histogram Monte Carlo methods, we investigate the adsorptive behavior of the square-well fluid in two simple slit-pore-like models intended to capture fundamental characteristics of flexible adsorbent materials. Both models require as input thermodynamic information about the flexible adsorbent material itself. An important component of this work involves formulating the flexible pore models in the appropriate thermodynamic (statistical mechanical) ensembles, namely, the osmotic ensemble and a variant of the grand-canonical ensemble. Two-dimensional probability distributions, which are calculated using flat-histogram methods, provide the information necessary to determine adsorption thermodynamics. For example, we are able to determine precisely adsorption isotherms, (equilibrium) phase transition conditions, limits of stability, and free energies for a number of different flexible adsorbent materials, distinguishable as different inputs into the models. While the models used in this work are relatively simple from a geometric perspective, they yield non-trivial adsorptive behavior, including adsorption-desorption hysteresis solely due to material flexibility and so-called “breathing” of the adsorbent. The observed effects can in turn be tied to the inherent properties of the bare adsorbent. Some of the effects are expected on physical grounds while others arise from a subtle balance of thermodynamic and mechanical driving forces. In addition, the computational strategy presented here can be easily applied to more complex models for flexible adsorbents.

  2. Elucidating the effects of adsorbent flexibility on fluid adsorption using simple models and flat-histogram sampling methods

    NASA Astrophysics Data System (ADS)

    Shen, Vincent K.; Siderius, Daniel W.

    2014-06-01

    Using flat-histogram Monte Carlo methods, we investigate the adsorptive behavior of the square-well fluid in two simple slit-pore-like models intended to capture fundamental characteristics of flexible adsorbent materials. Both models require as input thermodynamic information about the flexible adsorbent material itself. An important component of this work involves formulating the flexible pore models in the appropriate thermodynamic (statistical mechanical) ensembles, namely, the osmotic ensemble and a variant of the grand-canonical ensemble. Two-dimensional probability distributions, which are calculated using flat-histogram methods, provide the information necessary to determine adsorption thermodynamics. For example, we are able to determine precisely adsorption isotherms, (equilibrium) phase transition conditions, limits of stability, and free energies for a number of different flexible adsorbent materials, distinguishable as different inputs into the models. While the models used in this work are relatively simple from a geometric perspective, they yield non-trivial adsorptive behavior, including adsorption-desorption hysteresis solely due to material flexibility and so-called "breathing" of the adsorbent. The observed effects can in turn be tied to the inherent properties of the bare adsorbent. Some of the effects are expected on physical grounds while others arise from a subtle balance of thermodynamic and mechanical driving forces. In addition, the computational strategy presented here can be easily applied to more complex models for flexible adsorbents.

  3. Statistical Hypothesis Testing in Intraspecific Phylogeography: NCPA versus ABC

    PubMed Central

    Templeton, Alan R.

    2009-01-01

    Nested clade phylogeographic analysis (NCPA) and approximate Bayesian computation (ABC) have been used to test phylogeographic hypotheses. Multilocus NCPA tests null hypotheses, whereas ABC discriminates among a finite set of alternatives. The interpretive criteria of NCPA are explicit and allow complex models to be built from simple components. The interpretive criteria of ABC are ad hoc and require the specification of a complete phylogeographic model. The conclusions from ABC are often influenced by implicit assumptions arising from the many parameters needed to specify a complex model. These complex models confound many assumptions so that biological interpretations are difficult. Sampling error is accounted for in NCPA, but ABC ignores important sources of sampling error that creates pseudo-statistical power. NCPA generates the full sampling distribution of its statistics, but ABC only yields local probabilities, which in turn make it impossible to distinguish between a good fitting model, a non-informative model, and an over-determined model. Both NCPA and ABC use approximations, but convergences of the approximations used in NCPA are well defined whereas those in ABC are not. NCPA can analyze a large number of locations, but ABC cannot. Finally, the dimensionality of tested hypothesis is known in NCPA, but not for ABC. As a consequence, the “probabilities” generated by ABC are not true probabilities and are statistically non-interpretable. Accordingly, ABC should not be used for hypothesis testing, but simulation approaches are valuable when used in conjunction with NCPA or other methods that do not rely on highly parameterized models. PMID:19192182

  4. Derivation of the Statistical Distribution of the Mass Peak Centroids of Mass Spectrometers Employing Analog-to-Digital Converters and Electron Multipliers

    DOE PAGES

    Ipsen, Andreas

    2017-02-03

    Here, the mass peak centroid is a quantity that is at the core of mass spectrometry (MS). However, despite its central status in the field, models of its statistical distribution are often chosen quite arbitrarily and without attempts at establishing a proper theoretical justification for their use. Recent work has demonstrated that for mass spectrometers employing analog-to-digital converters (ADCs) and electron multipliers, the statistical distribution of the mass peak intensity can be described via a relatively simple model derived essentially from first principles. Building on this result, the following article derives the corresponding statistical distribution for the mass peak centroids of such instruments. It is found that for increasing signal strength, the centroid distribution converges to a Gaussian distribution whose mean and variance are determined by physically meaningful parameters and which in turn determine bias and variability of the m/z measurements of the instrument. Through the introduction of the concept of “pulse-peak correlation”, the model also elucidates the complicated relationship between the shape of the voltage pulses produced by the preamplifier and the mean and variance of the centroid distribution. The predictions of the model are validated with empirical data and with Monte Carlo simulations.

  5. Derivation of the Statistical Distribution of the Mass Peak Centroids of Mass Spectrometers Employing Analog-to-Digital Converters and Electron Multipliers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ipsen, Andreas

    Here, the mass peak centroid is a quantity that is at the core of mass spectrometry (MS). However, despite its central status in the field, models of its statistical distribution are often chosen quite arbitrarily and without attempts at establishing a proper theoretical justification for their use. Recent work has demonstrated that for mass spectrometers employing analog-to-digital converters (ADCs) and electron multipliers, the statistical distribution of the mass peak intensity can be described via a relatively simple model derived essentially from first principles. Building on this result, the following article derives the corresponding statistical distribution for the mass peak centroids of such instruments. It is found that for increasing signal strength, the centroid distribution converges to a Gaussian distribution whose mean and variance are determined by physically meaningful parameters and which in turn determine bias and variability of the m/z measurements of the instrument. Through the introduction of the concept of “pulse-peak correlation”, the model also elucidates the complicated relationship between the shape of the voltage pulses produced by the preamplifier and the mean and variance of the centroid distribution. The predictions of the model are validated with empirical data and with Monte Carlo simulations.

  6. Inevitable end-of-21st-century trends toward earlier surface runoff timing in California's Sierra Nevada Mountains

    NASA Astrophysics Data System (ADS)

    Schwartz, M. A.; Hall, A. D.; Sun, F.; Walton, D.; Berg, N.

    2015-12-01

    Hybrid dynamical-statistical downscaling is used to produce surface runoff timing projections for California's Sierra Nevada, a high-elevation mountain range with significant seasonal snow cover. First, future climate change projections (RCP8.5 forcing scenario, 2081-2100 period) from five CMIP5 global climate models (GCMs) are dynamically downscaled. These projections reveal that future warming leads to a shift toward earlier snowmelt and surface runoff timing throughout the Sierra Nevada region. Relationships between warming and surface runoff timing from the dynamical simulations are used to build a simple statistical model that mimics the dynamical model's projected surface runoff timing changes given GCM input or other statistically-downscaled input. This statistical model can be used to produce surface runoff timing projections for other GCMs, periods, and forcing scenarios to quantify ensemble-mean changes, uncertainty due to intermodel variability and consequences stemming from choice of forcing scenario. For all CMIP5 GCMs and forcing scenarios, significant trends toward earlier surface runoff timing occur at elevations below 2500m. Thus, we conclude that trends toward earlier surface runoff timing by the end-of-the-21st century are inevitable. The changes to surface runoff timing diagnosed in this study have implications for many dimensions of climate change, including impacts on surface hydrology, water resources, and ecosystems.

  7. Evidence of complex contagion of information in social media: An experiment using Twitter bots.

    PubMed

    Mønsted, Bjarke; Sapieżyński, Piotr; Ferrara, Emilio; Lehmann, Sune

    2017-01-01

    It has recently become possible to study the dynamics of information diffusion in techno-social systems at scale, due to the emergence of online platforms, such as Twitter, with millions of users. One question that systematically recurs is whether information spreads according to simple or complex dynamics: does each exposure to a piece of information have an independent probability of a user adopting it (simple contagion), or does this probability depend instead on the number of sources of exposure, increasing above some threshold (complex contagion)? Most studies to date are observational and, therefore, unable to disentangle the effects of confounding factors such as social reinforcement, homophily, limited attention, or network community structure. Here we describe a novel controlled experiment that we performed on Twitter using 'social bots' deployed to carry out coordinated attempts at spreading information. We propose two Bayesian statistical models describing simple and complex contagion dynamics, and test the competing hypotheses. We provide experimental evidence that the complex contagion model describes the observed information diffusion behavior more accurately than simple contagion. Future applications of our results include more effective defenses against malicious propaganda campaigns on social media, improved marketing and advertisement strategies, and design of effective network intervention techniques.
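
    The distinction between the two hypotheses can be made concrete with a small sketch. It is not the Bayesian model from the study: the function names, the hard threshold and the two probability levels are hypothetical simplifications chosen only to show how the adoption probability depends on exposures in each case.

```python
def simple_contagion_adoption(p, exposures):
    """Probability of adopting after `exposures` independent exposures,
    each of which succeeds with probability p (simple contagion)."""
    return 1.0 - (1.0 - p) ** exposures


def complex_contagion_adoption(n_sources, threshold, p_low, p_high):
    """Hypothetical threshold rule for complex contagion: the adoption
    probability is p_low until the number of distinct sources reaches
    `threshold`, and p_high from then on."""
    return p_high if n_sources >= threshold else p_low


# Under simple contagion, two exposures at p = 0.5 give 1 - 0.5**2 = 0.75.
print(simple_contagion_adoption(0.5, 2))
# Under the threshold rule, only the number of distinct sources matters.
print(complex_contagion_adoption(1, threshold=2, p_low=0.01, p_high=0.3))
print(complex_contagion_adoption(3, threshold=2, p_low=0.01, p_high=0.3))
```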

  8. A new statistical approach to climate change detection and attribution

    NASA Astrophysics Data System (ADS)

    Ribes, Aurélien; Zwiers, Francis W.; Azaïs, Jean-Marc; Naveau, Philippe

    2017-01-01

    We propose here a new statistical approach to climate change detection and attribution that is based on additive decomposition and simple hypothesis testing. Most current statistical methods for detection and attribution rely on linear regression models where the observations are regressed onto expected response patterns to different external forcings. These methods do not use physical information provided by climate models regarding the expected response magnitudes to constrain the estimated responses to the forcings. Climate modelling uncertainty is difficult to take into account with regression based methods and is almost never treated explicitly. As an alternative to this approach, our statistical model is only based on the additivity assumption; the proposed method does not regress observations onto expected response patterns. We introduce estimation and testing procedures based on likelihood maximization, and show that climate modelling uncertainty can easily be accounted for. Some discussion is provided on how to practically estimate the climate modelling uncertainty based on an ensemble of opportunity. Our approach is based on the "models are statistically indistinguishable from the truth" paradigm, where the difference between any given model and the truth has the same distribution as the difference between any pair of models, but other choices might also be considered. The properties of this approach are illustrated and discussed based on synthetic data. Lastly, the method is applied to the linear trend in global mean temperature over the period 1951-2010. Consistent with the last IPCC assessment report, we find that most of the observed warming over this period (+0.65 K) is attributable to anthropogenic forcings (+0.67 ± 0.12 K, 90 % confidence range), with a very limited contribution from natural forcings (-0.01 ± 0.02 K).

  9. Modelling nematode movement using time-fractional dynamics.

    PubMed

    Hapca, Simona; Crawford, John W; MacMillan, Keith; Wilson, Mike J; Young, Iain M

    2007-09-07

    We use a correlated random walk model in two dimensions to simulate the movement of the slug parasitic nematode Phasmarhabditis hermaphrodita in homogeneous environments. The model incorporates the observed statistical distributions of turning angle and speed derived from time-lapse studies of individual nematode trails. We identify strong temporal correlations between the turning angles and speed that preclude the case of a simple random walk in which successive steps are independent. These correlated random walks are appropriately modelled using an anomalous diffusion model, more precisely using a fractional sub-diffusion model for which the associated stochastic process is characterised by strong memory effects in the probability density function.
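
    A correlated random walk in two dimensions can be sketched in a few lines. The sketch below is an assumption-laden illustration, not the model fitted in the study: turning angles are drawn from a zero-mean normal distribution and speeds from an exponential distribution, whereas the abstract uses distributions observed from nematode trails, and the temporal correlation between turning angle and speed is not modelled. The function and parameter names are hypothetical.

```python
import numpy as np


def correlated_random_walk(n_steps, turn_sd, mean_speed, seed=0):
    """Simulate a 2-D correlated random walk starting at the origin.

    Each step turns by a small random angle relative to the previous
    heading, so successive step directions are correlated.
    Returns an array of shape (n_steps + 1, 2) holding the positions.
    """
    rng = np.random.default_rng(seed)
    # Assumed distributions: normal turning angles, exponential speeds.
    turns = rng.normal(0.0, turn_sd, size=n_steps)
    speeds = rng.exponential(mean_speed, size=n_steps)
    # The heading at each step is the running sum of the turning angles.
    headings = np.cumsum(turns)
    steps = np.column_stack((speeds * np.cos(headings),
                             speeds * np.sin(headings)))
    return np.vstack(([0.0, 0.0], np.cumsum(steps, axis=0)))


path = correlated_random_walk(n_steps=100, turn_sd=0.3, mean_speed=1.0)
print(path.shape)
```

    Setting turn_sd to zero removes all turning and yields a straight line, while a very large turn_sd makes successive headings nearly independent, approaching the simple random walk that the abstract rules out.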

  10. Synthetic Earthquake Statistics From Physical Fault Models for the Lower Rhine Embayment

    NASA Astrophysics Data System (ADS)

    Brietzke, G. B.; Hainzl, S.; Zöller, G.

    2012-04-01

    As of today, seismic risk and hazard estimates mostly use pure empirical, stochastic models of earthquake fault systems tuned specifically to the vulnerable areas of interest. Although such models allow for reasonable risk estimates they fail to provide a link between the observed seismicity and the underlying physical processes. Solving a state-of-the-art fully dynamic description set of all relevant physical processes related to earthquake fault systems is likely not useful since it comes with a large number of degrees of freedom, poor constraints on its model parameters and a huge computational effort. Here, quasi-static and quasi-dynamic physical fault simulators provide a compromise between physical completeness and computational affordability and aim at providing a link between basic physical concepts and statistics of seismicity. Within the framework of quasi-static and quasi-dynamic earthquake simulators we investigate a model of the Lower Rhine Embayment (LRE) that is based upon seismological and geological data. We present and discuss statistics of the spatio-temporal behavior of generated synthetic earthquake catalogs with respect to simplification (e.g. simple two-fault cases) as well as to complication (e.g. hidden faults, geometric complexity, heterogeneities of constitutive parameters).

  11. A Simple Model of Pulsed Ejector Thrust Augmentation

    NASA Technical Reports Server (NTRS)

    Wilson, Jack; Deloof, Richard L. (Technical Monitor)

    2003-01-01

    A simple model of thrust augmentation from a pulsed source is described. In the model it is assumed that the flow into the ejector is quasi-steady, and can be calculated using potential flow techniques. The velocity of the flow is related to the speed of the starting vortex ring formed by the jet. The vortex ring properties are obtained from the slug model, knowing the jet diameter, speed and slug length. The model, when combined with experimental results, predicts an optimum ejector radius for thrust augmentation. Data on pulsed ejector performance for comparison with the model was obtained using a shrouded Hartmann-Sprenger tube as the pulsed jet source. A statistical experiment, in which ejector length, diameter, and nose radius were independent parameters, was performed at four different frequencies. These frequencies corresponded to four different slug length to diameter ratios, two below cut-off, and two above. Comparison of the model with the experimental data showed reasonable agreement. Maximum pulsed thrust augmentation is shown to occur for a pulsed source with slug length to diameter ratio equal to the cut-off value.

  12. Enhancing communication with distressed patients, families and colleagues: the value of the Simple Skills Secrets model of communication for the nursing and healthcare workforce.

    PubMed

    Jack, Barbara A; O'Brien, Mary R; Kirton, Jennifer A; Marley, Kate; Whelan, Alison; Baldry, Catherine R; Groves, Karen E

    2013-12-01

    Good communication skills in healthcare professionals are acknowledged as a core competency. The consequences of poor communication are well-recognised with far reaching costs including: reduced treatment compliance, higher psychological morbidity, incorrect or delayed diagnoses, and increased complaints. The Simple Skills Secrets is a visual, easily memorised, model of communication for healthcare staff to respond to the distress or unanswerable questions of patients, families and colleagues. To explore the impact of the Simple Skills Secrets model of communication training on the general healthcare workforce. An evaluation methodology encompassing a quantitative pre- and post-course testing of confidence and willingness to have conversations with distressed patients, carers and colleagues and qualitative semi-structured telephone interviews with participants 6-8 weeks post course. During the evaluation, 153 staff undertook the training of which 149 completed the pre- and post-training questionnaire. A purposive sampling approach was adopted for the follow up qualitative interviews and 14 agreed to participate. There is a statistically significant improvement in both willingness and confidence for all categories (overall confidence score, t(148)=-15.607, p=<0.05; overall willingness score, t(148)=-10.878, p=<0.05) with the greatest improvement in confidence in communicating with carers (pre-course mean 6.171 to post course mean 8.171). There is no statistically significant difference between the registered and support staff. Several themes were obtained from the qualitative data, including: a method of communicating differently, a structured approach, thinking differently and additional skills. The value of the model in clinical practice was reported. This model can be suggested as increasing the confidence of staff in dealing with a myriad of situations which, if handled appropriately, can lead to increased patient and carers' satisfaction. Empowering staff appears to have increased their willingness to undertake these conversations, which could lead to earlier intervention and minimise distress. Copyright © 2013 Elsevier Ltd. All rights reserved.

  13. Impact resistance of fiber composites - Energy-absorbing mechanisms and environmental effects

    NASA Technical Reports Server (NTRS)

    Chamis, C. C.; Sinclair, J. H.

    1985-01-01

    Energy absorbing mechanisms were identified by several approaches. The energy absorbing mechanisms considered are those in unidirectional composite beams subjected to impact. The approaches used include: mechanic models, statistical models, transient finite element analysis, and simple beam theory. Predicted results are correlated with experimental data from Charpy impact tests. The environmental effects on impact resistance are evaluated. Working definitions for energy absorbing and energy releasing mechanisms are proposed and a dynamic fracture progression is outlined. Possible generalizations to angle-plied laminates are described.

  14. Impact resistance of fiber composites: Energy absorbing mechanisms and environmental effects

    NASA Technical Reports Server (NTRS)

    Chamis, C. C.; Sinclair, J. H.

    1983-01-01

    Energy absorbing mechanisms were identified by several approaches. The energy absorbing mechanisms considered are those in unidirectional composite beams subjected to impact. The approaches used include: mechanic models, statistical models, transient finite element analysis, and simple beam theory. Predicted results are correlated with experimental data from Charpy impact tests. The environmental effects on impact resistance are evaluated. Working definitions for energy absorbing and energy releasing mechanisms are proposed and a dynamic fracture progression is outlined. Possible generalizations to angle-plied laminates are described.

  15. Finite-sample and asymptotic sign-based tests for parameters of non-linear quantile regression with Markov noise

    NASA Astrophysics Data System (ADS)

    Sirenko, M. A.; Tarasenko, P. F.; Pushkarev, M. I.

    2017-01-01

    One of the most noticeable features of sign-based statistical procedures is the opportunity to build an exact test for simple hypothesis testing of parameters in a regression model. In this article, we extend a sign-based approach to the nonlinear case with dependent noise. The examined model is a multi-quantile regression, which makes it possible to test hypotheses not only about regression parameters, but about noise parameters as well.

  16. Linear mixed-effects models for within-participant psychology experiments: an introductory tutorial and free, graphical user interface (LMMgui).

    PubMed

    Magezi, David A

    2015-01-01

    Linear mixed-effects models (LMMs) are increasingly being used for data analysis in cognitive neuroscience and experimental psychology, where within-participant designs are common. The current article provides an introductory review of the use of LMMs for within-participant data analysis and describes a free, simple, graphical user interface (LMMgui). LMMgui uses the package lme4 (Bates et al., 2014a,b) in the statistical environment R (R Core Team).

  17. MAI statistics estimation and analysis in a DS-CDMA system

    NASA Astrophysics Data System (ADS)

    Alami Hassani, A.; Zouak, M.; Mrabti, M.; Abdi, F.

    2018-05-01

    A primary limitation of Direct Sequence Code Division Multiple Access (DS-CDMA) link performance and system capacity is multiple access interference (MAI). To examine the performance of CDMA systems in the presence of MAI, i.e., in a multiuser environment, several works assumed that the interference can be approximated by a Gaussian random variable. In this paper, we first develop a new and simple approach to characterize the MAI in a multiuser system. In addition to statistically quantifying the MAI power, the paper also proposes a statistical model for both variance and mean of the MAI for synchronous and asynchronous CDMA transmission. We show that the MAI probability density function (PDF) is Gaussian for the equal-received-energy case and validate it by computer simulations.

  18. What to Do When K-Means Clustering Fails: A Simple yet Principled Alternative Algorithm.

    PubMed

    Raykov, Yordan P; Boukouvalas, Alexis; Baig, Fahd; Little, Max A

    The K-means algorithm is one of the most popular clustering algorithms in current use as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm which we call MAP-DP (maximum a-posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a-priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism.

  19. What to Do When K-Means Clustering Fails: A Simple yet Principled Alternative Algorithm

    PubMed Central

    Baig, Fahd; Little, Max A.

    2016-01-01

    The K-means algorithm is one of the most popular clustering algorithms in current use as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm which we call MAP-DP (maximum a-posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a-priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism. PMID:27669525

  20. Repeatability Modeling for Wind-Tunnel Measurements: Results for Three Langley Facilities

    NASA Technical Reports Server (NTRS)

    Hemsch, Michael J.; Houlden, Heather P.

    2014-01-01

    Data from extensive check standard tests of seven measurement processes in three NASA Langley Research Center wind tunnels are statistically analyzed to test a simple model previously presented in 2000 for characterizing short-term, within-test and across-test repeatability. The analysis is intended to support process improvement and development of uncertainty models for the measurements. The analysis suggests that the repeatability can be estimated adequately as a function of only the test section dynamic pressure over a two-orders-of-magnitude dynamic pressure range. As expected for low instrument loading, short-term coefficient repeatability is determined by the resolution of the instrument alone (air off). However, as previously pointed out, for the highest dynamic pressure range the coefficient repeatability appears to be independent of dynamic pressure, thus presenting a lower floor for the standard deviation for all three time frames. The simple repeatability model is shown to be adequate for all of the cases presented and for all three time frames.

  1. The Population Tracking Model: A Simple, Scalable Statistical Model for Neural Population Data

    PubMed Central

    O'Donnell, Cian; Gonçalves, J. Tiago; Whiteley, Nick; Portera-Cailliau, Carlos; Sejnowski, Terrence J.

    2017-01-01

    Our understanding of neural population coding has been limited by a lack of analysis methods to characterize spiking data from large populations. The biggest challenge comes from the fact that the number of possible network activity patterns scales exponentially with the number of neurons recorded (∼2^Neurons). Here we introduce a new statistical method for characterizing neural population activity that requires semi-independent fitting of only as many parameters as the square of the number of neurons, requiring drastically smaller data sets and minimal computation time. The model works by matching the population rate (the number of neurons synchronously active) and the probability that each individual neuron fires given the population rate. We found that this model can accurately fit synthetic data from up to 1000 neurons. We also found that the model could rapidly decode visual stimuli from neural population data from macaque primary visual cortex about 65 ms after stimulus onset. Finally, we used the model to estimate the entropy of neural population activity in developing mouse somatosensory cortex and, surprisingly, found that it first increases, and then decreases during development. This statistical model opens new options for interrogating neural population data and can bolster the use of modern large-scale in vivo Ca2+ and voltage imaging tools. PMID:27870612
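    The two quantities named in the abstract above (the population rate and each neuron's firing probability given that rate) can be estimated empirically from binarized spike data. The following is a minimal sketch under stated assumptions, not the authors' fitting procedure: the function name `population_tracking_stats` and the input layout (one 0/1 list per time bin) are hypothetical choices made for illustration.

```python
from collections import Counter


def population_tracking_stats(patterns):
    """Empirical statistics from binary activity patterns.

    patterns: list of equal-length lists of 0/1, one list per time bin
              (hypothetical input layout, chosen for this sketch).
    Returns (p_rate, p_fire) where
      p_rate[k]    = fraction of time bins with exactly k active neurons
      p_fire[k][i] = fraction of those bins in which neuron i was active
    """
    n_bins = len(patterns)
    n_neurons = len(patterns[0])

    # Count how often each population rate k (number of active neurons) occurs.
    rate_counts = Counter(sum(p) for p in patterns)
    p_rate = {k: count / n_bins for k, count in rate_counts.items()}

    # For each observed k, count how often each neuron is active.
    active = {k: [0] * n_neurons for k in rate_counts}
    for p in patterns:
        k = sum(p)
        for i, x in enumerate(p):
            active[k][i] += x

    p_fire = {
        k: [a / rate_counts[k] for a in active[k]]
        for k in rate_counts
    }
    return p_rate, p_fire


if __name__ == "__main__":
    demo = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 0]]
    rates, fire = population_tracking_stats(demo)
    print(rates)
    print(fire)
```

    With N neurons there are at most N + 1 possible rates and N conditional probabilities per rate, so the number of estimated quantities grows roughly as the square of N, in line with the parameter count the abstract mentions.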

  2. OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

    PubMed

    Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

    2012-01-01

    The objective of the present study was to assess the comparable applicability of the orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation in the first week of admission and again six months later. All data was primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.

  3. Statistical Selection of Biological Models for Genome-Wide Association Analyses.

    PubMed

    Bi, Wenjian; Kang, Guolian; Pounds, Stanley B

    2018-05-24

    Genome-wide association studies have discovered many biologically important associations of genes with phenotypes. Typically, genome-wide association analyses formally test the association of each genetic feature (SNP, CNV, etc) with the phenotype of interest and summarize the results with multiplicity-adjusted p-values. However, very small p-values only provide evidence against the null hypothesis of no association without indicating which biological model best explains the observed data. Correctly identifying a specific biological model may improve the scientific interpretation and can be used to more effectively select and design a follow-up validation study. Thus, statistical methodology to identify the correct biological model for a particular genotype-phenotype association can be very useful to investigators. Here, we propose a general statistical method to summarize how accurately each of five biological models (null, additive, dominant, recessive, co-dominant) represents the data observed for each variant in a GWAS study. We show that the new method stringently controls the false discovery rate and asymptotically selects the correct biological model. Simulations of two-stage discovery-validation studies show that the new method has these properties and that its validation power is similar to or exceeds that of simple methods that use the same statistical model for all SNPs. Example analyses of three data sets also highlight these advantages of the new method. An R package is freely available at www.stjuderesearch.org/site/depts/biostats/maew.

  4. Heuristic Identification of Biological Architectures for Simulating Complex Hierarchical Genetic Interactions

    PubMed Central

    Moore, Jason H; Amos, Ryan; Kiralis, Jeff; Andrews, Peter C

    2015-01-01

    Simulation plays an essential role in the development of new computational and statistical methods for the genetic analysis of complex traits. Most simulations start with a statistical model using methods such as linear or logistic regression that specify the relationship between genotype and phenotype. This is appealing due to its simplicity and because these statistical methods are commonly used in genetic analysis. It is our working hypothesis that simulations need to move beyond simple statistical models to more realistically represent the biological complexity of genetic architecture. The goal of the present study was to develop a prototype genotype–phenotype simulation method and software that are capable of simulating complex genetic effects within the context of a hierarchical biology-based framework. Specifically, our goal is to simulate multilocus epistasis or gene–gene interaction where the genetic variants are organized within the framework of one or more genes, their regulatory regions and other regulatory loci. We introduce here the Heuristic Identification of Biological Architectures for simulating Complex Hierarchical Interactions (HIBACHI) method and prototype software for simulating data in this manner. This approach combines a biological hierarchy, a flexible mathematical framework, a liability threshold model for defining disease endpoints, and a heuristic search strategy for identifying high-order epistatic models of disease susceptibility. We provide several simulation examples using genetic models exhibiting independent main effects and three-way epistatic effects. PMID:25395175

  5. Simulating statistics of lightning-induced and man made fires

    NASA Astrophysics Data System (ADS)

    Krenn, R.; Hergarten, S.

    2009-04-01

    The frequency-area distributions of forest fires show power-law behavior with scaling exponents α in a quite narrow range, relating wildfire research to the theoretical framework of self-organized criticality. Examples of self-organized critical behavior can be found in computer simulations of simple cellular automata. The established self-organized critical Drossel-Schwabl forest fire model (DS-FFM) is one of the most widespread models in this context. Despite its qualitative agreement with event-size statistics from nature, its applicability is still questioned. Apart from general concerns that the DS-FFM apparently oversimplifies the complex nature of forest dynamics, it significantly overestimates the frequency of large fires. We present a straightforward modification of the model rules that increases the scaling exponent α by approximately 1•3 and brings the simulated event-size statistics close to those observed in nature. In addition, combined simulations of both the original and the modified model predict a dependence of the overall distribution on the ratio of lightning induced and man made fires as well as a difference between their respective event-size statistics. The increase of the scaling exponent with decreasing lightning probability as well as the splitting of the partial distributions are confirmed by the analysis of the Canadian Large Fire Database. As a consequence, lightning induced and man made forest fires cannot be treated separately in wildfire modeling, hazard assessment and forest management.
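    For readers unfamiliar with the Drossel-Schwabl forest fire model mentioned above, the following is a minimal sketch of the model in its commonly used simplified form: trees are planted at random sites, lightning strikes a random site, and a struck tree burns its whole connected cluster at once. It does not implement the modified rules proposed in the abstract. The function names, grid size, non-periodic boundaries and the `growth_per_strike` parameter are assumptions made for illustration.

```python
import random

EMPTY, TREE = 0, 1


def burn_cluster(grid, row, col):
    """Burn the 4-neighbour tree cluster containing (row, col).

    Returns the number of trees burned (0 if the site holds no tree).
    Assumes a square grid with non-periodic boundaries.
    """
    if grid[row][col] != TREE:
        return 0
    n = len(grid)
    grid[row][col] = EMPTY
    stack = [(row, col)]
    size = 0
    while stack:
        r, c = stack.pop()
        size += 1
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n and 0 <= cc < n and grid[rr][cc] == TREE:
                grid[rr][cc] = EMPTY  # mark as burned before pushing
                stack.append((rr, cc))
    return size


def simulate(n=32, steps=2000, growth_per_strike=50, seed=0):
    """Run the simplified model and return the list of fire sizes."""
    rng = random.Random(seed)
    grid = [[EMPTY] * n for _ in range(n)]
    fire_sizes = []
    for _ in range(steps):
        # Growth phase: try to plant trees at randomly chosen sites.
        for _ in range(growth_per_strike):
            grid[rng.randrange(n)][rng.randrange(n)] = TREE
        # Lightning phase: strike one random site.
        size = burn_cluster(grid, rng.randrange(n), rng.randrange(n))
        if size:
            fire_sizes.append(size)
    return fire_sizes


if __name__ == "__main__":
    sizes = simulate()
    print(len(sizes), "fires; largest burned", max(sizes), "sites")
```

    A histogram of the returned fire sizes on logarithmic axes is the usual way to inspect the frequency-area distribution and estimate a scaling exponent.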

  6. Cortical Surround Interactions and Perceptual Salience via Natural Scene Statistics

    PubMed Central

    Coen-Cagli, Ruben; Dayan, Peter; Schwartz, Odelia

    2012-01-01

    Spatial context in images induces perceptual phenomena associated with salience and modulates the responses of neurons in primary visual cortex (V1). However, the computational and ecological principles underlying contextual effects are incompletely understood. We introduce a model of natural images that includes grouping and segmentation of neighboring features based on their joint statistics, and we interpret the firing rates of V1 neurons as performing optimal recognition in this model. We show that this leads to a substantial generalization of divisive normalization, a computation that is ubiquitous in many neural areas and systems. A main novelty in our model is that the influence of the context on a target stimulus is determined by their degree of statistical dependence. We optimized the parameters of the model on natural image patches, and then simulated neural and perceptual responses on stimuli used in classical experiments. The model reproduces some rich and complex response patterns observed in V1, such as the contrast dependence, orientation tuning and spatial asymmetry of surround suppression, while also allowing for surround facilitation under conditions of weak stimulation. It also mimics the perceptual salience produced by simple displays, and leads to readily testable predictions. Our results provide a principled account of orientation-based contextual modulation in early vision and its sensitivity to the homogeneity and spatial arrangement of inputs, and lend statistical support to the theory that V1 computes visual salience. PMID:22396635

  7. Does Specification Matter? Experiments with Simple Multiregional Probabilistic Population Projections

    PubMed Central

    Raymer, James; Abel, Guy J.; Rogers, Andrei

    2012-01-01

    Population projection models that introduce uncertainty are a growing subset of projection models in general. In this paper, we focus on the importance of decisions made with regard to the model specifications adopted. We compare the forecasts and prediction intervals associated with four simple regional population projection models: an overall growth rate model, a component model with net migration, a component model with in-migration and out-migration rates, and a multiregional model with destination-specific out-migration rates. Vector autoregressive models are used to forecast future rates of growth, birth, death, net migration, in-migration and out-migration, and destination-specific out-migration for the North, Midlands and South regions in England. They are also used to forecast different international migration measures. The base data represent a time series of annual data provided by the Office for National Statistics from 1976 to 2008. The results illustrate how both the forecasted subpopulation totals and the corresponding prediction intervals differ for the multiregional model in comparison to other simpler models, as well as for different assumptions about international migration. The paper ends with a discussion of our results and possible directions for future research. PMID:23236221

  8. Variety and volatility in financial markets

    NASA Astrophysics Data System (ADS)

    Lillo, Fabrizio; Mantegna, Rosario N.

    2000-11-01

    We study the price dynamics of stocks traded in a financial market by considering the statistical properties of both a single time series and an ensemble of stocks traded simultaneously. We use the n stocks traded on the New York Stock Exchange to form a statistical ensemble of daily stock returns. For each trading day of our database, we study the ensemble return distribution. We find that a typical ensemble return distribution exists in most of the trading days with the exception of crash and rally days and of the days following these extreme events. We analyze each ensemble return distribution by extracting its first two central moments. We observe that these moments fluctuate in time and are stochastic processes, themselves. We characterize the statistical properties of ensemble return distribution central moments by investigating their probability density functions and temporal correlation properties. In general, time-averaged and portfolio-averaged price returns have different statistical properties. We infer from these differences information about the relative strength of correlation between stocks and between different trading days. Last, we compare our empirical results with those predicted by the single-index model and we conclude that this simple model cannot explain the statistical properties of the second moment of the ensemble return distribution.
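    The per-day ensemble statistics described above can be illustrated with a short computation. This is a minimal sketch, not the authors' code: it assumes the daily returns are already available as one list per trading day, and it takes the "first two central moments" to mean the cross-sectional mean and the (population) variance. The function name `ensemble_moments` is hypothetical.

```python
def ensemble_moments(returns_by_day):
    """Cross-sectional mean and variance of stock returns for each day.

    returns_by_day: list of lists; returns_by_day[t][i] is the return of
                    stock i on trading day t (assumed input layout).
    Returns a list of (mean, variance) tuples, one per trading day.
    """
    moments = []
    for day in returns_by_day:
        n = len(day)
        mean = sum(day) / n
        # Population variance: average squared deviation from the mean.
        variance = sum((r - mean) ** 2 for r in day) / n
        moments.append((mean, variance))
    return moments


if __name__ == "__main__":
    demo = [[0.01, 0.03], [-0.02, 0.02, 0.0]]
    for t, (m, v) in enumerate(ensemble_moments(demo)):
        print(f"day {t}: mean={m:.4f} variance={v:.6f}")
```

    The resulting sequence of (mean, variance) pairs is itself a pair of time series, which is what allows the moments to be studied as stochastic processes.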

  9. Sampling methods to the statistical control of the production of blood components.

    PubMed

    Pereira, Paulo; Seghatchian, Jerard; Caldeira, Beatriz; Santos, Paula; Castro, Rosa; Fernandes, Teresa; Xavier, Sandra; de Sousa, Gracinda; de Almeida E Sousa, João Paulo

    2017-12-01

    The control of blood component specifications is a requirement generalized in Europe by the European Commission Directives and in the US by the AABB standards. The use of a statistical process control methodology is recommended in the related literature, including the EDQM guideline. The control reliability is dependent on the sampling. However, a correct sampling methodology does not seem to be systematically applied. Commonly, the sampling is intended to comply uniquely with the 1% specification to the produced blood components. Nevertheless, from a purely statistical viewpoint, this model could be argued not to be related to a consistent sampling technique. This could be a severe limitation to detect abnormal patterns and to assure that the production has a non-significant probability of producing nonconforming components. This article discusses what is happening in blood establishments. Three statistical methodologies are proposed: simple random sampling, sampling based on the proportion of a finite population, and sampling based on the inspection level. The empirical results demonstrate that these models are practicable in blood establishments contributing to the robustness of sampling and related statistical process control decisions for the purpose they are suggested for.

  10. An accurate behavioral model for single-photon avalanche diode statistical performance simulation

    NASA Astrophysics Data System (ADS)

    Xu, Yue; Zhao, Tingchen; Li, Ding

    2018-01-01

    An accurate behavioral model is presented to simulate important statistical performance of single-photon avalanche diodes (SPADs), such as dark count and after-pulsing noise. The derived simulation model takes into account all important generation mechanisms of the two kinds of noise. For the first time, thermal agitation, trap-assisted tunneling and band-to-band tunneling mechanisms are simultaneously incorporated in the simulation model to evaluate dark count behavior of SPADs fabricated in deep sub-micron CMOS technology. Meanwhile, a complete carrier trapping and de-trapping process is considered in the after-pulsing model and a simple analytical expression is derived to estimate after-pulsing probability. In particular, the key model parameters of avalanche triggering probability and electric field dependence of excess bias voltage are extracted from Geiger-mode TCAD simulation and this behavioral simulation model doesn't include any empirical parameters. The developed SPAD model is implemented in Verilog-A behavioral hardware description language and successfully operated on the commercial Cadence Spectre simulator, showing good universality and compatibility. The model simulation results are in good accordance with the test data, validating high simulation accuracy.

  11. ecode - Electron Transport Algorithm Testing v. 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Franke, Brian C.; Olson, Aaron J.; Bruss, Donald Eugene

    2016-10-05

    ecode is a Monte Carlo code used for testing algorithms related to electron transport. The code can read basic physics parameters, such as energy-dependent stopping powers and screening parameters. The code permits simple planar geometries of slabs or cubes. Parallelization consists of domain replication, with work distributed at the start of the calculation and statistical results gathered at the end of the calculation. Some basic routines (such as input parsing, random number generation, and statistics processing) are shared with the Integrated Tiger Series codes. A variety of algorithms for uncertainty propagation are incorporated based on the stochastic collocation and stochastic Galerkin methods. These permit uncertainty only in the total and angular scattering cross sections. The code contains algorithms for simulating stochastic mixtures of two materials. The physics is approximate, ranging from mono-energetic and isotropic scattering to screened Rutherford angular scattering and Rutherford energy-loss scattering (simple electron transport models). No production of secondary particles is implemented, and no photon physics is implemented.

  12. Blended particle filters for large-dimensional chaotic dynamical systems

    PubMed Central

    Majda, Andrew J.; Qi, Di; Sapsis, Themistoklis P.

    2014-01-01

    A major challenge in contemporary data science is the development of statistically accurate particle filters to capture non-Gaussian features in large-dimensional chaotic dynamical systems. Blended particle filters that capture non-Gaussian features in an adaptively evolving low-dimensional subspace through particles interacting with evolving Gaussian statistics on the remaining portion of phase space are introduced here. These blended particle filters are constructed in this paper through a mathematical formalism involving conditional Gaussian mixtures combined with statistically nonlinear forecast models compatible with this structure developed recently with high skill for uncertainty quantification. Stringent test cases for filtering involving the 40-dimensional Lorenz 96 model with a 5-dimensional adaptive subspace for nonlinear blended filtering in various turbulent regimes with at least nine positive Lyapunov exponents are used here. These cases demonstrate the high skill of the blended particle filter algorithms in capturing both highly non-Gaussian dynamical features as well as crucial nonlinear statistics for accurate filtering in extreme filtering regimes with sparse infrequent high-quality observations. The formalism developed here is also useful for multiscale filtering of turbulent systems and a simple application is sketched below. PMID:24825886

  13. Statistical sensitivity analysis of a simple nuclear waste repository model

    NASA Astrophysics Data System (ADS)

    Ronen, Y.; Lucius, J. L.; Blow, E. M.

    1980-06-01

    This work is a preliminary step in a comprehensive sensitivity analysis of the modeling of a nuclear waste repository. The purpose of the complete analysis is to determine which modeling parameters and physical data are most important in determining key design performance criteria and then to obtain the uncertainty in the design for safety considerations. The theory for a statistical screening design methodology is developed for later use in the overall program. The theory was applied to the test case of determining the relative importance of the sensitivity of near field temperature distribution in a single level salt repository to modeling parameters. The exact values of the sensitivities to these physical and modeling parameters were then obtained using direct methods of recalculation. The sensitivity coefficients found to be important for the sample problem were thermal loading, distance between the spent fuel canisters and their radius. Other important parameters were those related to salt properties at a point of interest in the repository.

  14. An alternative way to evaluate chemistry-transport model variability

    NASA Astrophysics Data System (ADS)

    Menut, Laurent; Mailler, Sylvain; Bessagnet, Bertrand; Siour, Guillaume; Colette, Augustin; Couvidat, Florian; Meleux, Frédérik

    2017-03-01

    A simple and complementary model evaluation technique for regional chemistry transport is discussed. The methodology is based on the concept that we can learn about model performance by comparing the simulation results with observational data available for time periods other than the period originally targeted. First, the statistical indicators selected in this study (spatial and temporal correlations) are computed for a given time period, using colocated observation and simulation data in time and space. Second, the same indicators are used to calculate scores for several other years while conserving the spatial locations and Julian days of the year. The difference between the results provides useful insights on the model capability to reproduce the observed day-to-day and spatial variability. In order to synthesize the large amount of results, a new indicator is proposed, designed to compare several error statistics between all the years of validation and to quantify whether the period and area being studied were well captured by the model for the correct reasons.

  15. Stochastic Spatial Models in Ecology: A Statistical Physics Approach

    NASA Astrophysics Data System (ADS)

    Pigolotti, Simone; Cencini, Massimo; Molina, Daniel; Muñoz, Miguel A.

    2018-07-01

    Ecosystems display a complex spatial organization. Ecologists have long tried to characterize them by looking at how different measures of biodiversity change across spatial scales. Ecological neutral theory has provided simple predictions accounting for general empirical patterns in communities of competing species. However, while neutral theory in well-mixed ecosystems is mathematically well understood, spatial models still present several open problems, limiting the quantitative understanding of spatial biodiversity. In this review, we discuss the state of the art in spatial neutral theory. We emphasize the connection between spatial ecological models and the physics of non-equilibrium phase transitions and how concepts developed in statistical physics translate in population dynamics, and vice versa. We focus on non-trivial scaling laws arising at the critical dimension D = 2 of spatial neutral models, and their relevance for biological populations inhabiting two-dimensional environments. We conclude by discussing models incorporating non-neutral effects in the form of spatial and temporal disorder, and analyze how their predictions deviate from those of purely neutral theories.

  16. Combining forecast weights: Why and how?

    NASA Astrophysics Data System (ADS)

    Yin, Yip Chee; Kok-Haur, Ng; Hock-Eam, Lim

    2012-09-01

    This paper proposes a procedure called forecast weight averaging, which is a specific combination of forecast weights obtained from different methods of constructing forecast weights, for the purpose of improving the accuracy of pseudo out-of-sample forecasting. It is found that under certain specified conditions, forecast weight averaging can lower the mean squared forecast error obtained from model averaging. In addition, we show that in a linear and homoskedastic environment, this superior predictive ability of forecast weight averaging holds true irrespective of whether the coefficients are tested by the t statistic or the z statistic, provided the significance level is within the 10% range. By theoretical proofs and a simulation study, we show that model averaging methods such as variance model averaging, simple model averaging and standard error model averaging each produce a mean squared forecast error larger than that of forecast weight averaging. Finally, this result also holds true marginally when applied to business and economic empirical data sets: Gross Domestic Product (GDP) growth rate, Consumer Price Index (CPI) and Average Lending Rate (ALR) of Malaysia.

  17. Stochastic Spatial Models in Ecology: A Statistical Physics Approach

    NASA Astrophysics Data System (ADS)

    Pigolotti, Simone; Cencini, Massimo; Molina, Daniel; Muñoz, Miguel A.

    2017-11-01

    Ecosystems display a complex spatial organization. Ecologists have long tried to characterize them by looking at how different measures of biodiversity change across spatial scales. Ecological neutral theory has provided simple predictions accounting for general empirical patterns in communities of competing species. However, while neutral theory in well-mixed ecosystems is mathematically well understood, spatial models still present several open problems, limiting the quantitative understanding of spatial biodiversity. In this review, we discuss the state of the art in spatial neutral theory. We emphasize the connection between spatial ecological models and the physics of non-equilibrium phase transitions and how concepts developed in statistical physics translate in population dynamics, and vice versa. We focus on non-trivial scaling laws arising at the critical dimension D = 2 of spatial neutral models, and their relevance for biological populations inhabiting two-dimensional environments. We conclude by discussing models incorporating non-neutral effects in the form of spatial and temporal disorder, and analyze how their predictions deviate from those of purely neutral theories.

  18. Statistical mechanical model of gas adsorption in porous crystals with dynamic moieties

    PubMed Central

    Braun, Efrem; Carraro, Carlo; Smit, Berend

    2017-01-01

    Some nanoporous, crystalline materials possess dynamic constituents, for example, rotatable moieties. These moieties can undergo a conformation change in response to the adsorption of guest molecules, which qualitatively impacts adsorption behavior. We pose and solve a statistical mechanical model of gas adsorption in a porous crystal whose cages share a common ligand that can adopt two distinct rotational conformations. Guest molecules incentivize the ligands to adopt a different rotational configuration than maintained in the empty host. Our model captures inflections, steps, and hysteresis that can arise in the adsorption isotherm as a signature of the rotating ligands. The insights disclosed by our simple model contribute a more intimate understanding of the response and consequence of rotating ligands integrated into porous materials to harness them for gas storage and separations, chemical sensing, drug delivery, catalysis, and nanoscale devices. Particularly, our model reveals design strategies to exploit these moving constituents and engineer improved adsorbents with intrinsic thermal management for pressure-swing adsorption processes. PMID:28049851

  19. Statistical mechanical model of gas adsorption in porous crystals with dynamic moieties.

    PubMed

    Simon, Cory M; Braun, Efrem; Carraro, Carlo; Smit, Berend

    2017-01-17

    Some nanoporous, crystalline materials possess dynamic constituents, for example, rotatable moieties. These moieties can undergo a conformation change in response to the adsorption of guest molecules, which qualitatively impacts adsorption behavior. We pose and solve a statistical mechanical model of gas adsorption in a porous crystal whose cages share a common ligand that can adopt two distinct rotational conformations. Guest molecules incentivize the ligands to adopt a different rotational configuration than maintained in the empty host. Our model captures inflections, steps, and hysteresis that can arise in the adsorption isotherm as a signature of the rotating ligands. The insights disclosed by our simple model contribute a more intimate understanding of the response and consequence of rotating ligands integrated into porous materials to harness them for gas storage and separations, chemical sensing, drug delivery, catalysis, and nanoscale devices. Particularly, our model reveals design strategies to exploit these moving constituents and engineer improved adsorbents with intrinsic thermal management for pressure-swing adsorption processes.

  20. Statistical Issues for Uncontrolled Reentry Hazards Empirical Tests of the Predicted Footprint for Uncontrolled Satellite Reentry Hazards

    NASA Technical Reports Server (NTRS)

    Matney, Mark

    2011-01-01

    A number of statistical tools have been developed over the years for assessing the risk of reentering objects to human populations. These tools make use of the characteristics (e.g., mass, material, shape, size) of debris that are predicted by aerothermal models to survive reentry. The statistical tools use this information to compute the probability that one or more of the surviving debris might hit a person on the ground and cause one or more casualties. The statistical portion of the analysis relies on a number of assumptions about how the debris footprint and the human population are distributed in latitude and longitude, and how to use that information to arrive at realistic risk numbers. Because this information is used in making policy and engineering decisions, it is important that these assumptions be tested using empirical data. This study uses the latest database of known uncontrolled reentry locations measured by the United States Department of Defense. The predicted ground footprint distributions of these objects are based on the theory that their orbits behave basically like simple Kepler orbits. However, there are a number of factors in the final stages of reentry - including the effects of gravitational harmonics, the effects of the Earth's equatorial bulge on the atmosphere, and the rotation of the Earth and atmosphere - that could cause them to diverge from simple Kepler orbit behavior and possibly change the probability of reentering over a given location. In this paper, the measured latitude and longitude distributions of these objects are directly compared with the predicted distributions, providing a fundamental empirical test of the model assumptions.

  1. Cervical Vertebral Body's Volume as a New Parameter for Predicting the Skeletal Maturation Stages.

    PubMed

    Choi, Youn-Kyung; Kim, Jinmi; Yamaguchi, Tetsutaro; Maki, Koutaro; Ko, Ching-Chang; Kim, Yong-Il

    2016-01-01

    This study aimed to determine the correlation between the volumetric parameters derived from the images of the second, third, and fourth cervical vertebrae by using cone beam computed tomography with skeletal maturation stages and to propose a new formula for predicting skeletal maturation by using regression analysis. We obtained the estimation of skeletal maturation levels from hand-wrist radiographs and volume parameters derived from the second, third, and fourth cervical vertebrae bodies from 102 Japanese patients (54 women and 48 men, 5-18 years of age). We performed Pearson's correlation coefficient analysis and simple regression analysis. All volume parameters derived from the second, third, and fourth cervical vertebrae exhibited statistically significant correlations (P < 0.05). The simple regression model with the greatest R-square indicated the fourth-cervical-vertebra volume as an independent variable with a variance inflation factor less than ten. The explanation power was 81.76%. Volumetric parameters of cervical vertebrae using cone beam computed tomography are useful in regression models. The derived regression model has the potential for clinical application as it enables a simple and quantitative analysis to evaluate skeletal maturation level.

  2. Cervical Vertebral Body's Volume as a New Parameter for Predicting the Skeletal Maturation Stages

    PubMed Central

    Choi, Youn-Kyung; Kim, Jinmi; Maki, Koutaro; Ko, Ching-Chang

    2016-01-01

    This study aimed to determine the correlation between the volumetric parameters derived from the images of the second, third, and fourth cervical vertebrae by using cone beam computed tomography with skeletal maturation stages and to propose a new formula for predicting skeletal maturation by using regression analysis. We obtained the estimation of skeletal maturation levels from hand-wrist radiographs and volume parameters derived from the second, third, and fourth cervical vertebrae bodies from 102 Japanese patients (54 women and 48 men, 5–18 years of age). We performed Pearson's correlation coefficient analysis and simple regression analysis. All volume parameters derived from the second, third, and fourth cervical vertebrae exhibited statistically significant correlations (P < 0.05). The simple regression model with the greatest R-square indicated the fourth-cervical-vertebra volume as an independent variable with a variance inflation factor less than ten. The explanation power was 81.76%. Volumetric parameters of cervical vertebrae using cone beam computed tomography are useful in regression models. The derived regression model has the potential for clinical application as it enables a simple and quantitative analysis to evaluate skeletal maturation level. PMID:27340668

  3. Development of the Concept of Energy Conservation using Simple Experiments for Grade 10 Students

    NASA Astrophysics Data System (ADS)

    Rachniyom, S.; Toedtanya, K.; Wuttiprom, S.

    2017-09-01

    The purpose of this research was to develop students’ concept of and retention rate in relation to energy conservation. Activities included simple and easy experiments that considered energy transformation from potential to kinetic energy. The participants were 30 purposively selected grade 10 students in the second semester of the 2016 academic year. The research tools consisted of learning lesson plans and a learning achievement test. Results showed that the experiments worked well and were appropriate as learning activities. The students’ achievement scores significantly increased at the statistical level of .05, the students’ retention rates were at a high level, and learning behaviour was at a good level. These simple experiments allowed students to learn to demonstrate to their peers and encouraged them to use familiar models to explain phenomena in daily life.

  4. Simulating Metabolism with Statistical Thermodynamics

    PubMed Central

    Cannon, William R.

    2014-01-01

    New methods are needed for large scale modeling of metabolism that predict metabolite levels and characterize the thermodynamics of individual reactions and pathways. Current approaches use either kinetic simulations, which are difficult to extend to large networks of reactions because of the need for rate constants, or flux-based methods, which have a large number of feasible solutions because they are unconstrained by the law of mass action. This report presents an alternative modeling approach based on statistical thermodynamics. The principles of this approach are demonstrated using a simple set of coupled reactions, and then the system is characterized with respect to the changes in energy, entropy, free energy, and entropy production. Finally, the physical and biochemical insights that this approach can provide for metabolism are demonstrated by application to the tricarboxylic acid (TCA) cycle of Escherichia coli. The reaction and pathway thermodynamics are evaluated and predictions are made regarding changes in concentration of TCA cycle intermediates due to 10- and 100-fold changes in the ratio of NAD+:NADH concentrations. Finally, the assumptions and caveats regarding the use of statistical thermodynamics to model non-equilibrium reactions are discussed. PMID:25089525

  5. Simulating metabolism with statistical thermodynamics.

    PubMed

    Cannon, William R

    2014-01-01

    New methods are needed for large scale modeling of metabolism that predict metabolite levels and characterize the thermodynamics of individual reactions and pathways. Current approaches use either kinetic simulations, which are difficult to extend to large networks of reactions because of the need for rate constants, or flux-based methods, which have a large number of feasible solutions because they are unconstrained by the law of mass action. This report presents an alternative modeling approach based on statistical thermodynamics. The principles of this approach are demonstrated using a simple set of coupled reactions, and then the system is characterized with respect to the changes in energy, entropy, free energy, and entropy production. Finally, the physical and biochemical insights that this approach can provide for metabolism are demonstrated by application to the tricarboxylic acid (TCA) cycle of Escherichia coli. The reaction and pathway thermodynamics are evaluated and predictions are made regarding changes in concentration of TCA cycle intermediates due to 10- and 100-fold changes in the ratio of NAD+:NADH concentrations. Finally, the assumptions and caveats regarding the use of statistical thermodynamics to model non-equilibrium reactions are discussed.

  6. Prediction of N-nitrosodimethylamine (NDMA) formation as a disinfection by-product.

    PubMed

    Kim, Jongo; Clevenger, Thomas E

    2007-06-25

    This study investigated the possibility of a statistical model application for the prediction of N-nitrosodimethylamine (NDMA) formation. The NDMA formation was studied as a function of monochloramine concentration (0.001-5mM) at fixed dimethylamine (DMA) concentrations of 0.01mM or 0.05mM. Excellent linear correlations were observed between the molar ratio of monochloramine to DMA and the NDMA formation on a log scale at pH 7 and 8. When a developed prediction equation was applied to a previously reported study, a good result was obtained. The statistical model appears to predict adequately NDMA concentrations if other NDMA precursors are excluded. Using the predictive tool, a simple and approximate calculation of NDMA formation can be obtained in drinking water systems.

  7. An investigation into the causes of stratospheric ozone loss in the southern Australasian region

    NASA Astrophysics Data System (ADS)

    Lehmann, P.; Karoly, D. J.; Newmann, P. A.; Clarkson, T. S.; Matthews, W. A.

    1992-07-01

    Measurements of total ozone at Macquarie Island (55 deg S, 159 deg E) reveal statistically significant reductions of approximately twelve percent during July to September when comparing the mean levels for 1987-90 with those in the seventies. In order to investigate the possibility that these ozone changes may not be a result of dynamic variability of the stratosphere, a simple linear model of ozone was created from statistical analysis of tropopause height and isentropic transient eddy heat flux, which were assumed representative of the dominant dynamic influences. Comparison of measured and modeled ozone indicates that the recent downward trend in ozone at Macquarie Island is not related to stratospheric dynamic variability and therefore suggests another mechanism, possibly changes in photochemical destruction of ozone.

  8. Modeling and replicating statistical topology and evidence for CMB nonhomogeneity

    PubMed Central

    Agami, Sarit

    2017-01-01

    Under the banner of “big data,” the detection and classification of structure in extremely large, high-dimensional, data sets are two of the central statistical challenges of our times. Among the most intriguing new approaches to this challenge is “TDA,” or “topological data analysis,” one of the primary aims of which is providing nonmetric, but topologically informative, preanalyses of data which make later, more quantitative, analyses feasible. While TDA rests on strong mathematical foundations from topology, in applications, it has faced challenges due to difficulties in handling issues of statistical reliability and robustness, often leading to an inability to make scientific claims with verifiable levels of statistical confidence. We propose a methodology for the parametric representation, estimation, and replication of persistence diagrams, the main diagnostic tool of TDA. The power of the methodology lies in the fact that even if only one persistence diagram is available for analysis—the typical case for big data applications—the replications permit conventional statistical hypothesis testing. The methodology is conceptually simple and computationally practical, and provides a broadly effective statistical framework for persistence diagram TDA analysis. We demonstrate the basic ideas on a toy example, and the power of the parametric approach to TDA modeling in an analysis of cosmic microwave background (CMB) nonhomogeneity. PMID:29078301

  9. Variational Approach in the Theory of Liquid-Crystal State

    NASA Astrophysics Data System (ADS)

    Gevorkyan, E. V.

    2018-03-01

    The variational calculus by Leonhard Euler is the basis for modern mathematics and theoretical physics. The efficiency of the variational approach in the statistical theory of the liquid-crystal state and, more generally, in condensed-state theory is shown. In particular, the developed approach allows us to correctly introduce effective pair interactions and to optimize simple models of liquid crystals with the help of realistic intermolecular potentials.

  10. Student Test Scores: How the Sausage Is Made and Why You Should Care. Evidence Speaks Reports, Vol 1, #25

    ERIC Educational Resources Information Center

    Jacob, Brian A.

    2016-01-01

    Contrary to popular belief, modern cognitive assessments--including the new Common Core tests--produce test scores based on sophisticated statistical models rather than the simple percent of items a student answers correctly. While there are good reasons for this, it means that reported test scores depend on many decisions made by test designers,…

  11. Statistically Derived System Relationship Models for the SASSY Management Unit, 1st Force Service Support Group, Camp Pendelton, California.

    DTIC Science & Technology

    1981-06-01

    TI-59 programmable calculator to aid...training. The Texas Instruments TI-59 Programmable Calculator has only ten lettered registers that would be simple for clerical personnel to use (A...SASSY Management Units. Appendix C is a set of user instructions written for the Texas Instrument TI-59 Programmable Calculator. The TI-59 was

  12. The change and development of statistical methods used in research articles in child development 1930-2010.

    PubMed

    Køppe, Simo; Dammeyer, Jesper

    2014-09-01

    The evolution of developmental psychology has been characterized by the use of different quantitative and qualitative methods and procedures. But how does the use of methods and procedures change over time? This study explores the change and development of statistical methods used in articles published in Child Development from 1930 to 2010. The methods used in every article in the first issue of every volume were categorized into four categories. Until 1980 relatively simple statistical methods were used. During the last 30 years there has been explosive growth in the use of more advanced statistical methods. Articles with no statistical methods, or with only simple methods, have been eliminated.

  13. Preliminary comparative assessment of PM10 hourly measurement results from new monitoring stations type using stochastic and exploratory methodology and models

    NASA Astrophysics Data System (ADS)

    Czechowski, Piotr Oskar; Owczarek, Tomasz; Badyda, Artur; Majewski, Grzegorz; Rogulski, Mariusz; Ogrodnik, Paweł

    2018-01-01

    The paper presents selected key issues from the preliminary stage of a proposed extended equivalence assessment of measurement results for new portable devices: the comparability, by statistical methods, of hourly PM10 concentration series with measurement results from a reference station. The article also presents technical aspects of the new portable meters. Emphasis is placed on the comparability of the results using a methodology based on stochastic and exploratory methods. The concept rests on the observation that a simple comparison of result series in the time domain is insufficient; regularities should be compared in three complementary fields of statistical modeling: time, frequency, and space. The proposal is based on model results for five annual series of measurements from the new mobile devices and from the WIOS (Provincial Environmental Protection Inspectorate) reference station located in the city of Nowy Sacz. The results indicate both the completeness of the comparison methodology and the high correspondence between the measurements from the new devices and the reference.

  14. Statistical Modeling of Robotic Random Walks on Different Terrain

    NASA Astrophysics Data System (ADS)

    Naylor, Austin; Kinnaman, Laura

    Issues of public safety, especially with crowd dynamics and pedestrian movement, have been modeled by physicists using methods from statistical mechanics over the last few years. Complex decision making of humans moving on different terrains can be modeled using random walks (RW) and correlated random walks (CRW). The effect of different terrains, such as a constant increasing slope, on RW and CRW was explored. LEGO robots were programmed to make RW and CRW with uniform step sizes. Level ground tests demonstrated that the robots had the expected step size distribution and correlation angles (for CRW). The mean square displacement was calculated for each RW and CRW on different terrains and matched expected trends. The step size distribution was determined to change based on the terrain; theoretical predictions for the step size distribution were made for various simple terrains.
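
    The RW/CRW comparison described in this record can be illustrated with a minimal simulation sketch. It is not the authors' robot code: the function names, the uniform turning-angle rule for the CRW, and the flat terrain are all assumptions made for illustration.

```python
import math
import random

def simulate_walk(n_steps, step=1.0, max_turn=None, rng=None):
    """Return the (x, y) path of a 2-D walk with uniform step size.

    max_turn=None : uncorrelated random walk (RW); every heading is drawn
                    uniformly from [0, 2*pi).
    max_turn=a    : correlated random walk (CRW); each heading is the
                    previous one plus a uniform turn in [-a, a] radians.
                    (This turning rule is an illustrative assumption.)
    """
    rng = rng or random.Random()
    x = y = 0.0
    heading = rng.uniform(0.0, 2.0 * math.pi)
    path = [(x, y)]
    for _ in range(n_steps):
        if max_turn is None:
            heading = rng.uniform(0.0, 2.0 * math.pi)
        else:
            heading += rng.uniform(-max_turn, max_turn)
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        path.append((x, y))
    return path

def mean_square_displacement(n_walks, n_steps, seed=0, **walk_kwargs):
    """Average squared end-to-end distance over n_walks independent walks."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walks):
        x, y = simulate_walk(n_steps, rng=rng, **walk_kwargs)[-1]
        total += x * x + y * y
    return total / n_walks

if __name__ == "__main__":
    print("RW  MSD:", mean_square_displacement(2000, 100))
    print("CRW MSD:", mean_square_displacement(2000, 100, max_turn=0.5))
```

    For the uncorrelated walk the mean square displacement is expected to grow as the number of steps times the squared step length, while directional persistence in the CRW makes it grow faster over the same number of steps.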

  15. Quantum error-correction failure distributions: Comparison of coherent and stochastic error models

    NASA Astrophysics Data System (ADS)

    Barnes, Jeff P.; Trout, Colin J.; Lucarelli, Dennis; Clader, B. D.

    2017-06-01

    We compare failure distributions of quantum error correction circuits for stochastic errors and coherent errors. We utilize a fully coherent simulation of a fault-tolerant quantum error correcting circuit for a d = 3 Steane and surface code. We find that the output distributions are markedly different for the two error models, showing that no simple mapping between the two error models exists. Coherent errors create very broad and heavy-tailed failure distributions. This suggests that they are susceptible to outlier events and that mean statistics, such as pseudothreshold estimates, may not provide the key figure of merit. This provides further statistical insight into why coherent errors can be so harmful for quantum error correction. These output probability distributions may also provide a useful metric that can be utilized when optimizing quantum error correcting codes and decoding procedures for purely coherent errors.

  16. Seasonal Synchronization of a Simple Stochastic Dynamical Model Capturing El Niño Diversity

    NASA Astrophysics Data System (ADS)

    Thual, S.; Majda, A.; Chen, N.

    2017-12-01

    The El Niño-Southern Oscillation (ENSO) has significant impact on global climate and seasonal prediction. Recently, a simple ENSO model was developed that automatically captures the ENSO diversity and intermittency in nature, where state-dependent stochastic wind bursts and nonlinear advection of sea surface temperature (SST) are coupled to simple ocean-atmosphere processes that are otherwise deterministic, linear and stable. In the present article, it is further shown that the model can reproduce qualitatively the ENSO synchronization (or phase-locking) to the seasonal cycle in nature. This goal is achieved by incorporating a cloud radiative feedback that is derived naturally from the model's atmosphere dynamics with no ad-hoc assumptions and accounts in simple fashion for the marked seasonal variations of convective activity and cloud cover in the eastern Pacific. In particular, the weak convective response to SSTs in boreal fall favors the eastern Pacific warming that triggers El Niño events while the increased convective activity and cloud cover during the following spring contributes to the shutdown of those events by blocking incoming shortwave solar radiations. In addition to simulating the ENSO diversity with realistic non-Gaussian statistics in different Niño regions, both the eastern Pacific moderate and super El Niño, the central Pacific El Niño as well as La Niña show a realistic chronology with a tendency to peak in boreal winter as well as decreased predictability in spring consistent with the persistence barrier in nature. The incorporation of other possible seasonal feedbacks in the model is also documented for completeness.

  17. Equivalent circuit models for interpreting impedance perturbation spectroscopy data

    NASA Astrophysics Data System (ADS)

    Smith, R. Lowell

    2004-07-01

    As in-situ structural integrity monitoring disciplines mature, there is a growing need to process sensor/actuator data efficiently in real time. Although smaller, faster embedded processors will contribute to this, it is also important to develop straightforward, robust methods to reduce the overall computational burden for practical applications of interest. This paper addresses the use of equivalent circuit modeling techniques for inferring structure attributes monitored using impedance perturbation spectroscopy. In pioneering work about ten years ago significant progress was associated with the development of simple impedance models derived from the piezoelectric equations. Using mathematical modeling tools currently available from research in ultrasonics and impedance spectroscopy is expected to provide additional synergistic benefits. For purposes of structural health monitoring the objective is to use impedance spectroscopy data to infer the physical condition of structures to which small piezoelectric actuators are bonded. Features of interest include stiffness changes, mass loading, and damping or mechanical losses. Equivalent circuit models are typically simple enough to facilitate the development of practical analytical models of the actuator-structure interaction. This type of parametric structure model allows raw impedance/admittance data to be interpreted optimally using standard multiple, nonlinear regression analysis. One potential long-term outcome is the possibility of cataloging measured viscoelastic properties of the mechanical subsystems of interest as simple lists of attributes and their statistical uncertainties, whose evolution can be followed in time. Equivalent circuit models are well suited for addressing calibration and self-consistency issues such as temperature corrections, Poisson mode coupling, and distributed relaxation processes.

  18. Universal Capacitance Model for Real-Time Biomass in Cell Culture.

    PubMed

    Konakovsky, Viktor; Yagtu, Ali Civan; Clemens, Christoph; Müller, Markus Michael; Berger, Martina; Schlatter, Stefan; Herwig, Christoph

    2015-09-02

    Capacitance probes have the potential to revolutionize bioprocess control due to their safe and robust use and ability to detect even the smallest capacitors in the form of biological cells. Several techniques have evolved to model biomass statistically; however, there are problems with model transfer between cell lines and process conditions. For linear models, errors of transferred models in the declining phase of the culture are around +100% or worse, causing unnecessary delays with test runs during bioprocess development. The goal of this work was to develop one single universal model which can be adapted by considering a potentially mechanistic factor to estimate biomass in yet untested clones and scales. The novelty of this work is a methodology to select sensitive frequencies to build a statistical model which can be shared among fermentations with an error between 9% and 38% (mean error around 20%) for the whole process, including the declining phase. A simple linear factor was found to be responsible for the transferability of biomass models between cell lines, indicating a link to their phenotype or physiology.

  19. Evaluating statistical cloud schemes: What can we gain from ground-based remote sensing?

    NASA Astrophysics Data System (ADS)

    Grützun, V.; Quaas, J.; Morcrette, C. J.; Ament, F.

    2013-09-01

    Statistical cloud schemes with prognostic probability distribution functions have become more important in atmospheric modeling, especially since they are in principle scale adaptive and capture cloud physics in more detail. While in theory the schemes have a great potential, their accuracy is still questionable. High-resolution three-dimensional observational data of water vapor and cloud water, which could be used for testing them, are missing. We explore the potential of ground-based remote sensing such as lidar, microwave, and radar to evaluate prognostic distribution moments using the "perfect model approach." This means that we employ a high-resolution weather model as virtual reality and retrieve full three-dimensional atmospheric quantities and virtual ground-based observations. We then use statistics from the virtual observation to validate the modeled 3-D statistics. Since the data are entirely consistent, any discrepancy occurring is due to the method. Focusing on total water mixing ratio, we find that the mean ratio can be evaluated decently but that it strongly depends on the meteorological conditions as to whether the variance and skewness are reliable. Using some simple schematic description of different synoptic conditions, we show how statistics obtained from point or line measurements can be poor at representing the full three-dimensional distribution of water in the atmosphere. We argue that a careful analysis of measurement data and detailed knowledge of the meteorological situation is necessary to judge whether we can use the data for an evaluation of higher moments of the humidity distribution used by a statistical cloud scheme.

  20. Characterizing and Addressing the Need for Statistical Adjustment of Global Climate Model Data

    NASA Astrophysics Data System (ADS)

    White, K. D.; Baker, B.; Mueller, C.; Villarini, G.; Foley, P.; Friedman, D.

    2017-12-01

    As part of its mission to research and measure the effects of the changing climate, the U.S. Army Corps of Engineers (USACE) regularly uses the World Climate Research Programme's Coupled Model Intercomparison Project Phase 5 (CMIP5) multi-model dataset. However, these data are generated at a global level and are not fine-tuned for specific watersheds. This often causes CMIP5 output to vary from locally observed patterns in the climate. Several downscaling methods have been developed to increase the resolution of the CMIP5 data and decrease systemic differences to support decision-makers as they evaluate results at the watershed scale. Evaluating preliminary comparisons of observed and projected flow frequency curves over the US revealed a simple framework for water resources decision makers to plan and design water resources management measures under changing conditions using standard tools. Using this framework as a basis, USACE has begun to explore the use of statistical adjustment to alter global climate model data to better match the locally observed patterns while preserving the general structure and behavior of the model data. When paired with careful measurement and hypothesis testing, statistical adjustment can be particularly effective at navigating the compromise between the locally observed patterns and the global climate model structures for decision makers.

  1. Stability of procalcitonin at room temperature.

    PubMed

    Milcent, Karen; Poulalhon, Claire; Fellous, Christelle Vauloup; Petit, François; Bouyer, Jean; Gajdos, Vincent

    2014-01-01

    The aim was to assess procalcitonin (PCT) stability after two days of storage at room temperature. Samples were collected from febrile children aged 7 to 92 days and were rapidly frozen after sampling. PCT levels were measured twice after thawing: immediately (named y) and 48 hours later after storage at room temperature (named x). PCT values were described with medians and interquartile ranges or by categorizing them into classes with thresholds 0.25, 0.5, and 2 ng/mL. The relationship between x and y PCT levels was analyzed using fractional polynomials in order to predict the PCT value immediately after thawing (named y') from x. A significant decrease in PCT values was observed after 48 hours of storage at room temperature, either in median, 30% lowering (p < 0.001), or as categorical variable (p < 0.001). The relationship between x and y can be accurately modeled with a simple linear model: y = 1.37 x (R² = 0.99). The median of the predicted PCT values y' was quantitatively very close to the median of y and the distributions of y and y' across categories were very similar and not statistically different. PCT levels noticeably decrease after 48 hours of storage at room temperature. It is possible to predict accurately effective PCT values from the values after 48 hours of storage at room temperature with a simple statistical model.
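
    As a minimal sketch of how the reported linear model might be applied, the snippet below back-calculates the immediate post-thaw value from a 48-hour reading and assigns it to a class. The slope and thresholds come from the abstract; the function names are hypothetical, and treating each threshold as inclusive at its lower bound is an assumption, since the abstract does not say how boundary values were classified.

```python
SLOPE = 1.37                    # from the abstract: y = 1.37 x
THRESHOLDS = (0.25, 0.5, 2.0)   # ng/mL class boundaries from the abstract

def predict_initial_pct(x_after_48h):
    """Predict the PCT value immediately after thawing (y') from the
    value measured after 48 h at room temperature (x), both in ng/mL."""
    return SLOPE * x_after_48h

def pct_class(value):
    """Return a class index 0..3: the number of thresholds that the value
    meets or exceeds. Inclusive boundaries are an assumption."""
    return sum(value >= t for t in THRESHOLDS)

if __name__ == "__main__":
    x = 0.40
    y_pred = predict_initial_pct(x)
    print(x, pct_class(x), round(y_pred, 3), pct_class(y_pred))
```

    For example, a 48-hour reading of 0.40 ng/mL maps to a predicted 0.548 ng/mL, which moves the sample across the 0.5 ng/mL threshold.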

  2. Statistical inference for noisy nonlinear ecological dynamic systems.

    PubMed

    Wood, Simon N

    2010-08-26

    Chaotic ecological dynamic systems defy conventional statistical analysis. Systems with near-chaotic dynamics are little better. Such systems are almost invariably driven by endogenous dynamic processes plus demographic and environmental process noise, and are only observable with error. Their sensitivity to history means that minute changes in the driving noise realization, or the system parameters, will cause drastic changes in the system trajectory. This sensitivity is inherited and amplified by the joint probability density of the observable data and the process noise, rendering it useless as the basis for obtaining measures of statistical fit. Because the joint density is the basis for the fit measures used by all conventional statistical methods, this is a major theoretical shortcoming. The inability to make well-founded statistical inferences about biological dynamic models in the chaotic and near-chaotic regimes, other than on an ad hoc basis, leaves dynamic theory without the methods of quantitative validation that are essential tools in the rest of biological science. Here I show that this impasse can be resolved in a simple and general manner, using a method that requires only the ability to simulate the observed data on a system from the dynamic model about which inferences are required. The raw data series are reduced to phase-insensitive summary statistics, quantifying local dynamic structure and the distribution of observations. Simulation is used to obtain the mean and the covariance matrix of the statistics, given model parameters, allowing the construction of a 'synthetic likelihood' that assesses model fit. This likelihood can be explored using a straightforward Markov chain Monte Carlo sampler, but one further post-processing step returns pure likelihood-based inference. I apply the method to establish the dynamic nature of the fluctuations in Nicholson's classic blowfly experiments.
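
    The synthetic-likelihood construction described in this record can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the toy simulator (a noisy Ricker map observed through Poisson sampling), its parameter values, the three summary statistics, and all function names are chosen here for the example only.

```python
import numpy as np

def simulate_ricker(log_r, n_obs, rng, sigma=0.3, phi=10.0, burn_in=50):
    """Toy simulator (an assumption for this sketch): noisy Ricker map
    N[t+1] = r * N[t] * exp(-N[t] + e[t]), observed as Poisson(phi * N)."""
    n = 1.0
    obs = np.empty(n_obs)
    for t in range(burn_in + n_obs):
        n = np.exp(log_r) * n * np.exp(-n + sigma * rng.standard_normal())
        if t >= burn_in:
            obs[t - burn_in] = rng.poisson(phi * n)
    return obs

def summary_stats(y):
    """Phase-insensitive summaries: mean, standard deviation and
    lag-1 autocovariance of the series."""
    centred = y - y.mean()
    lag1 = np.mean(centred[1:] * centred[:-1])
    return np.array([y.mean(), y.std(), lag1])

def synthetic_loglik(log_r, y_obs, n_sim=200, seed=0):
    """Gaussian log-density (up to an additive constant) of the observed
    statistics, with mean and covariance estimated from simulations."""
    rng = np.random.default_rng(seed)
    sims = np.array([
        summary_stats(simulate_ricker(log_r, len(y_obs), rng))
        for _ in range(n_sim)
    ])
    mu = sims.mean(axis=0)
    cov = np.cov(sims, rowvar=False)
    resid = summary_stats(y_obs) - mu
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * resid @ np.linalg.solve(cov, resid) - 0.5 * logdet

if __name__ == "__main__":
    y_obs = simulate_ricker(3.8, 100, np.random.default_rng(1))
    for log_r in (1.0, 3.8):
        print(log_r, synthetic_loglik(log_r, y_obs))
```

    In practice a function like `synthetic_loglik` would be evaluated inside a Markov chain Monte Carlo sampler over the model parameters, as the abstract describes; the sampler is omitted here to keep the sketch short.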

  3. Describing temporal variation in reticuloruminal pH using continuous monitoring data.

    PubMed

    Denwood, M J; Kleen, J L; Jensen, D B; Jonsson, N N

    2018-01-01

    Reticuloruminal pH has been linked to subclinical disease in dairy cattle, leading to considerable interest in identifying pH observations below a given threshold. The relatively recent availability of continuously monitored data from pH boluses gives new opportunities for characterizing the normal patterns of pH over time and distinguishing these from abnormal patterns using more sensitive and specific methods than simple thresholds. We fitted a series of statistical models to continuously monitored data from 93 animals on 13 farms to characterize normal variation within and between animals. We used a subset of the data to relate deviations from the normal pattern to the productivity of 24 dairy cows from a single herd. Our findings show substantial variation in pH characteristics between animals, although animals within the same farm tended to show more consistent patterns. There was strong evidence for a predictable diurnal variation in all animals, and up to 70% of the observed variation in pH could be explained using a simple statistical model. For the 24 animals with available production information, there was also a strong association between productivity (as measured by both milk yield and dry matter intake) and deviations from the expected diurnal pattern of pH 2 d before the productivity observation. In contrast, there was no association between productivity and the occurrence of observations below a threshold pH. We conclude that statistical models can be used to account for a substantial proportion of the observed variability in pH and that future work with continuously monitored pH data should focus on deviations from a predictable pattern rather than the frequency of observations below an arbitrary pH threshold. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  4. A user-friendly risk-score for predicting in-hospital cardiac arrest among patients admitted with suspected non ST-elevation acute coronary syndrome - The SAFER-score.

    PubMed

    Faxén, Jonas; Hall, Marlous; Gale, Chris P; Sundström, Johan; Lindahl, Bertil; Jernberg, Tomas; Szummer, Karolina

    2017-12-01

    To develop a simple risk-score model for predicting in-hospital cardiac arrest (CA) among patients hospitalized with suspected non-ST elevation acute coronary syndrome (NSTE-ACS). Using the Swedish Web-system for Enhancement and Development of Evidence-based care in Heart disease Evaluated According to Recommended Therapies (SWEDEHEART), we identified patients (n=242 303) admitted with suspected NSTE-ACS between 2008 and 2014. Logistic regression was used to assess the association between 26 candidate variables and in-hospital CA. A risk-score model was developed and validated using a temporal cohort (n=126 073) comprising patients from SWEDEHEART between 2005 and 2007 and an external cohort (n=276 109) comprising patients from the Myocardial Ischaemia National Audit Project (MINAP) between 2008 and 2013. The incidence of in-hospital CA for NSTE-ACS and non-ACS was lower in the SWEDEHEART-derivation cohort than in MINAP (1.3% and 0.5% vs. 2.3% and 2.3%). A seven-point, five-variable risk score (age ≥60 years (1 point), ST-T abnormalities (2 points), Killip Class >1 (1 point), heart rate <50 or ≥100 bpm (1 point), and systolic blood pressure <100 mmHg (2 points)) was developed. Model discrimination was good in the derivation cohort (c-statistic 0.72) and temporal validation cohort (c-statistic 0.74), and calibration was reasonable with a tendency towards overestimation of risk with a higher sum of score points. External validation showed moderate discrimination (c-statistic 0.65) and calibration showed a general underestimation of predicted risk. A simple points score containing five variables readily available on admission predicts in-hospital CA for patients with suspected NSTE-ACS. Copyright © 2017 Elsevier B.V. All rights reserved.
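
    The point assignment listed in the abstract can be written out as a short function. This is only a sketch of the arithmetic: the function name and argument names are hypothetical, the abstract gives no mapping from points to predicted risk, and nothing here is intended for clinical use.

```python
def safer_points(age_years, st_t_abnormal, killip_class, heart_rate_bpm, systolic_bp_mmhg):
    """Sum the points stated in the abstract (range 0 to 7).

    Argument names are illustrative; thresholds are copied from the abstract.
    """
    points = 0
    if age_years >= 60:
        points += 1          # age >= 60 years
    if st_t_abnormal:
        points += 2          # ST-T abnormalities
    if killip_class > 1:
        points += 1          # Killip class > 1
    if heart_rate_bpm < 50 or heart_rate_bpm >= 100:
        points += 1          # heart rate < 50 or >= 100 bpm
    if systolic_bp_mmhg < 100:
        points += 2          # systolic blood pressure < 100 mmHg
    return points
```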

  5. Trends in Mortality After Primary Cytoreductive Surgery for Ovarian Cancer: A Systematic Review and Metaregression of Randomized Clinical Trials and Observational Studies.

    PubMed

    Di Donato, Violante; Kontopantelis, Evangelos; Aletti, Giovanni; Casorelli, Assunta; Piacenti, Ilaria; Bogani, Giorgio; Lecce, Francesca; Benedetti Panici, Pierluigi

    2017-06-01

    Primary cytoreductive surgery (PDS) followed by platinum-based chemotherapy is the cornerstone of treatment and the absence of residual tumor after PDS is universally considered the most important prognostic factor. The aim of the present analysis was to evaluate trend and predictors of 30-day mortality in patients undergoing primary cytoreduction for ovarian cancer. Literature was searched for records reporting 30-day mortality after PDS. All cohorts were rated for quality. Simple and multiple Poisson regression models were used to quantify the association between 30-day mortality and the following: overall or severe complications, proportion of patients with stage IV disease, median age, year of publication, and weighted surgical complexity index. Using the multiple regression model, we calculated the risk of perioperative mortality at different levels for statistically significant covariates of interest. Simple regression identified median age and proportion of patients with stage IV disease as statistically significant predictors of 30-day mortality. When included in the multiple Poisson regression model, both remained statistically significant, with an incidence rate ratio of 1.087 for median age and 1.017 for stage IV disease. Disease stage was a strong predictor, with the risk estimated to increase from 2.8% (95% confidence interval 2.02-3.66) for stage III to 16.1% (95% confidence interval 6.18-25.93) for stage IV, for a cohort with a median age of 65 years. Metaregression demonstrated that increased age and advanced clinical stage were independently associated with an increased risk of mortality, and the combined effects of both factors greatly increased the risk.

  6. Statistical iterative material image reconstruction for spectral CT using a semi-empirical forward model

    NASA Astrophysics Data System (ADS)

    Mechlem, Korbinian; Ehn, Sebastian; Sellerer, Thorsten; Pfeiffer, Franz; Noël, Peter B.

    2017-03-01

    In spectral computed tomography (spectral CT), the additional information about the energy dependence of attenuation coefficients can be exploited to generate material selective images. These images have found applications in various areas such as artifact reduction, quantitative imaging or clinical diagnosis. However, significant noise amplification on material decomposed images remains a fundamental problem of spectral CT. Most spectral CT algorithms separate the process of material decomposition and image reconstruction. Separating these steps is suboptimal because the full statistical information contained in the spectral tomographic measurements cannot be exploited. Statistical iterative reconstruction (SIR) techniques provide an alternative, mathematically elegant approach to obtaining material selective images with improved tradeoffs between noise and resolution. Furthermore, image reconstruction and material decomposition can be performed jointly. This is accomplished by a forward model which directly connects the (expected) spectral projection measurements and the material selective images. To obtain this forward model, detailed knowledge of the different photon energy spectra and the detector response was assumed in previous work. However, accurately determining the spectrum is often difficult in practice. In this work, a new algorithm for statistical iterative material decomposition is presented. It uses a semi-empirical forward model which relies on simple calibration measurements. Furthermore, an efficient optimization algorithm based on separable surrogate functions is employed. This partially negates one of the major shortcomings of SIR, namely high computational cost and long reconstruction times. Numerical simulations and real experiments show strongly improved image quality and reduced statistical bias compared to projection-based material decomposition.

  7. A stitch in time saves nine: suture technique does not affect intestinal growth in a young, growing animal model.

    PubMed

    Gurien, Lori A; Wyrick, Deidre L; Smith, Samuel D; Maxson, R Todd

    2016-05-01

    Although this issue remains unexamined, pediatric surgeons commonly use simple interrupted suture for bowel anastomosis, as it is thought to improve intestinal growth postoperatively compared to continuous running suture. However, effects on intestinal growth are unclear. We compared intestinal growth using different anastomotic techniques during the postoperative period in young rats. Young, growing rats underwent small bowel transection and anastomosis using either simple interrupted or continuous running technique. At 7-weeks postoperatively after a four-fold growth, the anastomotic site was resected. Diameters and burst pressures were measured. Thirteen rats underwent anastomosis with simple interrupted technique and sixteen with continuous running method. No differences were found in body weight at first (102.46 vs 109.75g) or second operations (413.85 vs 430.63g). Neither the diameters (0.69 vs 0.79cm) nor burst pressures were statistically different, although the calculated circumference was smaller in the simple interrupted group (2.18 vs 2.59cm; p=0.03). No ruptures occurred at the anastomotic line. This pilot study is the first to compare continuous running to simple interrupted intestinal anastomosis in a pediatric model and showed no difference in growth. Adopting continuous running techniques for bowel anastomosis in young children may lead to faster operative time without affecting intestinal growth. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. Multivariate model of female black bear habitat use for a Geographic Information System

    USGS Publications Warehouse

    Clark, Joseph D.; Dunn, James E.; Smith, Kimberly G.

    1993-01-01

    Simple univariate statistical techniques may not adequately assess the multidimensional nature of habitats used by wildlife. Thus, we developed a multivariate method to model habitat-use potential using a set of female black bear (Ursus americanus) radio locations and habitat data consisting of forest cover type, elevation, slope, aspect, distance to roads, distance to streams, and forest cover type diversity score in the Ozark Mountains of Arkansas. The model is based on the Mahalanobis distance statistic coupled with Geographic Information System (GIS) technology. That statistic is a measure of dissimilarity and represents a standardized squared distance between a set of sample variates and an ideal based on the mean of variates associated with animal observations. Calculations were made with the GIS to produce a map containing Mahalanobis distance values within each cell on a 60- × 60-m grid. The model identified areas of high habitat use potential that could not otherwise be identified by independent perusal of any single map layer. This technique avoids many pitfalls that commonly affect typical multivariate analyses of habitat use and is a useful tool for habitat manipulation or mitigation to favor terrestrial vertebrates that use habitats on a landscape scale.
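
    The Mahalanobis distance calculation described above can be sketched as below, assuming every habitat variable has already been coded numerically for each grid cell. The function and argument names are hypothetical, and the handling of categorical layers such as forest cover type is not addressed here.

```python
import numpy as np


def mahalanobis_sq(cells, use_points):
    """Squared Mahalanobis distance of each grid cell from the mean habitat vector.

    cells:      (n_cells, n_vars) habitat values for every grid cell
    use_points: (n_obs, n_vars) habitat values at animal radio locations
    """
    cells = np.asarray(cells, dtype=float)
    use_points = np.asarray(use_points, dtype=float)
    mu = use_points.mean(axis=0)                # the "ideal" habitat vector
    cov = np.cov(use_points, rowvar=False)      # sample covariance of the variates
    diff = cells - mu
    solved = np.linalg.solve(cov, diff.T).T     # Sigma^{-1} (x - mu) for each cell
    return np.einsum("ij,ij->i", diff, solved)  # row-wise quadratic form
```

    Small values mark cells whose combination of habitat variables resembles the conditions at observed locations; large values mark dissimilar cells.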

  9. Pasta Nucleosynthesis: Molecular dynamics simulations of nuclear statistical equilibrium

    NASA Astrophysics Data System (ADS)

    Caplan, Matthew; Horowitz, Charles; da Silva Schneider, Andre; Berry, Donald

    2014-09-01

    We simulate the decompression of cold dense nuclear matter, near the nuclear saturation density, in order to study the role of nuclear pasta in r-process nucleosynthesis in neutron star mergers. Our simulations are performed using a classical molecular dynamics model with 51 200 and 409 600 nucleons, and are run on GPUs. We expand our simulation region to decompress systems from initial densities of 0.080 fm⁻³ down to 0.00125 fm⁻³. We study proton fractions of Y_P = 0.05, 0.10, 0.20, 0.30, and 0.40 at T = 0.5, 0.75, and 1 MeV. We calculate the composition of the resulting systems using a cluster algorithm. This composition is in good agreement with nuclear statistical equilibrium models for temperatures of 0.75 and 1 MeV. However, for proton fractions greater than Y_P = 0.2 at a temperature of T = 0.5 MeV, the MD simulations produce non-equilibrium results with large rod-like nuclei. Our MD model is valid at higher densities than simple nuclear statistical equilibrium models and may help determine the initial temperatures and proton fractions of matter ejected in mergers.

  10. Linking Mechanics and Statistics in Epidermal Tissues

    NASA Astrophysics Data System (ADS)

    Kim, Sangwoo; Hilgenfeldt, Sascha

    2015-03-01

    Disordered cellular structures, such as foams, polycrystals, or living tissues, can be characterized by quantitative measurements of domain size and topology. In recent work, we showed that correlations between size and topology in 2D systems are sensitive to the shape (eccentricity) of the individual domains: From a local model of neighbor relations, we derived an analytical justification for the famous empirical Lewis law, confirming the theory with experimental data from cucumber epidermal tissue. Here, we go beyond this purely geometrical model and identify mechanical properties of the tissue as the root cause for the domain eccentricity and thus the statistics of tissue structure. The simple model approach is based on the minimization of an interfacial energy functional. Simulations with Surface Evolver show that the domain statistics depend on a single mechanical parameter, while parameter fluctuations from cell to cell play an important role in simultaneously explaining the shape distribution of cells. The simulations are in excellent agreement with experiments and analytical theory, and establish a general link between the mechanical properties of a tissue and its structure. The model is relevant to diagnostic applications in a variety of animal and plant tissues.

  11. The statistical overlap theory of chromatography using power law (fractal) statistics.

    PubMed

    Schure, Mark R; Davis, Joe M

    2011-12-30

    The chromatographic dimensionality was recently proposed as a measure of retention time spacing based on a power law (fractal) distribution. Using this model, a statistical overlap theory (SOT) for chromatographic peaks is developed that estimates the number of peak maxima as a function of the chromatographic dimension, saturation and scale. Power law models exhibit a threshold region whereby below a critical saturation value no loss of peak maxima due to peak fusion occurs as saturation increases. At moderate saturation, behavior is similar to the random (Poisson) peak model. At still higher saturation, the power law model shows loss of peaks nearly independent of the scale and dimension of the model. The physicochemical meaning of the power law scale parameter is discussed and shown to be equal to the Boltzmann-weighted free energy of transfer over the scale limits. The scale is discussed. Small scale range (small β) is shown to generate more uniform chromatograms. Large scale range chromatograms (large β) are shown to give occasional large excursions of retention times; this is a property of power laws where "wild" behavior is noted to occasionally occur. Both cases are shown to be useful depending on the chromatographic saturation. A scale-invariant model of the SOT shows very simple relationships between the fraction of peak maxima and the saturation, peak width and number of theoretical plates. These equations provide much insight into separations which follow power law statistics. Copyright © 2011 Elsevier B.V. All rights reserved.

  12. Role of spatial inhomogenity in GPCR dimerisation predicted by receptor association-diffusion models

    NASA Astrophysics Data System (ADS)

    Deshpande, Sneha A.; Pawar, Aiswarya B.; Dighe, Anish; Athale, Chaitanya A.; Sengupta, Durba

    2017-06-01

    G protein-coupled receptor (GPCR) association is an emerging paradigm with far reaching implications in the regulation of signalling pathways and therapeutic interventions. Recent super resolution microscopy studies have revealed that receptor dimer steady state exhibits sub-second dynamics. In particular the GPCRs, muscarinic acetylcholine receptor M1 (M1MR) and formyl peptide receptor (FPR), have been demonstrated to exhibit a fast association/dissociation kinetics, independent of ligand binding. In this work, we have developed a spatial kinetic Monte Carlo model to investigate receptor homo-dimerisation at a single receptor resolution. Experimentally measured association/dissociation kinetic parameters and diffusion coefficients were used as inputs to the model. To test the effect of membrane spatial heterogeneity on the simulated steady state, simulations were compared to experimental statistics of dimerisation. In the simplest case the receptors are assumed to be diffusing in a spatially homogeneous environment, while spatial heterogeneity is modelled to result from crowding, membrane micro-domains and cytoskeletal compartmentalisation or ‘corrals’. We show that a simple association-diffusion model is sufficient to reproduce M1MR association statistics, but fails to reproduce FPR statistics despite comparable kinetic constants. A parameter sensitivity analysis is required to reproduce the association statistics of FPR. The model reveals the complex interplay between cytoskeletal components and their influence on receptor association kinetics within the features of the membrane landscape. These results constitute an important step towards understanding the factors modulating GPCR organisation.

  13. Beyond δ: Tailoring marked statistics to reveal modified gravity

    NASA Astrophysics Data System (ADS)

    Valogiannis, Georgios; Bean, Rachel

    2018-01-01

    Models that seek to explain cosmic acceleration through modifications to general relativity (GR) evade stringent Solar System constraints through a restoring, screening mechanism. Down-weighting the high-density, screened regions in favor of the low-density, unscreened ones offers the potential to enhance the amount of information carried in such modified gravity models. In this work, we assess the performance of a new "marked" transformation and perform a systematic comparison with the clipping and logarithmic transformations, in the context of ΛCDM and the symmetron and f(R) modified gravity models. Performance is measured in terms of the fractional boost in the Fisher information and the signal-to-noise ratio (SNR) for these models relative to the statistics derived from the standard density distribution. We find that all three statistics provide improved Fisher boosts over the basic density statistics. The model parameters for the marked and clipped transformation that best enhance signals and the Fisher boosts are determined. We also show that the mark is useful both as a Fourier and real-space transformation; a marked correlation function also enhances the SNR relative to the standard correlation function, and can on mildly nonlinear scales show a significant difference between the ΛCDM and the modified gravity models. Our results demonstrate how a series of simple analytical transformations could dramatically increase the predicted information extracted on deviations from GR, from large-scale surveys, and give the prospect for a much more feasible potential detection.

  14. SOCR Analyses – an Instructional Java Web-based Statistical Analysis Toolkit

    PubMed Central

    Chu, Annie; Cui, Jenny; Dinov, Ivo D.

    2011-01-01

    The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. 
    The reader is strongly encouraged to check the SOCR site for the most up-to-date information and newly added models. PMID:21546994

  15. Statistical approaches to account for missing values in accelerometer data: Applications to modeling physical activity.

    PubMed

    Yue Xu, Selene; Nelson, Sandahl; Kerr, Jacqueline; Godbole, Suneeta; Patterson, Ruth; Merchant, Gina; Abramson, Ian; Staudenmayer, John; Natarajan, Loki

    2018-04-01

    Physical inactivity is a recognized risk factor for many chronic diseases. Accelerometers are increasingly used as an objective means to measure daily physical activity. One challenge in using these devices is missing data due to device nonwear. We used a well-characterized cohort of 333 overweight postmenopausal breast cancer survivors to examine missing data patterns of accelerometer outputs over the day. Based on these observed missingness patterns, we created pseudo-simulated datasets with realistic missing data patterns. We developed statistical methods to design imputation and variance weighting algorithms to account for missing data effects when fitting regression models. Bias and precision of each method were evaluated and compared. Our results indicated that not accounting for missing data in the analysis yielded unstable estimates in the regression analysis. Incorporating variance weights and/or subject-level imputation improved precision by >50%, compared to ignoring missing data. We recommend that these simple, easy-to-implement statistical tools be used to improve analysis of accelerometer data.

  16. A study of two statistical methods as applied to shuttle solid rocket booster expenditures

    NASA Technical Reports Server (NTRS)

    Perlmutter, M.; Huang, Y.; Graves, M.

    1974-01-01

    The state probability technique and the Monte Carlo technique are applied to finding shuttle solid rocket booster expenditure statistics. For a given attrition rate per launch, the probable number of boosters needed for a given mission of 440 launches is calculated. Several cases are considered, including the elimination of the booster after a maximum of 20 consecutive launches. Also considered is the case where the booster is composed of replaceable components with independent attrition rates. A simple cost analysis is carried out to indicate the number of boosters to build initially, depending on booster costs. Two statistical methods were applied in the analysis: (1) state probability method which consists of defining an appropriate state space for the outcome of the random trials, and (2) model simulation method or the Monte Carlo technique. It was found that the model simulation method was easier to formulate while the state probability method required less computing time and was more accurate.
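
    The model simulation (Monte Carlo) approach described above can be sketched as follows. This is an illustrative reconstruction, not the report's program: it assumes one booster per launch, an independent loss probability on each launch, and retirement after a fixed number of uses. The function names and the attrition value in the usage note are hypothetical.

```python
import numpy as np


def boosters_needed(n_launches, attrition, max_uses, rng):
    """Simulate one mission and count how many boosters had to be built."""
    built = 0
    uses = 0
    have_booster = False
    for _ in range(n_launches):
        if not have_booster:
            built += 1           # build a replacement only when it is needed
            uses = 0
            have_booster = True
        uses += 1
        lost = rng.random() < attrition
        if lost or uses >= max_uses:
            have_booster = False  # lost on this launch, or retired at the use limit
    return built


def mean_boosters(n_launches, attrition, max_uses, n_trials=2000, seed=0):
    """Average the booster count over repeated simulated missions."""
    rng = np.random.default_rng(seed)
    counts = [boosters_needed(n_launches, attrition, max_uses, rng) for _ in range(n_trials)]
    return float(np.mean(counts))
```

    For example, `mean_boosters(440, 0.02, 20)` would estimate the expected number of boosters for a 440-launch mission under an assumed 2% attrition rate per launch.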

  17. Continuous distribution of emission states from single CdSe/ZnS quantum dots.

    PubMed

    Zhang, Kai; Chang, Hauyee; Fu, Aihua; Alivisatos, A Paul; Yang, Haw

    2006-04-01

    The photoluminescence dynamics of colloidal CdSe/ZnS/streptavidin quantum dots were studied using time-resolved single-molecule spectroscopy. Statistical tests of the photon-counting data suggested that the simple "on/off" discrete state model is inconsistent with experimental results. Instead, a continuous emission state distribution model was found to be more appropriate. Autocorrelation analysis of lifetime and intensity fluctuations showed a nonlinear correlation between them. These results were consistent with the model that charged quantum dots were also emissive, and that time-dependent charge migration gave rise to the observed photoluminescence dynamics.

  18. Collisional-radiative switching - A powerful technique for converging non-LTE calculations

    NASA Technical Reports Server (NTRS)

    Hummer, D. G.; Voels, S. A.

    1988-01-01

    A very simple technique has been developed to converge statistical equilibrium and model atmospheric calculations in extreme non-LTE conditions when the usual iterative methods fail to converge from an LTE starting model. The proposed technique is based on a smooth transition from a collision-dominated LTE situation to the desired non-LTE conditions in which radiation dominates, at least in the most important transitions. The proposed approach was used to successfully compute stellar models with He abundances of 0.20, 0.30, and 0.50; Teff = 30,000 K, and log g = 2.9.

  19. On some stochastic formulations and related statistical moments of pharmacokinetic models.

    PubMed

    Matis, J H; Wehrly, T E; Metzler, C M

    1983-02-01

    This paper presents the deterministic and stochastic model for a linear compartment system with constant coefficients, and it develops expressions for the mean residence times (MRT) and the variances of the residence times (VRT) for the stochastic model. The expressions are relatively simple computationally, involving primarily matrix inversion, and they are elegant mathematically, in avoiding eigenvalue analysis and the complex domain. The MRT and VRT provide a set of new meaningful response measures for pharmacokinetic analysis and they give added insight into the system kinetics. The new analysis is illustrated with an example involving the cholesterol turnover in rats.
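
    The matrix-inversion route to residence-time moments can be illustrated with the standard result for a continuous-time Markov chain with absorbing exit. The sketch below is not taken from the paper and its notation may differ: it assumes a rate matrix `Q` whose entry (i, j) is the transfer rate from compartment i to compartment j, with each diagonal entry equal to minus the total outflow rate (including elimination) from that compartment.

```python
import numpy as np


def residence_time_moments(Q):
    """Mean (MRT) and variance (VRT) of time spent in compartment j, starting in i.

    Q[i, j] (i != j) is the transfer rate from i to j; Q[i, i] is minus the
    total outflow rate from i. This row convention is an assumption.
    """
    Q = np.asarray(Q, dtype=float)
    mrt = -np.linalg.inv(Q)                           # mean residence times
    second_moment = 2.0 * mrt * np.diag(mrt)[None, :]  # E[T_ij^2] = 2 * MRT_ij * MRT_jj
    vrt = second_moment - mrt ** 2
    return mrt, vrt
```

    For a single compartment with elimination rate k, this reduces to MRT = 1/k and VRT = 1/k², the mean and variance of an exponential residence time.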

  20. What You Learn is What You See: Using Eye Movements to Study Infant Cross-Situational Word Learning

    PubMed Central

    Smith, Linda

    2016-01-01

    Recent studies show that both adults and young children possess powerful statistical learning capabilities to solve the word-to-world mapping problem. However, the underlying mechanisms that make statistical learning possible and powerful are not yet known. With the goal of providing new insights into this issue, the research reported in this paper used an eye tracker to record the moment-by-moment eye movement data of 14-month-old babies in statistical learning tasks. Various measures are applied to such fine-grained temporal data, such as looking duration and shift rate (the number of shifts in gaze from one visual object to the other) trial by trial, showing different eye movement patterns between strong and weak statistical learners. Moreover, an information-theoretic measure is developed and applied to gaze data to quantify the degree of learning uncertainty trial by trial. Next, a simple associative statistical learning model is applied to eye movement data and these simulation results are compared with empirical results from young children, showing strong correlations between these two. This suggests that an associative learning mechanism with selective attention can provide a cognitively plausible model of cross-situational statistical learning. The work represents the first steps to use eye movement data to infer underlying real-time processes in statistical word learning. PMID:22213894

  1. Expected Monotonicity – A Desirable Property for Evidence Measures?

    PubMed Central

    Hodge, Susan E.; Vieland, Veronica J.

    2010-01-01

    We consider here the principle of ‘evidential consistency’ – that as one gathers more data, any well-behaved evidence measure should, in some sense, approach the true answer. Evidential consistency is essential for the genome-scan design (GWAS or linkage), where one selects the most promising locus(i) for follow-up, expecting that new data will increase evidence for the correct hypothesis. Earlier work [Vieland, Hum Hered 2006;61:144–156] showed that many popular statistics do not satisfy this principle; Vieland concluded that the problem stems from fundamental difficulties in how we measure evidence and argued for determining criteria to evaluate evidence measures. Here, we investigate in detail one proposed consistency criterion – expected monotonicity (ExpM) – for a simple statistical model (binomial) and four likelihood ratio (LR)-based evidence measures. We show that, with one limited exception, none of these measures displays ExpM; what they do display is sometimes counterintuitive. We conclude that ExpM is not a reasonable requirement for evidence measures; moreover, no requirement based on expected values seems feasible. We demonstrate certain desirable properties of the simple LR and demonstrate a connection between the simple and integrated LRs. We also consider an alternative version of consistency, which is satisfied by certain forms of the integrated LR and posterior probability of linkage. PMID:20664208

  2. The distribution of density in supersonic turbulence

    NASA Astrophysics Data System (ADS)

    Squire, Jonathan; Hopkins, Philip F.

    2017-11-01

    We propose a model for the statistics of the mass density in supersonic turbulence, which plays a crucial role in star formation and the physics of the interstellar medium (ISM). The model is derived by considering the density to be arranged as a collection of strong shocks of width ~ M^{-2}, where M is the turbulent Mach number. With two physically motivated parameters, the model predicts all density statistics for M > 1 turbulence: the density probability distribution and its intermittency (deviation from lognormality), the density variance-Mach number relation, power spectra and structure functions. For the proposed model parameters, reasonable agreement is seen between model predictions and numerical simulations, albeit within the large uncertainties associated with current simulation results. More generally, the model could provide a useful framework for more detailed analysis of future simulations and observational data. Due to the simple physical motivations for the model in terms of shocks, it is straightforward to generalize to more complex physical processes, which will be helpful in future more detailed applications to the ISM. We see good qualitative agreement between such extensions and recent simulations of non-isothermal turbulence.

  3. Effective model approach to the dense state of QCD matter

    NASA Astrophysics Data System (ADS)

    Fukushima, Kenji

    2011-12-01

    The first-principle approach to the dense state of QCD matter, i.e. the lattice-QCD simulation at finite baryon density, is not under theoretical control for the moment. The effective model study based on QCD symmetries is a practical alternative. However, the model parameters that are fixed by hadronic properties in the vacuum may have unknown dependence on the baryon chemical potential. We propose a new prescription to constrain the effective model parameters by the matching condition with the thermal Statistical Model. In the transitional region where thermal quantities blow up in the Statistical Model, deconfined quarks and gluons should smoothly take over the relevant degrees of freedom from hadrons and resonances. We use the Polyakov-loop coupled Nambu-Jona-Lasinio (PNJL) model as an effective description in the quark side and show how the matching condition is satisfied by a simple ansatz on the Polyakov loop potential. Our results favor a phase diagram with the chiral phase transition located at slightly higher temperature than deconfinement which stays close to the chemical freeze-out points.

  4. A Statistical Physicist's Approach to Biological Motion: From the Kinesin Walk to Muscle Contraction

    NASA Astrophysics Data System (ADS)

    Vicsek, Tamas

    1997-03-01

    It is demonstrated that a wide range of experimental results on biological motion can be successfully interpreted in terms of statistical physics motivated models taking into account the relevant microscopic details of motor proteins and allowing analytic solutions. Two important examples are considered, i) the motion of a single kinesin molecule along microtubules inside individual cells and ii) muscle contraction which is a macroscopic phenomenon due to the collective action of a large number of myosin heads along actin filaments. i) Recently individual two-headed kinesin molecules have been studied in in vitro motility assays revealing a number of their peculiar transport properties. Here we propose a simple and robust model for the kinesin stepping process with elastically coupled Brownian heads showing all of these properties. The analytic treatment of our model results in a very good fit to the experimental data and practically has no free parameters. ii) Myosin is an ATPase enzyme that converts the chemical energy stored in ATP molecules into mechanical work. During muscle contraction, the myosin cross-bridges attach to the actin filaments and exert force on them yielding a relative sliding of the actin and myosin filaments. In this paper we present a simple mechanochemical model for the cross-bridge interaction involving the relevant kinetic data and providing simple analytic solutions for the mechanical properties of muscle contraction, such as the force-velocity relationship or the relative number of the attached cross-bridges. So far the only analytic formula which could be fitted to the measured force-velocity curves has been the well known Hill equation containing parameters lacking clear microscopic origin. 
    The main advantages of our new approach are that it explicitly connects the mechanical data with the kinetic data and the concentration of the ATP and ATPase products and as such it leads to new analytic solutions which agree extremely well with a wide range of experimental curves, while the parameters of the corresponding expressions have well defined microscopic meaning.
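    For reference, the Hill equation mentioned in this abstract is usually written (F + a)(v + b) = (F0 + a) b, where F is the load, v the shortening velocity, F0 the isometric force, and a and b are fitted constants. The sketch below solves it for v. It shows the classical phenomenological relation only, not the authors' mechanochemical model; the function name and the numerical values are illustrative assumptions.

```python
def hill_velocity(force, f0, a, b):
    """Shortening velocity from the Hill relation (F + a)(v + b) = (F0 + a) b.

    Solving for v gives v = b * (F0 - F) / (F + a).
    """
    if force + a == 0:
        raise ValueError("force + a must be non-zero")
    return b * (f0 - force) / (force + a)


if __name__ == "__main__":
    # Hypothetical, dimensionless parameters chosen only for illustration.
    f0, a, b = 1.0, 0.25, 0.5
    for load in (0.0, 0.25, 0.5, 1.0):
        print(load, hill_velocity(load, f0, a, b))
```

    At zero load the relation gives the maximum velocity b * F0 / a, and at F = F0 the velocity is zero.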

  5. [Is there life beyond SPSS? Discover R].

    PubMed

    Elosua Oliden, Paula

    2009-11-01

    R is a GNU statistical and programming environment with very high graphical capabilities. It is very powerful for research purposes, but it is also an exceptional tool for teaching. R is composed of more than 1400 packages that allow using it for simple statistics and applying the most complex and most recent formal models. Graphical interfaces such as the Rcommander package permit working in user-friendly environments similar to the graphical environment used by SPSS. This last characteristic allows non-statisticians to overcome the obstacle of accessibility, and it makes R the best tool for teaching. Is there anything better? Open, free, affordable, accessible and always on the cutting edge.

  6. Statistical mechanics of broadcast channels using low-density parity-check codes.

    PubMed

    Nakamura, Kazutaka; Kabashima, Yoshiyuki; Morelos-Zaragoza, Robert; Saad, David

    2003-03-01

    We investigate the use of Gallager's low-density parity-check (LDPC) codes in a degraded broadcast channel, one of the fundamental models in network information theory. Combining linear codes is a standard technique in practical network communication schemes and is known to provide better performance than simple time sharing methods when algebraic codes are used. The statistical physics based analysis shows that the practical performance of the suggested method, achieved by employing the belief propagation algorithm, is superior to that of LDPC based time sharing codes while the best performance, when received transmissions are optimally decoded, is bounded by the time sharing limit.

  7. Common inputs in subthreshold membrane potential: The role of quiescent states in neuronal activity

    NASA Astrophysics Data System (ADS)

    Montangie, Lisandro; Montani, Fernando

    2018-06-01

    Experiments in certain regions of the cerebral cortex suggest that the spiking activity of neuronal populations is regulated by common non-Gaussian inputs across neurons. We model these deviations from random-walk processes with q-Gaussian distributions into simple threshold neurons, and investigate the scaling properties in large neural populations. We show that deviations from the Gaussian statistics provide a natural framework to regulate population statistics such as sparsity, entropy, and specific heat. This type of description allows us to provide an adequate strategy to explain the information encoding in the case of low neuronal activity and its possible implications on information transmission.

  8. On the statistical distribution in a deformed solid

    NASA Astrophysics Data System (ADS)

    Gorobei, N. N.; Luk'yanenko, A. S.

    2017-09-01

    A modification of the Gibbs distribution in a thermally insulated mechanically deformed solid, where its linear dimensions (shape parameters) are excluded from statistical averaging and included among the macroscopic parameters of state alongside the temperature, is proposed. Formally, this modification is reduced to corresponding additional conditions when calculating the statistical sum. The shape parameters and the temperature themselves are found from the conditions of mechanical and thermal equilibria of a body, and their change is determined using the first law of thermodynamics. Known thermodynamic phenomena are analyzed for the simple model of a solid, i.e., an ensemble of anharmonic oscillators, within the proposed formalism to first order in the anharmonicity constant. The distribution modification is considered separately for the classical and quantum temperature regions.

  9. Mixed-order phase transition in a minimal, diffusion-based spin model.

    PubMed

    Fronczak, Agata; Fronczak, Piotr

    2016-07-01

    In this paper we exactly solve, within the grand canonical ensemble, a minimal spin model with the hybrid phase transition. We call the model diffusion based because its Hamiltonian can be recovered from a simple dynamic procedure, which can be seen as an equilibrium statistical mechanics representation of a biased random walk. We outline the derivation of the phase diagram of the model, in which the triple point has the hallmarks of the hybrid transition: discontinuity in the average magnetization and algebraically diverging susceptibilities. At this point, two second-order transition curves meet in equilibrium with the first-order curve, resulting in a prototypical mixed-order behavior.

  10. QSPR using MOLGEN-QSPR: the challenge of fluoroalkane boiling points.

    PubMed

    Rücker, Christoph; Meringer, Markus; Kerber, Adalbert

    2005-01-01

    By means of the new software MOLGEN-QSPR, a multilinear regression model for the boiling points of lower fluoroalkanes is established. The model is based exclusively on simple descriptors derived directly from molecular structure and nevertheless describes a broader set of data more precisely than previous attempts that used either more demanding (quantum chemical) descriptors or more demanding (nonlinear) statistical methods such as neural networks. The model's internal consistency was confirmed by leave-one-out cross-validation. The model was used to predict all unknown boiling points of fluorobutanes, and the quality of predictions was estimated by means of comparison with boiling point predictions for fluoropentanes.
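    Leave-one-out cross-validation, as mentioned in this abstract, can be sketched as follows. For brevity the sketch uses a single descriptor and ordinary least squares rather than a multilinear model, and it does not use MOLGEN-QSPR; the function names and the data are hypothetical. It reports the cross-validated q² = 1 - PRESS / SS_tot, a statistic commonly quoted in QSPR work.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out q^2: refit without point i, then predict point i."""
    n = len(xs)
    mean_y = sum(ys) / n
    press = 0.0  # predictive residual sum of squares
    for i in range(n):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


if __name__ == "__main__":
    # Hypothetical descriptor values and responses, for illustration only.
    descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    response = [10.2, 19.8, 30.5, 39.9, 50.3, 59.6]
    print(loo_q2(descriptor, response))
```

    A q² close to 1 indicates that each held-out point is predicted well by a model fitted to the remaining points.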

  11. Predicting the stability of nanodevices

    NASA Astrophysics Data System (ADS)

    Lin, Z. Z.; Yu, W. F.; Wang, Y.; Ning, X. J.

    2011-05-01

    A simple model based on the statistics of single atoms is developed to predict the stability or lifetime of nanodevices without empirical parameters. Under certain conditions, the model produces the Arrhenius law and the Meyer-Neldel compensation rule. Compared with the classical molecular-dynamics simulations for predicting the stability of monatomic carbon chain at high temperature, the model is proved to be much more accurate than the transition state theory. Based on the ab initio calculation of the static potential, the model can give out a corrected lifetime of monatomic carbon and gold chains at higher temperature, and predict that the monatomic chains are very stable at room temperature.

  12. Prediction of rain effects on earth-space communication links operating in the 10 to 35 GHz frequency range

    NASA Technical Reports Server (NTRS)

    Stutzman, Warren L.

    1989-01-01

    This paper reviews the effects of precipitation on earth-space communication links operating in the 10 to 35 GHz frequency range. Emphasis is on the quantitative prediction of rain attenuation and depolarization. Discussions center on the models developed at Virginia Tech. Comments on other models are included as well as literature references to key works. Also included is the system level modeling for dual polarized communication systems with techniques for calculating antenna and propagation medium effects. Simple models for the calculation of average annual attenuation and cross-polarization discrimination (XPD) are presented. Calculation of worst month statistics is also presented.

  13. [Comparative study of the repair of full thickness tear of the supraspinatus by means of "single row" or "suture bridge" techniques].

    PubMed

    Arroyo-Hernández, M; Mellado-Romero, M A; Páramo-Díaz, P; Martín-López, C M; Cano-Egea, J M; Vilá Y Rico, J

    2015-01-01

    The purpose of this study is to analyze if there is any difference between the arthroscopic reparation of full-thickness supraspinatus tears with simple row technique versus suture bridge technique. We accomplished a retrospective study of 123 patients with full-thickness supraspinatus tears between January 2009 and January 2013 in our hospital. There were 60 simple row reparations, and 63 suture bridge ones. The mean age in the simple row group was 62.9, and in the suture bridge group was 63.3 years old. There were more women than men in both groups (67%). All patients were studied using the Constant test. The mean Constant test in the suture bridge group was 76.7, and in the simple row group was 72.4. We have also accomplished a statistical analysis of each Constant item. Strength was higher in the suture bridge group, with a significant statistical difference (p 0.04). The range of movement was also greater in the suture bridge group, but was not statistically significant. Suture bridge technique has better clinical results than single row reparations, but the difference is not statistically significant (p = 0.298).

  14. Statistical Inference of a RANS closure for a Jet-in-Crossflow simulation

    NASA Astrophysics Data System (ADS)

    Heyse, Jan; Edeling, Wouter; Iaccarino, Gianluca

    2016-11-01

    The jet-in-crossflow is found in several engineering applications, such as discrete film cooling for turbine blades, where a coolant injected through holes in the blade's surface protects the component from the hot gases leaving the combustion chamber. Experimental measurements using MRI techniques have been completed for a single hole injection into a turbulent crossflow, providing a full 3D averaged velocity field. For such flows of engineering interest, Reynolds-Averaged Navier-Stokes (RANS) turbulence closure models are often the only viable computational option. However, RANS models are known to provide poor predictions in the region close to the injection point. Since these models are calibrated on simple canonical flow problems, the obtained closure coefficient estimates are unlikely to extrapolate well to more complex flows. We will therefore calibrate the parameters of a RANS model using statistical inference techniques informed by the experimental jet-in-crossflow data. The obtained probabilistic parameter estimates can in turn be used to compute flow fields with quantified uncertainty. Stanford Graduate Fellowship in Science and Engineering.

  15. A Probabilistic Model of Local Sequence Alignment That Simplifies Statistical Significance Estimation

    PubMed Central

    Eddy, Sean R.

    2008-01-01

    Sequence database searches require accurate estimation of the statistical significance of scores. Optimal local sequence alignment scores follow Gumbel distributions, but determining an important parameter of the distribution (λ) requires time-consuming computational simulation. Moreover, optimal alignment scores are less powerful than probabilistic scores that integrate over alignment uncertainty (“Forward” scores), but the expected distribution of Forward scores remains unknown. Here, I conjecture that both expected score distributions have simple, predictable forms when full probabilistic modeling methods are used. For a probabilistic model of local sequence alignment, optimal alignment bit scores (“Viterbi” scores) are Gumbel-distributed with constant λ = log 2, and the high scoring tail of Forward scores is exponential with the same constant λ. Simulation studies support these conjectures over a wide range of profile/sequence comparisons, using 9,318 profile-hidden Markov models from the Pfam database. This enables efficient and accurate determination of expectation values (E-values) for both Viterbi and Forward scores for probabilistic local alignments. PMID:18516236
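    The score distributions described in this abstract can be turned into P-values and E-values along the following lines. The sketch assumes the standard Gumbel survival function for Viterbi bit scores and a plain exponential tail for Forward scores, both with λ = log 2. The location parameters `mu` and `tau`, the database size, and all function names are hypothetical; how the location parameters are obtained is outside this sketch.

```python
import math

LAMBDA = math.log(2)  # constant conjectured in the abstract for bit scores


def viterbi_pvalue(score, mu):
    """P(S >= score) for a Gumbel distribution with location mu, slope LAMBDA.

    Uses 1 - exp(-x) = -expm1(-x) for accuracy when the tail is tiny.
    """
    return -math.expm1(-math.exp(-LAMBDA * (score - mu)))


def forward_tail_pvalue(score, tau):
    """Exponential high-score tail exp(-LAMBDA * (score - tau)), capped at 1."""
    return min(1.0, math.exp(-LAMBDA * (score - tau)))


def evalue(pvalue, n_targets):
    """Expected number of false hits at this score in n_targets comparisons."""
    return pvalue * n_targets


if __name__ == "__main__":
    # Hypothetical location parameter and database size.
    p = viterbi_pvalue(score=25.0, mu=-5.0)
    print(p, evalue(p, n_targets=1_000_000))
```

    With λ = log 2, each additional bit of score halves the tail probability in the high-scoring regime, which is what makes a fixed λ convenient.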

  16. The Cosmological Dependence of Galaxy Cluster Morphologies

    NASA Astrophysics Data System (ADS)

    Crone, Mary Margaret

    1995-01-01

    Measuring the density of the universe has been a fundamental problem in cosmology ever since the "Big Bang" model was developed over sixty years ago. In this simple and successful model, the age and eventual fate of the universe are determined by its density, its rate of expansion, and the value of a universal "cosmological constant". Analytic models suggest that many properties of galaxy clusters are sensitive to cosmological parameters. In this thesis, I use N-body simulations to examine cluster density profiles, abundances, and degree of subclustering to test the feasibility of using them as cosmological tests. The dependence on both cosmology and initial density field is examined, using a grid of cosmologies and scale-free initial power spectra P(k) ∼ k^n. Einstein-de Sitter (Ω_0 = 1), open (Ω_0 = 0.2 and 0.1) and flat, low density (Ω_0 = 0.2, λ_0 = 0.8) models are studied, with initial spectral indices n = -2, -1 and 0. Of particular interest are the results for cluster profiles and substructure. The average density profiles are well fit by a power law ρ(r) ∼ r^{-α} for radii where the local density contrast is between 100 and 3000. There is a clear trend toward steeper slopes with both increasing n and decreasing Ω_0, with profile slopes in the open models consistently higher than Ω = 1 values for the range of n examined. The amount of substructure in each model is quantified and explained in terms of cluster merger histories and the behavior of substructure statistics. The statistic which best distinguishes models is a very simple measure of deviations from symmetry in the projected mass distribution, the "Center-of-Mass Shift" as a function of overdensity. Some statistics which are quite sensitive to substructure perform relatively poorly as cosmological indicators. Density profiles and the Center-of-Mass test are both well-suited for comparison with weak lensing data and galaxy distributions.
    Such data are currently being collected and should be available within the next few years. At that time the predictions described here can be used to set useful cosmological constraints.

  17. Statistical Modelling of Temperature and Moisture Uptake of Biochars Exposed to Selected Relative Humidity of Air.

    PubMed

    Bastistella, Luciane; Rousset, Patrick; Aviz, Antonio; Caldeira-Pires, Armando; Humbert, Gilles; Nogueira, Manoel

    2018-02-09

    New experimental techniques, as well as modern variants on known methods, have recently been employed to investigate the fundamental reactions underlying the oxidation of biochar. The purpose of this paper was to experimentally and statistically study how the relative humidity of air, mass, and particle size of four biochars influenced the adsorption of water and the increase in temperature. A random factorial design was employed using the intuitive statistical software Xlstat. A simple linear regression model and an analysis of variance with a pairwise comparison were performed. The experimental study was carried out on the wood of Quercus pubescens, Cyclobalanopsis glauca, Trigonostemon huangmosun, and Bambusa vulgaris, and involved five relative humidity conditions (22, 43, 75, 84, and 90%), two mass samples (0.1 and 1 g), and two particle sizes (powder and piece). Two response variables including water adsorption and temperature increase were analyzed and discussed. The temperature did not increase linearly with the adsorption of water. Temperature was modeled by nine explanatory variables, while water adsorption was modeled by eight. Five variables, including factors and their interactions, were found to be common to the two models. Sample mass and relative humidity influenced the two qualitative variables, while particle size and biochar type only influenced the temperature.

  18. The Mantel-Haenszel procedure revisited: models and generalizations.

    PubMed

    Fidler, Vaclav; Nagelkerke, Nico

    2013-01-01

    Several statistical methods have been developed for adjusting the Odds Ratio of the relation between two dichotomous variables X and Y for some confounders Z. With the exception of the Mantel-Haenszel method, commonly used methods, notably binary logistic regression, are not symmetrical in X and Y. The classical Mantel-Haenszel method however only works for confounders with a limited number of discrete strata, which limits its utility, and appears to have no basis in statistical models. Here we revisit the Mantel-Haenszel method and propose an extension to continuous and vector valued Z. The idea is to replace the observed cell entries in strata of the Mantel-Haenszel procedure by subject specific classification probabilities for the four possible values of (X,Y) predicted by a suitable statistical model. For situations where X and Y can be treated symmetrically we propose and explore the multinomial logistic model. Under the homogeneity hypothesis, which states that the odds ratio does not depend on Z, the logarithm of the odds ratio estimator can be expressed as a simple linear combination of three parameters of this model. Methods for testing the homogeneity hypothesis are proposed. The relationship between this method and binary logistic regression is explored. A numerical example using survey data is presented.
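    The classical Mantel-Haenszel estimator that this abstract starts from, for a limited number of discrete strata, can be sketched as below. Each stratum is a 2x2 table (a, b, c, d), and the pooled odds ratio is Σ(a·d/n) / Σ(b·c/n) with n = a + b + c + d. The function name and the example tables are hypothetical, and the sketch does not implement the paper's proposed extension to continuous or vector-valued Z.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio over 2x2 tables given as (a, b, c, d) tuples.

    a: X=1,Y=1   b: X=1,Y=0   c: X=0,Y=1   d: X=0,Y=0
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: every b*c/n term is zero")
    return numerator / denominator


if __name__ == "__main__":
    # Two hypothetical strata of the confounder Z.
    tables = [(10, 5, 5, 10), (20, 10, 10, 20)]
    print(mantel_haenszel_or(tables))
```

    When every stratum has the same odds ratio, as in the example tables, the pooled estimate equals that common value.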

  19. The Mantel-Haenszel Procedure Revisited: Models and Generalizations

    PubMed Central

    Fidler, Vaclav; Nagelkerke, Nico

    2013-01-01

    Several statistical methods have been developed for adjusting the Odds Ratio of the relation between two dichotomous variables X and Y for some confounders Z. With the exception of the Mantel-Haenszel method, commonly used methods, notably binary logistic regression, are not symmetrical in X and Y. The classical Mantel-Haenszel method however only works for confounders with a limited number of discrete strata, which limits its utility, and appears to have no basis in statistical models. Here we revisit the Mantel-Haenszel method and propose an extension to continuous and vector valued Z. The idea is to replace the observed cell entries in strata of the Mantel-Haenszel procedure by subject specific classification probabilities for the four possible values of (X,Y) predicted by a suitable statistical model. For situations where X and Y can be treated symmetrically we propose and explore the multinomial logistic model. Under the homogeneity hypothesis, which states that the odds ratio does not depend on Z, the logarithm of the odds ratio estimator can be expressed as a simple linear combination of three parameters of this model. Methods for testing the homogeneity hypothesis are proposed. The relationship between this method and binary logistic regression is explored. A numerical example using survey data is presented. PMID:23516463

  20. A statistical approach to quasi-extinction forecasting.

    PubMed

    Holmes, Elizabeth Eli; Sabo, John L; Viscido, Steven Vincent; Fagan, William Fredric

    2007-12-01

    Forecasting population decline to a certain critical threshold (the quasi-extinction risk) is one of the central objectives of population viability analysis (PVA), and such predictions figure prominently in the decisions of major conservation organizations. In this paper, we argue that accurate forecasting of a population's quasi-extinction risk does not necessarily require knowledge of the underlying biological mechanisms. Because of the stochastic and multiplicative nature of population growth, the ensemble behaviour of population trajectories converges to common statistical forms across a wide variety of stochastic population processes. This paper provides a theoretical basis for this argument. We show that the quasi-extinction surfaces of a variety of complex stochastic population processes (including age-structured, density-dependent and spatially structured populations) can be modelled by a simple stochastic approximation: the stochastic exponential growth process overlaid with Gaussian errors. Using simulated and real data, we show that this model can be estimated with 20-30 years of data and can provide relatively unbiased quasi-extinction risk with confidence intervals considerably smaller than (0,1). This was found to be true even for simulated data derived from some of the noisiest population processes (density-dependent feedback, species interactions and strong age-structure cycling). A key advantage of statistical models is that their parameters and the uncertainty of those parameters can be estimated from time series data using standard statistical methods. In contrast for most species of conservation concern, biologically realistic models must often be specified rather than estimated because of the limited data available for all the various parameters. 
    Biologically realistic models will always have a prominent place in PVA for evaluating specific management options which affect a single segment of a population, a single demographic rate, or different geographic areas. However, for forecasting quasi-extinction risk, statistical models that are based on the convergent statistical properties of population processes offer many advantages over biologically realistic models.
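    A minimal sketch of the stochastic exponential growth idea in this abstract: log abundance follows a random walk with drift `mu` and process variance `sigma2`, both estimated from a time series of counts, and the quasi-extinction risk is the fraction of simulated trajectories that fall to a threshold within a time horizon. The sketch ignores the observation-error component the abstract mentions and uses plain Monte Carlo; it is not the authors' estimation procedure, and all names and values are assumptions.

```python
import math
import random
import statistics


def estimate_growth(counts):
    """Estimate drift and variance from successive log ratios of counts."""
    diffs = [math.log(later / earlier) for earlier, later in zip(counts, counts[1:])]
    return statistics.mean(diffs), statistics.variance(diffs)


def quasi_extinction_prob(n0, threshold, mu, sigma2, horizon, n_sims=10000, seed=0):
    """Fraction of simulated paths that reach the threshold within horizon steps."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    sd = math.sqrt(sigma2)
    log_threshold = math.log(threshold)
    hits = 0
    for _ in range(n_sims):
        log_n = math.log(n0)
        for _ in range(horizon):
            log_n += rng.gauss(mu, sd)
            if log_n <= log_threshold:
                hits += 1
                break
    return hits / n_sims


if __name__ == "__main__":
    # Hypothetical census counts, for illustration only.
    counts = [120, 110, 125, 98, 105, 90, 96, 84, 88, 75]
    mu, sigma2 = estimate_growth(counts)
    print(mu, sigma2, quasi_extinction_prob(counts[-1], 20, mu, sigma2, horizon=50))
```

    Both parameters come from the time series alone, which reflects the abstract's point that such models can be estimated with standard statistical methods rather than specified from biological detail.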

  1. Topics in Statistical Calibration

    DTIC Science & Technology

    2014-03-27

    on a parametric bootstrap where, instead of sampling directly from the residuals, samples are drawn from a normal distribution. This procedure will...addition to centering them (Davison and Hinkley, 1997). When there are outliers in the residuals, the bootstrap distribution of x̂0 can become skewed or...based and inversion methods using the linear mixed-effects model. Then, a simple parametric bootstrap algorithm is proposed that can be used to either

  2. Exploration–exploitation trade-off features a saltatory search behaviour

    PubMed Central

    Volchenkov, Dimitri; Helbach, Jonathan; Tscherepanow, Marko; Kühnel, Sina

    2013-01-01

    Searching experiments conducted in different virtual environments over a gender-balanced group of people revealed a gender irrelevant scale-free spread of searching activity on large spatio-temporal scales. We have suggested and solved analytically a simple statistical model of the coherent-noise type describing the exploration–exploitation trade-off in humans (‘should I stay’ or ‘should I go’). The model exhibits a variety of saltatory behaviours, ranging from Lévy flights occurring under uncertainty to Brownian walks performed by a treasure hunter confident of the eventual success. PMID:23782535

  3. Calculations of proton-binding thermodynamics in proteins.

    PubMed

    Beroza, P; Case, D A

    1998-01-01

    Computational models of proton binding can range from the chemically complex and statistically simple (as in the quantum calculations) to the chemically simple and statistically complex. Much progress has been made in the multiple-site titration problem. Calculations have improved with the inclusion of more flexibility in regard to both the geometry of the proton binding and the larger scale protein motions associated with titration. This article concentrated on the principles of current calculations, but did not attempt to survey their quantitative performance. This is (1) because such comparisons are given in the cited papers and (2) because continued developments in understanding conformational flexibility and interaction energies will be needed to develop robust methods with strong predictive power. Nevertheless, the advances achieved over the past few years should not be underestimated: serious calculations of protonation behavior and its coupling to conformational change can now be confidently pursued against a backdrop of increasing understanding of the strengths and limitations of such models. It is hoped that such theoretical advances will also spur renewed experimental interest in measuring both overall titration curves and individual pKa values or pKa shifts. Exploration of the shapes of individual titration curves (as measured by Hill coefficients and other parameters) would also be useful in assessing the accuracy of computations and in drawing connections to functional behavior.

  4. A review of statistical updating methods for clinical prediction models.

    PubMed

    Su, Ting-Li; Jaki, Thomas; Hickey, Graeme L; Buchan, Iain; Sperrin, Matthew

    2018-01-01

    A clinical prediction model is a tool for predicting healthcare outcomes, usually within a specific population and context. A common approach is to develop a new clinical prediction model for each population and context; however, this wastes potentially useful historical information. A better approach is to update or incorporate the existing clinical prediction models already developed for use in similar contexts or populations. In addition, clinical prediction models commonly become miscalibrated over time, and need replacing or updating. In this article, we review a range of approaches for re-using and updating clinical prediction models; these fall into three main categories: simple coefficient updating, combining multiple previous clinical prediction models in a meta-model and dynamic updating of models. We evaluated the performance (discrimination and calibration) of the different strategies using data on mortality following cardiac surgery in the United Kingdom: we found that no single strategy performed sufficiently well to be used to the exclusion of the others. In conclusion, useful tools exist for updating existing clinical prediction models to a new population or context, and these should be implemented rather than developing a new clinical prediction model from scratch, using a breadth of complementary statistical methods.

  5. Pulsed Rabi oscillations in quantum two-level systems: beyond the area theorem

    NASA Astrophysics Data System (ADS)

    Fischer, Kevin A.; Hanschke, Lukas; Kremser, Malte; Finley, Jonathan J.; Müller, Kai; Vučković, Jelena

    2018-01-01

    The area theorem states that when a short optical pulse drives a quantum two-level system, it undergoes Rabi oscillations in the probability of scattering a single photon. In this work, we investigate the breakdown of the area theorem both as the pulse length becomes non-negligible and for certain pulse areas. Using simple quantum trajectories, we provide an analytic approximation to the photon emission dynamics of a two-level system. Our model provides an intuitive way to understand re-excitation, which elucidates the mechanism behind the two-photon emission events that can spoil single-photon emission. We experimentally measure the emission statistics from a semiconductor quantum dot, acting as a two-level system, and show good agreement with our simple model for short pulses. Additionally, the model clearly explains our recent results (Fischer, Hanschke et al 2017 Nat. Phys.) showing dominant two-photon emission from a two-level system for pulses with interaction areas equal to an even multiple of π.

  6. Simple estimation procedures for regression analysis of interval-censored failure time data under the proportional hazards model.

    PubMed

    Sun, Jianguo; Feng, Yanqin; Zhao, Hui

    2015-01-01

    Interval-censored failure time data occur in many fields including epidemiological and medical studies as well as financial and sociological studies, and many authors have investigated their analysis (Sun, The statistical analysis of interval-censored failure time data, 2006; Zhang, Stat Modeling 9:321-343, 2009). In particular, a number of procedures have been developed for regression analysis of interval-censored data arising from the proportional hazards model (Finkelstein, Biometrics 42:845-854, 1986; Huang, Ann Stat 24:540-568, 1996; Pan, Biometrics 56:199-203, 2000). For most of these procedures, however, one drawback is that they involve estimation of both regression parameters and baseline cumulative hazard function. In this paper, we propose two simple estimation approaches that do not need estimation of the baseline cumulative hazard function. The asymptotic properties of the resulting estimates are given, and an extensive simulation study is conducted and indicates that they work well for practical situations.

  7. Colour and luminance contrasts predict the human detection of natural stimuli in complex visual environments.

    PubMed

    White, Thomas E; Rojas, Bibiana; Mappes, Johanna; Rautiala, Petri; Kemp, Darrell J

    2017-09-01

    Much of what we know about human colour perception has come from psychophysical studies conducted in tightly-controlled laboratory settings. An enduring challenge, however, lies in extrapolating this knowledge to the noisy conditions that characterize our actual visual experience. Here we combine statistical models of visual perception with empirical data to explore how chromatic (hue/saturation) and achromatic (luminant) information underpins the detection and classification of stimuli in a complex forest environment. The data best support a simple linear model of stimulus detection as an additive function of both luminance and saturation contrast. The strength of each predictor is modest yet consistent across gross variation in viewing conditions, which accords with expectation based upon general primate psychophysics. Our findings implicate simple visual cues in the guidance of perception amidst natural noise, and highlight the potential for informing human vision via a fusion between psychophysical modelling and real-world behaviour. © 2017 The Author(s).

  8. Obtaining natural-like flow releases in diverted river reaches from simple riparian benefit economic models.

    PubMed

    Perona, Paolo; Dürrenmatt, David J; Characklis, Gregory W

    2013-03-30

    We propose a theoretical river modeling framework for generating variable flow patterns in diverted-streams (i.e., no reservoir). Using a simple economic model and the principle of equal marginal utility in an inverse fashion we first quantify the benefit of the water that goes to the environment in relation to that of the anthropic activity. Then, we obtain exact expressions for optimal water allocation rules between the two competing uses, as well as the related statistical distributions. These rules are applied using both synthetic and observed streamflow data, to demonstrate that this approach may be useful in 1) generating more natural flow patterns in the river reach downstream of the diversion, thus reducing the ecodeficit; 2) obtaining a more enlightened economic interpretation of Minimum Flow Release (MFR) strategies, and; 3) comparing the long-term costs and benefits of variable versus MFR policies and showing the greater ecological sustainability of this new approach. Copyright © 2013 Elsevier Ltd. All rights reserved.

  9. A simple rapid approach using coupled multivariate statistical methods, GIS and trajectory models to delineate areas of common oil spill risk

    NASA Astrophysics Data System (ADS)

    Guillen, George; Rainey, Gail; Morin, Michelle

    2004-04-01

    Currently, the Minerals Management Service uses the Oil Spill Risk Analysis model (OSRAM) to predict the movement of potential oil spills greater than 1000 bbl originating from offshore oil and gas facilities. OSRAM generates oil spill trajectories using meteorological and hydrological data input from either actual physical measurements or estimates generated from other hydrological models. OSRAM and many other models produce output matrices of average, maximum and minimum contact probabilities to specific landfall or target segments (columns) from oil spills at specific points (rows). Analysts and managers are often interested in identifying geographic areas or groups of facilities that pose similar risks to specific targets or groups of targets if a spill occurred. Unfortunately, due to the potentially large matrix generated by many spill models, this question is difficult to answer without the use of data reduction and visualization methods. In our study we utilized a multivariate statistical method called cluster analysis to group areas of similar risk based on potential distribution of landfall target trajectory probabilities. We also utilized ArcView™ GIS to display spill launch point groupings. The combination of GIS and multivariate statistical techniques in the post-processing of trajectory model output is a powerful tool for identifying and delineating areas of similar risk from multiple spill sources. We strongly encourage modelers, statistical and GIS software programmers to closely collaborate to produce a more seamless integration of these technologies and approaches to analyzing data. They are complementary methods that strengthen the overall assessment of spill risks.

  10. Identifying mechanisms for superdiffusive dynamics in cell trajectories

    NASA Astrophysics Data System (ADS)

    Passucci, Giuseppe; Brasch, Megan; Henderson, James; Manning, M. Lisa

    Self-propelled particle (SPP) models have been used to explore features of active matter such as motility-induced phase separation, jamming, and flocking, and are often used to model biological cells. However, many cells exhibit super-diffusive trajectories, where displacements scale faster than t^(1/2) in all directions, and these are not captured by traditional SPP models. We extract cell trajectories from image stacks of mouse fibroblast cells moving on 2D substrates and find super-diffusive mean-squared displacements in all directions across varying densities. Two SPP model modifications have been proposed to capture super-diffusive dynamics: Levy walks and heterogeneous motility parameters. In mouse fibroblast cells displacement probability distributions collapse when time is rescaled by a power greater than 1/2, which is consistent with Levy walks. We show that a simple SPP model with heterogeneous rotational noise can also generate a similar collapse. Furthermore, a close examination of statistics extracted directly from cell trajectories is consistent with a heterogeneous mobility SPP model and inconsistent with a Levy walk model. Our work demonstrates that a simple set of analyses can distinguish between mechanisms for anomalous diffusion in active matter.
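    As a rough illustration of the super-diffusion criterion in this abstract, the sketch below estimates the mean-squared displacement (MSD) scaling exponent of a 2D track by a log-log least-squares fit. An exponent above 1 for the MSD corresponds to displacements growing faster than t^(1/2). The function names, the lag choices and the synthetic straight-line track are all illustrative assumptions, not the authors' analysis pipeline.

```python
import math

def msd(track, lag):
    """Time-averaged mean-squared displacement of a 2D track at one lag."""
    n = len(track) - lag
    total = 0.0
    for i in range(n):
        dx = track[i + lag][0] - track[i][0]
        dy = track[i + lag][1] - track[i][1]
        total += dx * dx + dy * dy
    return total / n

def msd_exponent(track, lags):
    """Least-squares slope of log(MSD) against log(lag)."""
    xs = [math.log(lag) for lag in lags]
    ys = [math.log(msd(track, lag)) for lag in lags]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Hypothetical ballistic track (constant velocity), used only as a sanity check:
# its MSD grows as lag**2, so the fitted exponent should be close to 2.
ballistic = [(0.5 * t, 0.0) for t in range(50)]
print(msd_exponent(ballistic, [1, 2, 4, 8]))
```

    Ordinary diffusion would give an exponent near 1 under the same fit, so values between 1 and 2 indicate super-diffusive motion.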

  11. A simple model for calculating air pollution within street canyons

    NASA Astrophysics Data System (ADS)

    Venegas, Laura E.; Mazzeo, Nicolás A.; Dezzutti, Mariana C.

    2014-04-01

    This paper introduces the Semi-Empirical Urban Street (SEUS) model. SEUS is a simple mathematical model based on the scaling of air pollution concentration inside street canyons employing the emission rate, the width of the canyon, the dispersive velocity scale and the background concentration. Dispersive velocity scale depends on turbulent motions related to wind and traffic. The parameterisations of these turbulent motions include two dimensionless empirical parameters. Functional forms of these parameters have been obtained from full scale data measured in street canyons at four European cities. The sensitivity of SEUS model is studied analytically. Results show that relative errors in the evaluation of the two dimensionless empirical parameters have less influence on model uncertainties than uncertainties in other input variables. The model estimates NO2 concentrations using a simple photochemistry scheme. SEUS is applied to estimate NOx and NO2 hourly concentrations in an irregular and busy street canyon in the city of Buenos Aires. The statistical evaluation of results shows that there is a good agreement between estimated and observed hourly concentrations (e.g. fractional biases are -10.3% for NOx and +7.8% for NO2). The agreement between the estimated and observed values has also been analysed in terms of its dependence on wind speed and direction. The model shows a better performance for wind speeds >2 m s^-1 than for lower wind speeds and for leeward situations than for others. No significant discrepancies have been found between the results of the proposed model and that of a widely used operational dispersion model (OSPM), both using the same input information.

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jin, C.; Potts, I.; Reeks, M. W., E-mail: mike.reeks@ncl.ac.uk

    We present a simple stochastic quadrant model for calculating the transport and deposition of heavy particles in a fully developed turbulent boundary layer based on the statistics of wall-normal fluid velocity fluctuations obtained from a fully developed channel flow. Individual particles are tracked through the boundary layer via their interactions with a succession of random eddies found in each of the quadrants of the fluid Reynolds shear stress domain in a homogeneous Markov chain process. In this way, we are able to account directly for the influence of ejection and sweeping events as others have done but without resorting to the use of adjustable parameters. Deposition rate predictions for a wide range of heavy particles predicted by the model compare well with benchmark experimental measurements. In addition, deposition rates are compared with those obtained from continuous random walk models and Langevin equation based ejection and sweep models which noticeably give significantly lower deposition rates. Various statistics related to the particle near wall behavior are also presented. Finally, we consider the model limitations in using the model to calculate deposition in more complex flows where the near wall turbulence may be significantly different.

  13. An Intercomparison of the Dynamical Cores of Global Atmospheric Circulation Models for Mars

    NASA Technical Reports Server (NTRS)

    Hollingsworth, Jeffery L.; Bridger, Alison F. C.; Haberle, Robert M.

    1998-01-01

    This is a Final Report for a Joint Research Interchange (JRI) between NASA Ames Research Center and San Jose State University, Department of Meteorology. The focus of this JRI has been to evaluate the dynamical 'cores' of two global atmospheric circulation models for Mars that are in operation at the NASA Ames Research Center. The two global circulation models in use are fundamentally different: one uses spherical harmonics in its horizontal representation of field variables; the other uses finite differences on a uniform longitude-latitude grid. Several simulations have been conducted to assess how the dynamical processors of each of these circulation models perform using identical 'simple physics' parameterizations. A variety of climate statistics (e.g., time-mean flows and eddy fields) have been compared for realistic solstitial mean basic states. Results of this research have demonstrated that the two Mars circulation models with completely different spatial representations and discretizations produce rather similar circulation statistics for first-order meteorological fields, suggestive of a tendency for convergence of numerical solutions. Second and higher-order fields can, however, vary significantly between the two models.

  14. An Intercomparison of the Dynamical Cores of Global Atmospheric Circulation Models for Mars

    NASA Technical Reports Server (NTRS)

    Hollingsworth, Jeffery L.; Bridger, Alison F. C.; Haberle, Robert M.

    1998-01-01

    This is a Final Report for a Joint Research Interchange (JRI) between NASA Ames Research Center and San Jose State University, Department of Meteorology. The focus of this JRI has been to evaluate the dynamical "cores" of two global atmospheric circulation models for Mars that are in operation at the NASA Ames Research Center. The two global circulation models in use are fundamentally different: one uses spherical harmonics in its horizontal representation of field variables; the other uses finite differences on a uniform longitude-latitude grid. Several simulations have been conducted to assess how the dynamical processors of each of these circulation models perform using identical "simple physics" parameterizations. A variety of climate statistics (e.g., time-mean flows and eddy fields) have been compared for realistic solstitial mean basic states. Results of this research have demonstrated that the two Mars circulation models with completely different spatial representations and discretizations produce rather similar circulation statistics for first-order meteorological fields, suggestive of a tendency for convergence of numerical solutions. Second and higher-order fields can, however, vary significantly between the two models.

  15. Statistical mechanical models for dissociative adsorption of O2 on metal(100) surfaces with blocking, steering, and funneling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Evans, James W.; Liu, Da-Jiang

    We develop statistical mechanical models amenable to analytic treatment for the dissociative adsorption of O2 at hollow sites on fcc(100) metal surfaces. The models incorporate exclusion of nearest-neighbor pairs of adsorbed O. However, corresponding simple site-blocking models, where adsorption requires a large ensemble of available sites, exhibit an anomalously fast initial decrease in sticking. Thus, in addition to blocking, our models also incorporate more facile adsorption via orientational steering and funneling dynamics (features supported by ab initio Molecular Dynamics studies). Behavior for equilibrated adlayers is distinct from those with finite adspecies mobility. We focus on the low-temperature limited-mobility regime where analysis of the associated master equations readily produces exact results for both short- and long-time behavior. Kinetic Monte Carlo simulation is also utilized to provide a more complete picture of behavior. These models capture both the initial decrease and the saturation of the experimentally observed sticking versus coverage, as well as features of non-equilibrium adlayer ordering as assessed by surface-sensitive diffraction.

  16. Statistical mechanical models for dissociative adsorption of O2 on metal(100) surfaces with blocking, steering, and funneling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Evans, James W.; Department of Physics and Astronomy, Iowa State University, Ames, Iowa 50011; Liu, Da-Jiang

    We develop statistical mechanical models amenable to analytic treatment for the dissociative adsorption of O2 at hollow sites on fcc(100) metal surfaces. The models incorporate exclusion of nearest-neighbor pairs of adsorbed O. However, corresponding simple site-blocking models, where adsorption requires a large ensemble of available sites, exhibit an anomalously fast initial decrease in sticking. Thus, in addition to blocking, our models also incorporate more facile adsorption via orientational steering and funneling dynamics (features supported by ab initio Molecular Dynamics studies). Behavior for equilibrated adlayers is distinct from those with finite adspecies mobility. We focus on the low-temperature limited-mobility regime where analysis of the associated master equations readily produces exact results for both short- and long-time behavior. Kinetic Monte Carlo simulation is also utilized to provide a more complete picture of behavior. These models capture both the initial decrease and the saturation of the experimentally observed sticking versus coverage, as well as features of non-equilibrium adlayer ordering as assessed by surface-sensitive diffraction.

  17. Deformation behavior of HCP titanium alloy: Experiment and Crystal plasticity modeling

    DOE PAGES

    Wronski, M.; Arul Kumar, Mariyappan; Capolungo, Laurent; ...

    2018-03-02

    The deformation behavior of commercially pure titanium is studied using experiments and a crystal plasticity model. Compression tests along the rolling, transverse, and normal-directions, and tensile tests along the rolling and transverse directions are performed at room temperature to study the activation of slip and twinning in the hexagonal closed packed titanium. A detailed EBSD based statistical analysis of the microstructure is performed to develop statistics of both {10-12} tensile and {11-22} compression twins. A simple Monte Carlo (MC) twin variant selection criterion is proposed within the framework of the visco-plastic self-consistent (VPSC) model with a dislocation density (DD) based law used to describe dislocation hardening. In the model, plasticity is accommodated by prismatic, basal and pyramidal slip modes, and {10-12} tensile and {11-22} compression twinning modes. Thus, the VPSC-MC model successfully captures the experimentally observed activation of low Schmid factor twin variants for both tensile and compression twins modes. The model also predicts macroscopic stress-strain response, texture evolution and twin volume fraction that are in agreement with experimental observations.

  18. Deformation behavior of HCP titanium alloy: Experiment and Crystal plasticity modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wronski, M.; Arul Kumar, Mariyappan; Capolungo, Laurent

    The deformation behavior of commercially pure titanium is studied using experiments and a crystal plasticity model. Compression tests along the rolling, transverse, and normal-directions, and tensile tests along the rolling and transverse directions are performed at room temperature to study the activation of slip and twinning in the hexagonal closed packed titanium. A detailed EBSD based statistical analysis of the microstructure is performed to develop statistics of both {10-12} tensile and {11-22} compression twins. A simple Monte Carlo (MC) twin variant selection criterion is proposed within the framework of the visco-plastic self-consistent (VPSC) model with a dislocation density (DD) based law used to describe dislocation hardening. In the model, plasticity is accommodated by prismatic, basal and pyramidal slip modes, and {10-12} tensile and {11-22} compression twinning modes. Thus, the VPSC-MC model successfully captures the experimentally observed activation of low Schmid factor twin variants for both tensile and compression twins modes. The model also predicts macroscopic stress-strain response, texture evolution and twin volume fraction that are in agreement with experimental observations.

  19. Statistical mechanics of homogeneous partly pinned fluid systems.

    PubMed

    Krakoviack, Vincent

    2010-12-01

    The homogeneous partly pinned fluid systems are simple models of a fluid confined in a disordered porous matrix obtained by arresting randomly chosen particles in a one-component bulk fluid or one of the two components of a binary mixture. In this paper, their configurational properties are investigated. It is shown that a peculiar complementarity exists between the mobile and immobile phases, which originates from the fact that the solid is prepared in presence of and in equilibrium with the adsorbed fluid. Simple identities follow, which connect different types of configurational averages, either relative to the fluid-matrix system or to the bulk fluid from which it is prepared. Crucial simplifications result for the computation of important structural quantities, both in computer simulations and in theoretical approaches. Finally, possible applications of the model in the field of dynamics in confinement or in strongly asymmetric mixtures are suggested.

  20. Aggregative Learning Method and Its Application for Communication Quality Evaluation

    NASA Astrophysics Data System (ADS)

    Akhmetov, Dauren F.; Kotaki, Minoru

    2007-12-01

    In this paper, the so-called Aggregative Learning Method (ALM) is proposed to improve and simplify the learning and classification abilities of different data processing systems. It provides a universal basis for the design and analysis of a wide class of mathematical models. A procedure was elaborated for time series model reconstruction and analysis for linear and nonlinear cases. Data approximation accuracy (during the learning phase) and data classification quality (during the recall phase) are estimated from the introduced statistical parameters. The validity and efficiency of the proposed approach have been demonstrated through its application to monitoring of wireless communication quality, namely, for a Fixed Wireless Access (FWA) system. The procedure was shown to need little memory and few computational resources, especially for the data classification (recall) stage. Characterized by high computational efficiency and a simple decision-making procedure, the derived approaches can be useful for simple and reliable real-time surveillance and control system design.

  1. Relaxation mechanisms in glassy dynamics: the Arrhenius and fragile regimes.

    PubMed

    Hentschel, H George E; Karmakar, Smarajit; Procaccia, Itamar; Zylberg, Jacques

    2012-06-01

    Generic glass formers exhibit at least two characteristic changes in their relaxation behavior, first to an Arrhenius-type relaxation at some characteristic temperature and then at a lower characteristic temperature to a super-Arrhenius (fragile) behavior. We address these transitions by studying the statistics of free energy barriers for different systems at different temperatures and space dimensions. We present clear evidence for changes in the dynamical behavior at the transition to Arrhenius and then to a super-Arrhenius behavior. A simple model is presented, based on the idea of competition between single-particle and cooperative dynamics. We argue that Arrhenius behavior can take place as long as there is enough free volume for the completion of a simple T1 relaxation process. Once free volume is absent one needs a cooperative mechanism to "collect" enough free volume. We show that this model captures all the qualitative behavior observed in simulations throughout the considered temperature range.

  2. Using entropy measures to characterize human locomotion.

    PubMed

    Leverick, Graham; Szturm, Tony; Wu, Christine Q

    2014-12-01

    Entropy measures have been widely used to quantify the complexity of theoretical and experimental dynamical systems. In this paper, the value of using entropy measures to characterize human locomotion is demonstrated based on their construct validity, predictive validity in a simple model of human walking and convergent validity in an experimental study. Results show that four of the five considered entropy measures increase meaningfully with the increased probability of falling in a simple passive bipedal walker model. The same four entropy measures also experienced statistically significant increases in response to increasing age and gait impairment caused by cognitive interference in an experimental study. Of the considered entropy measures, the proposed quantized dynamical entropy (QDE) and quantization-based approximation of sample entropy (QASE) offered the best combination of sensitivity to changes in gait dynamics and computational efficiency. Based on these results, entropy appears to be a viable candidate for assessing the stability of human locomotion.

  3. Incorporating User Input in Template-Based Segmentation

    PubMed Central

    Vidal, Camille; Beggs, Dale; Younes, Laurent; Jain, Sanjay K.; Jedynak, Bruno

    2015-01-01

    We present a simple and elegant method to incorporate user input in a template-based segmentation method for diseased organs. The user provides a partial segmentation of the organ of interest, which is used to guide the template towards its target. The user also highlights some elements of the background that should be excluded from the final segmentation. We derive by likelihood maximization a registration algorithm from a simple statistical image model in which the user labels are modeled as Bernoulli random variables. The resulting registration algorithm minimizes the sum of square differences between the binary template and the user labels, while preventing the template from shrinking, and penalizing for the inclusion of background elements into the final segmentation. We assess the performance of the proposed algorithm on synthetic images in which the amount of user annotation is controlled. We demonstrate our algorithm on the segmentation of the lungs of Mycobacterium tuberculosis infected mice from μCT images. PMID:26146532

  4. A primer on the study of transitory dynamics in ecological series using the scale-dependent correlation analysis.

    PubMed

    Rodríguez-Arias, Miquel Angel; Rodó, Xavier

    2004-03-01

    Here we describe a practical, step-by-step primer to scale-dependent correlation (SDC) analysis. The analysis of transitory processes is an important but often neglected topic in ecological studies because only a few statistical techniques appear to detect temporary features accurately enough. We introduce here the SDC analysis, a statistical and graphical method to study transitory processes at any temporal or spatial scale. SDC analysis, thanks to the combination of conventional procedures and simple well-known statistical techniques, becomes an improved time-domain analogue of wavelet analysis. We use several simple synthetic series to describe the method, a more complex example, full of transitory features, to compare SDC and wavelet analysis, and finally we analyze some selected ecological series to illustrate the methodology. The SDC analysis of time series of copepod abundances in the North Sea indicates that ENSO primarily is the main climatic driver of short-term changes in population dynamics. SDC also uncovers some long-term, unexpected features in the population. Similarly, the SDC analysis of Nicholson's blowflies data locates where the proposed models fail and provides new insights about the mechanism that drives the apparent vanishing of the population cycle during the second half of the series.

  5. A Recommended Procedure for Estimating the Cosmic-Ray Spectral Parameter of a Simple Power Law With Applications to Detector Design

    NASA Technical Reports Server (NTRS)

    Howell, L. W.

    2001-01-01

    A simple power law model consisting of a single spectral index alpha-1 is believed to be an adequate description of the galactic cosmic-ray (GCR) proton flux at energies below 10^13 eV. Two procedures for estimating alpha-1, the method of moments and maximum likelihood (ML), are developed and their statistical performance compared. It is concluded that the ML procedure attains the most desirable statistical properties and is hence the recommended statistical estimation procedure for estimating alpha-1. The ML procedure is then generalized for application to a set of real cosmic-ray data and thereby makes this approach applicable to existing cosmic-ray data sets. Several other important results, such as the relationship between collecting power and detector energy resolution, as well as inclusion of a non-Gaussian detector response function, are presented. These results have many practical benefits in the design phase of a cosmic-ray detector as they permit instrument developers to make important trade studies in design parameters as a function of one of the science objectives. This is particularly important for space-based detectors where physical parameters, such as dimension and weight, impose rigorous practical limits to the design envelope.
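    For readers unfamiliar with maximum-likelihood estimation of a power-law index, the sketch below shows the textbook closed-form estimator for an idealized, untruncated power law with density proportional to E^(-alpha) above a known threshold e_min. This is an assumption-laden illustration only: it ignores detector response and energy resolution, and it is not claimed to be the procedure developed in the report. The names power_law_mle and sample_power_law, the index 2.7 and the sample size are hypothetical.

```python
import math
import random

def power_law_mle(energies, e_min):
    """Closed-form ML estimate of alpha for p(E) proportional to E**(-alpha), E >= e_min."""
    log_sum = sum(math.log(e / e_min) for e in energies)
    return 1.0 + len(energies) / log_sum

def sample_power_law(alpha, e_min, n, rng):
    """Draw n energies by inverse-transform sampling (requires alpha > 1)."""
    return [e_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

# Hypothetical check: simulate events with a known index and recover it.
rng = random.Random(0)
sample = sample_power_law(2.7, 1.0, 20000, rng)
print(round(power_law_mle(sample, 1.0), 2))
```

    For this idealized case the standard error of the estimate is roughly (alpha - 1) / sqrt(n), which gives a quick feel for how the number of collected events limits precision.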

  6. Random bursts determine dynamics of active filaments.

    PubMed

    Weber, Christoph A; Suzuki, Ryo; Schaller, Volker; Aranson, Igor S; Bausch, Andreas R; Frey, Erwin

    2015-08-25

    Constituents of living or synthetic active matter have access to a local energy supply that serves to keep the system out of thermal equilibrium. The statistical properties of such fluctuating active systems differ from those of their equilibrium counterparts. Using the actin filament gliding assay as a model, we studied how nonthermal distributions emerge in active matter. We found that the basic mechanism involves the interplay between local and random injection of energy, acting as an analog of a thermal heat bath, and nonequilibrium energy dissipation processes associated with sudden jump-like changes in the system's dynamic variables. We show here how such a mechanism leads to a nonthermal distribution of filament curvatures with a non-Gaussian shape. The experimental curvature statistics and filament relaxation dynamics are reproduced quantitatively by stochastic computer simulations and a simple kinetic model.

  7. Random bursts determine dynamics of active filaments

    PubMed Central

    Weber, Christoph A.; Suzuki, Ryo; Schaller, Volker; Aranson, Igor S.; Bausch, Andreas R.; Frey, Erwin

    2015-01-01

    Constituents of living or synthetic active matter have access to a local energy supply that serves to keep the system out of thermal equilibrium. The statistical properties of such fluctuating active systems differ from those of their equilibrium counterparts. Using the actin filament gliding assay as a model, we studied how nonthermal distributions emerge in active matter. We found that the basic mechanism involves the interplay between local and random injection of energy, acting as an analog of a thermal heat bath, and nonequilibrium energy dissipation processes associated with sudden jump-like changes in the system’s dynamic variables. We show here how such a mechanism leads to a nonthermal distribution of filament curvatures with a non-Gaussian shape. The experimental curvature statistics and filament relaxation dynamics are reproduced quantitatively by stochastic computer simulations and a simple kinetic model. PMID:26261319

  8. The NIST Simple Guide for Evaluating and Expressing Measurement Uncertainty

    NASA Astrophysics Data System (ADS)

    Possolo, Antonio

    2016-11-01

    NIST has recently published guidance on the evaluation and expression of the uncertainty of NIST measurement results [1, 2], supplementing but not replacing B. N. Taylor and C. E. Kuyatt's (1994) Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results (NIST Technical Note 1297) [3], which tracks closely the Guide to the expression of uncertainty in measurement (GUM) [4], originally published in 1995 by the Joint Committee for Guides in Metrology of the International Bureau of Weights and Measures (BIPM). The scope of this Simple Guide, however, is much broader than the scope of both NIST Technical Note 1297 and the GUM, because it attempts to address several of the uncertainty evaluation challenges that have arisen at NIST since the 1990s, for example to include molecular biology, greenhouse gases and climate science measurements, and forensic science. The Simple Guide also expands the scope of those two other guidance documents by recognizing observation equations (that is, statistical models) as bona fide measurement models. These models are indispensable to reduce data from interlaboratory studies, to combine measurement results for the same measurand obtained by different methods, and to characterize the uncertainty of calibration and analysis functions used in the measurement of force, temperature, or composition of gas mixtures. This presentation reviews the salient aspects of the Simple Guide, illustrates the use of models and methods for uncertainty evaluation not contemplated in the GUM, and also demonstrates the NIST Uncertainty Machine [5] and the NIST Consensus Builder, which are web-based applications accessible worldwide that facilitate evaluations of measurement uncertainty and the characterization of consensus values in interlaboratory studies.

  9. A simple rapid process for semi-automated brain extraction from magnetic resonance images of the whole mouse head.

    PubMed

    Delora, Adam; Gonzales, Aaron; Medina, Christopher S; Mitchell, Adam; Mohed, Abdul Faheem; Jacobs, Russell E; Bearer, Elaine L

    2016-01-15

    Magnetic resonance imaging (MRI) is a well-developed technique in neuroscience. Limitations in applying MRI to rodent models of neuropsychiatric disorders include the large number of animals required to achieve statistical significance, and the paucity of automation tools for the critical early step in processing, brain extraction, which prepares brain images for alignment and voxel-wise statistics. This novel timesaving automation of template-based brain extraction ("skull-stripping") is capable of quickly and reliably extracting the brain from large numbers of whole head images in a single step. The method is simple to install and requires minimal user interaction. This method is equally applicable to different types of MR images. Results were evaluated with Dice and Jaccard similarity indices and compared in 3D surface projections with other stripping approaches. Statistical comparisons demonstrate that individual variation of brain volumes is preserved. A downloadable software package not otherwise available for extraction of brains from whole head images is included here. This software tool increases speed, can be used with an atlas or a template from within the dataset, and produces masks that need little further refinement. Our new automation can be applied to any MR dataset, since the starting point is a template mask generated specifically for that dataset. The method reliably and rapidly extracts brain images from whole head images, rendering them useable for subsequent analytical processing. This software tool will accelerate the exploitation of mouse models for the investigation of human brain disorders by MRI. Copyright © 2015 Elsevier B.V. All rights reserved.
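    The two overlap measures named in this abstract can be sketched as follows. The example assumes each brain mask is represented as a set of voxel coordinates, which is an illustrative simplification rather than the data format of the published software; the function name and the toy masks are hypothetical.

```python
def dice_jaccard(mask_a, mask_b):
    """Return (Dice, Jaccard) overlap for two masks given as sets of voxel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0, 1.0  # convention chosen here: two empty masks agree perfectly
    intersection = len(a & b)
    union = len(a | b)
    dice = 2.0 * intersection / (len(a) + len(b))
    jaccard = intersection / union
    return dice, jaccard

# Toy masks of three voxels each, sharing two voxels.
automatic = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
manual = {(0, 0, 1), (0, 1, 0), (1, 0, 0)}
print(dice_jaccard(automatic, manual))
```

    The two indices are monotonically related (Dice = 2J / (1 + J)), so they rank segmentations identically but report overlap on different scales.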

  10. Simple heuristics and rules of thumb: where psychologists and behavioural biologists might meet.

    PubMed

    Hutchinson, John M C; Gigerenzer, Gerd

    2005-05-31

    The Centre for Adaptive Behaviour and Cognition (ABC) has hypothesised that much human decision-making can be described by simple algorithmic process models (heuristics). This paper explains this approach and relates it to research in biology on rules of thumb, which we also review. As an example of a simple heuristic, consider the lexicographic strategy of Take The Best for choosing between two alternatives: cues are searched in turn until one discriminates, then search stops and all other cues are ignored. Heuristics consist of building blocks, and building blocks exploit evolved or learned abilities such as recognition memory; it is the complexity of these abilities that allows the heuristics to be simple. Simple heuristics have an advantage in making decisions fast and with little information, and in avoiding overfitting. Furthermore, humans are observed to use simple heuristics. Simulations show that the statistical structures of different environments affect which heuristics perform better, a relationship referred to as ecological rationality. We contrast ecological rationality with the stronger claim of adaptation. Rules of thumb from biology provide clearer examples of adaptation because animals can be studied in the environments in which they evolved. The range of examples is also much more diverse. To investigate them, biologists have sometimes used similar simulation techniques to ABC, but many examples depend on empirically driven approaches. ABC's theoretical framework can be useful in connecting some of these examples, particularly the scattered literature on how information from different cues is integrated. Optimality modelling is usually used to explain less detailed aspects of behaviour but might more often be redirected to investigate rules of thumb.
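
    The Take The Best strategy described in this abstract can be sketched in a few lines. This is an illustration under stated assumptions rather than the ABC group's implementation: the function name `take_the_best` is hypothetical, cues are assumed to be already ordered from most to least valid, and each cue value is assumed to be 1 (positive) or 0 (negative).

```python
def take_the_best(cues_a, cues_b):
    """Choose between alternatives A and B with a lexicographic rule.

    Assumptions: cues are ordered from most to least valid, and each
    value is 1 (positive) or 0 (negative).
    Returns (choice, index_of_deciding_cue), or (None, None) when no
    cue discriminates and the caller has to guess.
    """
    for index, (a, b) in enumerate(zip(cues_a, cues_b)):
        if a != b:
            # First discriminating cue decides; later cues are ignored.
            return ("A" if a > b else "B"), index
    return None, None


# Hypothetical example: the first cue ties, the second favours B,
# and the third cue (which favours A) is never consulted.
choice, deciding_cue = take_the_best([1, 0, 1], [1, 1, 0])
```

    Because the search stops at the first discriminating cue, the number of cues inspected is usually small, which is the source of the speed and frugality the abstract refers to.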

  11. Neuropsychological study of FASD in a sample of American Indian children: processing simple versus complex information.

    PubMed

    Aragón, Alfredo S; Kalberg, Wendy O; Buckley, David; Barela-Scott, Lindsey M; Tabachnick, Barbara G; May, Philip A

    2008-12-01

    Although a large body of literature exists on cognitive functioning in alcohol-exposed children, it is unclear if there is a signature neuropsychological profile in children with Fetal Alcohol Spectrum Disorders (FASD). This study assesses cognitive functioning in children with FASD from several American Indian reservations in the Northern Plains States, and it applies a hierarchical model of simple versus complex information processing to further examine cognitive function. We hypothesized that complex tests would discriminate between children with FASD and culturally similar controls, while children with FASD would perform similar to controls on relatively simple tests. Our sample includes 32 control children and 24 children with a form of FASD [fetal alcohol syndrome (FAS) = 10, partial fetal alcohol syndrome (PFAS) = 14]. The test battery measures general cognitive ability, verbal fluency, executive functioning, memory, and fine-motor skills. Many of the neuropsychological tests produced results consistent with a hierarchical model of simple versus complex processing. The complexity of the tests was determined "a priori" based on the number of cognitive processes involved in them. Multidimensional scaling was used to statistically analyze the accuracy of classifying the neurocognitive tests into a simple versus complex dichotomy. Hierarchical logistic regression models were then used to define the contribution made by complex versus simple tests in predicting the significant differences between children with FASD and controls. Complex test items discriminated better than simple test items. The tests that conformed well to the model were the Verbal Fluency, Progressive Planning Test (PPT), the Lhermitte memory tasks, and the Grooved Pegboard Test (GPT). The FASD-grouped children, when compared with controls, demonstrated impaired performance on letter fluency, while their performance was similar on category fluency. 
On the more complex PPT trials (problems 5 to 8), as well as the Lhermitte logical tasks, the FASD group performed the worst. The differential performance between children with FASD and controls was evident across various neuropsychological measures. The children with FASD performed significantly more poorly on the complex tasks than did the controls. The identification of a neurobehavioral profile in children with prenatal alcohol exposure will help clinicians identify and diagnose children with FASD.

  12. Statistical tools for transgene copy number estimation based on real-time PCR.

    PubMed

    Yuan, Joshua S; Burris, Jason; Stewart, Nathan R; Mentewab, Ayalew; Stewart, C Neal

    2007-11-01

    As compared with traditional transgene copy number detection technologies such as Southern blot analysis, real-time PCR provides a fast, inexpensive and high-throughput alternative. However, the real-time PCR based transgene copy number estimation tends to be ambiguous and subjective stemming from the lack of proper statistical analysis and data quality control to render a reliable estimation of copy number with a prediction value. Despite the recent progresses in statistical analysis of real-time PCR, few publications have integrated these advancements in real-time PCR based transgene copy number determination. Three experimental designs and four data quality control integrated statistical models are presented. For the first method, external calibration curves are established for the transgene based on serially-diluted templates. The Ct number from a control transgenic event and putative transgenic event are compared to derive the transgene copy number or zygosity estimation. Simple linear regression and two group T-test procedures were combined to model the data from this design. For the second experimental design, standard curves were generated for both an internal reference gene and the transgene, and the copy number of transgene was compared with that of internal reference gene. Multiple regression models and ANOVA models can be employed to analyze the data and perform quality control for this approach. In the third experimental design, transgene copy number is compared with reference gene without a standard curve, but rather, is based directly on fluorescence data. Two different multiple regression models were proposed to analyze the data based on two different approaches of amplification efficiency integration. Our results highlight the importance of proper statistical treatment and quality control integration in real-time PCR-based transgene copy number determination. 
These statistical methods allow the real-time PCR-based transgene copy number estimation to be more reliable and precise with a proper statistical estimation. Proper confidence intervals are necessary for unambiguous prediction of transgene copy number. The four different statistical methods are compared for their advantages and disadvantages. Moreover, the statistical methods can also be applied for other real-time PCR-based quantification assays including transfection efficiency analysis and pathogen quantification.
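
    The first experimental design in this abstract rests on an external calibration curve: Ct is regressed on the log of the template quantity, and the fitted line is inverted to estimate unknowns. The sketch below illustrates only that regression step, under assumptions: the function names and all numbers are hypothetical synthetic values, and the t-test, quality control and confidence intervals that the abstract emphasizes are not reproduced.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = intercept + slope * log10(quantity)."""
    x = [math.log10(q) for q in quantities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template quantity from Ct."""
    return 10 ** ((ct - intercept) / slope)


# Synthetic ten-fold dilution series (illustrative values, not real data).
dilutions = [1, 10, 100, 1000, 10000]
cts = [35.0, 31.5, 28.0, 24.5, 21.0]
slope, intercept = fit_standard_curve(dilutions, cts)

# Amplification efficiency implied by the slope (standard qPCR relation).
efficiency = 10 ** (-1.0 / slope) - 1.0

# Hypothetical Ct values for a control event and a putative event.
control_qty = quantity_from_ct(28.0, slope, intercept)
putative_qty = quantity_from_ct(26.95, slope, intercept)
copy_ratio = putative_qty / control_qty
```

    A point estimate of the ratio like this carries no statement of uncertainty; the abstract's argument is precisely that a confidence interval is needed before a copy number can be called unambiguously.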

  13. Simple model for multiple-choice collective decision making

    NASA Astrophysics Data System (ADS)

    Lee, Ching Hua; Lucas, Andrew

    2014-11-01

    We describe a simple model of heterogeneous, interacting agents making decisions between n ≥ 2 discrete choices. For a special class of interactions, our model is the mean field description of random field Potts-like models and is effectively solved by finding the extrema of the average energy E per agent. In these cases, by studying the propagation of decision changes via avalanches, we argue that macroscopic dynamics is well captured by a gradient flow along E. We focus on the permutation symmetric case, where all n choices are (on average) the same, and spontaneous symmetry breaking (SSB) arises purely from cooperative social interactions. As examples, we show that bimodal heterogeneity naturally provides a mechanism for the spontaneous formation of hierarchies between decisions and that SSB is a preferred instability to discontinuous phase transitions between two symmetric points. Beyond the mean field limit, exponentially many stable equilibria emerge when we place this model on a graph of finite mean degree. We conclude with speculation on decision making with persistent collective oscillations. Throughout the paper, we emphasize analogies between methods of solution to our model and common intuition from diverse areas of physics, including statistical physics and electromagnetism.

  14. Markov Logic Networks in the Analysis of Genetic Data

    PubMed Central

    Sakhanenko, Nikita A.

    2010-01-01

    Complex, non-additive genetic interactions are common and can be critical in determining phenotypes. Genome-wide association studies (GWAS) and similar statistical studies of linkage data, however, assume additive models of gene interactions in looking for genotype-phenotype associations. These statistical methods view the compound effects of multiple genes on a phenotype as a sum of influences of each gene and often miss a substantial part of the heritable effect. Such methods do not use any biological knowledge about underlying mechanisms. Modeling approaches from the artificial intelligence (AI) field that incorporate deterministic knowledge into models to perform statistical analysis can be applied to include prior knowledge in genetic analysis. We chose to use the most general such approach, Markov Logic Networks (MLNs), for combining deterministic knowledge with statistical analysis. Using simple, logistic regression-type MLNs we can replicate the results of traditional statistical methods, but we also show that we are able to go beyond finding independent markers linked to a phenotype by using joint inference without an independence assumption. The method is applied to genetic data on yeast sporulation, a complex phenotype with gene interactions. In addition to detecting all of the previously identified loci associated with sporulation, our method identifies four loci with smaller effects. Since their effect on sporulation is small, these four loci were not detected with methods that do not account for dependence between markers due to gene interactions. We show how gene interactions can be detected using more complex models, which can be used as a general framework for incorporating systems biology with genetics. PMID:20958249

  15. Biofilm development in fixed bed biofilm reactors: experiments and simple models for engineering design purposes.

    PubMed

    Szilágyi, N; Kovács, R; Kenyeres, I; Csikor, Zs

    2013-01-01

    Biofilm development in a fixed bed biofilm reactor system performing municipal wastewater treatment was monitored aiming at accumulating colonization and maximum biofilm mass data usable in engineering practice for process design purposes. Initially a 6 month experimental period was selected for investigations where the biofilm formation and the performance of the reactors were monitored. The results were analyzed by two methods: for simple, steady-state process design purposes the maximum biofilm mass on carriers versus influent load and a time constant of the biofilm growth were determined, whereas for design approaches using dynamic models a simple biofilm mass prediction model including attachment and detachment mechanisms was selected and fitted to the experimental data. According to a detailed statistical analysis, the collected data have not allowed us to determine both the time constant of biofilm growth and the maximum biofilm mass on carriers at the same time. The observed maximum biofilm mass could be determined with a reasonable error and ranged between 438 gTS/m² carrier surface and 843 gTS/m², depending on influent load and hydrodynamic conditions. The parallel analysis of the attachment-detachment model showed that the experimental data set allowed us to determine the attachment rate coefficient, which was in the range of 0.05-0.4 m d⁻¹ depending on influent load and hydrodynamic conditions.

  16. Benchmarking Inverse Statistical Approaches for Protein Structure and Design with Exactly Solvable Models.

    PubMed

    Jacquin, Hugo; Gilson, Amy; Shakhnovich, Eugene; Cocco, Simona; Monasson, Rémi

    2016-05-01

    Inverse statistical approaches to determine protein structure and function from Multiple Sequence Alignments (MSA) are emerging as powerful tools in computational biology. However, the underlying assumptions of the relationship between the inferred effective Potts Hamiltonian and real protein structure and energetics remain untested so far. Here we use a lattice protein (LP) model to benchmark those inverse statistical approaches. We build MSA of highly stable sequences in target LP structures, and infer the effective pairwise Potts Hamiltonians from those MSA. We find that inferred Potts Hamiltonians reproduce many important aspects of 'true' LP structures and energetics. Careful analysis reveals that effective pairwise couplings in inferred Potts Hamiltonians depend not only on the energetics of the native structure but also on competing folds; in particular, the coupling values reflect both positive design (stabilization of native conformation) and negative design (destabilization of competing folds). In addition to providing detailed structural information, the inferred Potts models used as protein Hamiltonian for design of new sequences are able to generate with high probability completely new sequences with the desired folds, which is not possible using independent-site models. Those are remarkable results as the effective LP Hamiltonians used to generate MSA are not simple pairwise models due to the competition between the folds. Our findings elucidate the reasons for the success of inverse approaches to the modelling of proteins from sequence data, and their limitations.

  17. Analysis and Modeling of the Arctic Oscillation Using a Simple Barotropic Model with Baroclinic Eddy Forcing.

    NASA Astrophysics Data System (ADS)

    Tanaka, H. L.

    2003-06-01

    In this study, a numerical simulation of the Arctic Oscillation (AO) is conducted using a simple barotropic model that considers the barotropic-baroclinic interactions as the external forcing. The model is referred to as a barotropic S model since the external forcing is obtained statistically from the long-term historical data, solving an inverse problem. The barotropic S model has been integrated for 51 years under a perpetual January condition and the dominant empirical orthogonal function (EOF) modes in the model have been analyzed. The results are compared with the EOF analysis of the barotropic component of the real atmosphere based on the daily NCEP-NCAR reanalysis for 50 yr from 1950 to 1999. According to the result, the first EOF of the model atmosphere appears to be the AO similar to the observation. The annular structure of the AO and the two centers of action at Pacific and Atlantic are simulated nicely by the barotropic S model. Therefore, the atmospheric low-frequency variabilities have been captured satisfactorily even by the simple barotropic model. The EOF analysis is further conducted to the external forcing of the barotropic S model. The structure of the dominant forcing shows the characteristics of synoptic-scale disturbances of zonal wavenumber 6 along the Pacific storm track. The forcing is induced by the barotropic-baroclinic interactions associated with baroclinic instability. The result suggests that the AO can be understood as the natural variability of the barotropic component of the atmosphere induced by the inherent barotropic dynamics, which is forced by the barotropic-baroclinic interactions. The fluctuating upscale energy cascade from planetary waves and synoptic disturbances to the zonal motion plays the key role for the excitation of the AO.

  18. Mitigating the impact of the DESI fiber assignment on galaxy clustering

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burden, Angela; Padmanabhan, Nikhil; Cahn, Robert N.

    2017-03-01

    We present a simple strategy to mitigate the impact of an incomplete spectroscopic redshift galaxy sample as a result of fiber assignment and survey tiling. The method has been designed for the Dark Energy Spectroscopic Instrument (DESI) galaxy survey but may have applications beyond this. We propose a modification to the usual correlation function that nulls the almost purely angular modes affected by survey incompleteness due to fiber assignment. Predictions of this modified statistic can be calculated given a model of the two point correlation function. The new statistic can be computed with a slight modification to the data catalogues input to the standard correlation function code and does not incur any additional computational time. Finally we show that the spherically averaged baryon acoustic oscillation signal is not biased by the new statistic.

  19. Thermodynamics of ideal quantum gas with fractional statistics in D dimensions.

    PubMed

    Potter, Geoffrey G; Müller, Gerhard; Karbach, Michael

    2007-06-01

    We present exact and explicit results for the thermodynamic properties (isochores, isotherms, isobars, response functions, velocity of sound) of a quantum gas in dimensions D ≥ 1 and with fractional exclusion statistics 0 ≤ g ≤ 1 connecting bosons (g=0) and fermions (g=1). In D=1 the results are equivalent to those of the Calogero-Sutherland model. Emphasis is given to the crossover between bosonlike and fermionlike features, caused by aspects of the statistical interaction that mimic long-range attraction and short-range repulsion. A phase transition along the isobar occurs at a nonzero temperature in all dimensions. The T dependence of the velocity of sound is in simple relation to isochores and isobars. The effects of soft container walls are accounted for rigorously for the case of a pure power-law potential.

  20. Predicting tidal currents in San Francisco Bay using a spectral model

    USGS Publications Warehouse

    Burau, Jon R.; Cheng, Ralph T.

    1988-01-01

    This paper describes the formulation of a spectral (or frequency based) model which solves the linearized shallow water equations. To account for highly variable basin bathymetry, spectral solutions are obtained using the finite element method which allows the strategic placement of the computation points in the specific areas of interest or in areas where the gradients of the dependent variables are expected to be large. Model results are compared with data using simple statistics to judge overall model performance in the San Francisco Bay estuary. Once the model is calibrated and verified, prediction of the tides and tidal currents in San Francisco Bay is accomplished by applying astronomical tides (harmonic constants deduced from field data) at the prediction time along the model boundaries.

  1. An improved empirical dynamic control system model of global mean sea level rise and surface temperature change

    NASA Astrophysics Data System (ADS)

    Wu, Qing; Luu, Quang-Hung; Tkalich, Pavel; Chen, Ge

    2018-04-01

    Having great impacts on human lives, global warming and associated sea level rise are believed to be strongly linked to anthropogenic causes. A statistical approach offers a simple and yet conceptually verifiable combination of remotely connected climate variables and indices, including sea level and surface temperature. We propose an improved statistical reconstruction model based on the empirical dynamic control system by taking into account the climate variability and deriving parameters from Monte Carlo cross-validation random experiments. For the historic data from 1880 to 2001, we obtained higher correlations than those from other dynamic empirical models. The averaged root mean square errors are reduced in both reconstructed fields, namely, the global mean surface temperature (by 24-37%) and the global mean sea level (by 5-25%). Our model is also more robust as it notably diminished the instability problem associated with varying initial values. Such results suggest that the model not only significantly enhances the global mean reconstructions of temperature and sea level but also may have the potential to improve future projections.

  2. Outbreak statistics and scaling laws for externally driven epidemics.

    PubMed

    Singh, Sarabjeet; Myers, Christopher R

    2014-04-01

    Power-law scalings are ubiquitous to physical phenomena undergoing a continuous phase transition. The classic susceptible-infectious-recovered (SIR) model of epidemics is one such example where the scaling behavior near a critical point has been studied extensively. In this system the distribution of outbreak sizes scales as P(n) ∼ n^(-3/2) at the critical point as the system size N becomes infinite. The finite-size scaling laws for the outbreak size and duration are also well understood and characterized. In this work, we report scaling laws for a model with SIR structure coupled with a constant force of infection per susceptible, akin to a "reservoir forcing". We find that the statistics of outbreaks in this system fundamentally differ from those in a simple SIR model. Instead of fixed exponents, all scaling laws exhibit tunable exponents parameterized by the dimensionless rate of external forcing. As the external driving rate approaches a critical value, the scale of the average outbreak size converges to that of the maximal size, and above the critical point, the scaling laws bifurcate into two regimes. Whereas a simple SIR process can only exhibit outbreaks of size O(N^(1/3)) and O(N) depending on whether the system is at or above the epidemic threshold, a driven SIR process can exhibit a richer spectrum of outbreak sizes that scale as O(N^ξ), where ξ∈(0,1]∖{2/3}, and O((N/ln N)^(2/3)) at the multicritical point.

  3. All individuals are not created equal; accounting for interindividual variation in fitting life-history responses to toxicants.

    PubMed

    Jager, Tjalling

    2013-02-05

    The individuals of a species are not equal. These differences frustrate experimental biologists and ecotoxicologists who wish to study the response of a species (in general) to a treatment. In the analysis of data, differences between model predictions and observations on individual animals are usually treated as random measurement error around the true response. These deviations, however, are mainly caused by real differences between the individuals (e.g., differences in physiology and in initial conditions). Understanding these intraspecies differences, and accounting for them in the data analysis, will improve our understanding of the response to the treatment we are investigating and allow for a more powerful, less biased, statistical analysis. Here, I explore a basic scheme for statistical inference to estimate parameters governing stress that allows individuals to differ in their basic physiology. This scheme is illustrated using a simple toxicokinetic-toxicodynamic model and a data set for growth of the springtail Folsomia candida exposed to cadmium in food. This article should be seen as proof of concept; a first step in bringing more realism into the statistical inference for process-based models in ecotoxicology.

  4. Improving Robot Locomotion Through Learning Methods for Expensive Black-Box Systems

    DTIC Science & Technology

    2013-11-01

    development of a class of “gradient free” optimization techniques; these include local approaches, such as a Nelder- Mead simplex search (c.f. [73]), and global...1Note that this simple method differs from the Nelder Mead constrained nonlinear optimization method [73]. 39 the Non-dominated Sorting Genetic Algorithm...Kober, and Jan Peters. Model-free inverse reinforcement learning. In International Conference on Artificial Intelligence and Statistics, 2011. [12] George

  5. The Forbes 400, the Pareto power-law and efficient markets

    NASA Astrophysics Data System (ADS)

    Klass, O. S.; Biham, O.; Levy, M.; Malcai, O.; Solomon, S.

    2007-01-01

    Statistical regularities at the top end of the wealth distribution in the United States are examined using the Forbes 400 lists of richest Americans, published between 1988 and 2003. It is found that the wealths are distributed according to a power-law (Pareto) distribution. This result is explained using a simple stochastic model of multiple investors that incorporates the efficient market hypothesis as well as the multiplicative nature of financial market fluctuations.

  6. Accumulated distribution of material gain at dislocation crystal growth

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rakin, V. I., E-mail: rakin@geo.komisc.ru

    2016-05-15

    A model for slowing down the tangential growth rate of an elementary step at dislocation crystal growth is proposed based on the exponential law of impurity particle distribution over adsorption energy. It is established that the statistical distribution of material gain on structurally equivalent faces obeys the Erlang law. The Erlang distribution is proposed to be used to calculate the occurrence rates of morphological combinatorial types of polyhedra, presenting real simple crystallographic forms.

  7. Neutral Evolution of Duplicated DNA: An Evolutionary Stick-Breaking Process Causes Scale-Invariant Behavior

    NASA Astrophysics Data System (ADS)

    Massip, Florian; Arndt, Peter F.

    2013-04-01

    Recently, an enrichment of identical matching sequences has been found in many eukaryotic genomes. Their length distribution exhibits a power law tail raising the question of what evolutionary mechanism or functional constraints would be able to shape this distribution. Here we introduce a simple and evolutionarily neutral model, which involves only point mutations and segmental duplications, and produces the same statistical features as observed for genomic data. Further, we extend a mathematical model for random stick breaking to analytically show that the exponent of the power law tail is -3 and universal as it does not depend on the microscopic details of the model.

  8. Chaotic Ising-like dynamics in traffic signals

    PubMed Central

    Suzuki, Hideyuki; Imura, Jun-ichi; Aihara, Kazuyuki

    2013-01-01

    The green and red lights of a traffic signal can be viewed as the up and down states of an Ising spin. Moreover, traffic signals in a city interact with each other, if they are controlled in a decentralised way. In this paper, a simple model of such interacting signals on a finite-size two-dimensional lattice is shown to have Ising-like dynamics that undergoes a ferromagnetic phase transition. Probabilistic behaviour of the model is realised by chaotic billiard dynamics that arises from coupled non-chaotic elements. This purely deterministic model is expected to serve as a starting point for considering statistical mechanics of traffic signals. PMID:23350034

  9. Estimation and model selection of semiparametric multivariate survival functions under general censorship.

    PubMed

    Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang

    2010-07-01

    We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided.

  10. Estimation and model selection of semiparametric multivariate survival functions under general censorship

    PubMed Central

    Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang

    2013-01-01

    We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided. PMID:24790286

  11. Goodness-of-fit tests for open capture-recapture models

    USGS Publications Warehouse

    Pollock, K.H.; Hines, J.E.; Nichols, J.D.

    1985-01-01

    General goodness-of-fit tests for the Jolly-Seber model are proposed. These tests are based on conditional arguments using minimal sufficient statistics. The tests are shown to be of simple hypergeometric form so that a series of independent contingency table chi-square tests can be performed. The relationship of these tests to other proposed tests is discussed. This is followed by a simulation study of the power of the tests to detect departures from the assumptions of the Jolly-Seber model. Some meadow vole capture-recapture data are used to illustrate the testing procedure which has been implemented in a computer program available from the authors.

  12. A Bayesian Approach to Evaluating Consistency between Climate Model Output and Observations

    NASA Astrophysics Data System (ADS)

    Braverman, A. J.; Cressie, N.; Teixeira, J.

    2010-12-01

    Like other scientific and engineering problems that involve physical modeling of complex systems, climate models can be evaluated and diagnosed by comparing their output to observations of similar quantities. Though the global remote sensing data record is relatively short by climate research standards, these data offer opportunities to evaluate model predictions in new ways. For example, remote sensing data are spatially and temporally dense enough to provide distributional information that goes beyond simple moments to allow quantification of temporal and spatial dependence structures. In this talk, we propose a new method for exploiting these rich data sets using a Bayesian paradigm. For a collection of climate models, we calculate the posterior probabilities that its members best represent the physical system each seeks to reproduce. The posterior probability is based on the likelihood that a chosen summary statistic, computed from observations, would be obtained when the model's output is considered as a realization from a stochastic process. By exploring how posterior probabilities change with different statistics, we may paint a more quantitative and complete picture of the strengths and weaknesses of the models relative to the observations. We demonstrate our method using model output from the CMIP archive, and observations from NASA's Atmospheric Infrared Sounder.

  13. Survival analysis in hematologic malignancies: recommendations for clinicians

    PubMed Central

    Delgado, Julio; Pereira, Arturo; Villamor, Neus; López-Guillermo, Armando; Rozman, Ciril

    2014-01-01

    The widespread availability of statistical packages has undoubtedly helped hematologists worldwide in the analysis of their data, but has also led to the inappropriate use of statistical methods. In this article, we review some basic concepts of survival analysis and also make recommendations about how and when to perform each particular test using SPSS, Stata and R. In particular, we describe a simple way of defining cut-off points for continuous variables and the appropriate and inappropriate uses of the Kaplan-Meier method and Cox proportional hazard regression models. We also provide practical advice on how to check the proportional hazards assumption and briefly review the role of relative survival and multiple imputation. PMID:25176982

  14. Statistical theory of chromatography: new outlooks for affinity chromatography.

    PubMed Central

    Denizot, F C; Delaage, M A

    1975-01-01

    We have developed further the statistical approach to chromatography initiated by Giddings and Eyring, and applied it to affinity chromatography. By means of a convenient expression of moments the convergence towards the Laplace-Gauss distribution has been established. The Gaussian character is not preserved if other causes of dispersion are taken into account, but expressions of moments can be obtained in a generalized form. A simple procedure is deduced for expressing the fundamental constants of the model in terms of purely experimental quantities. Thus, affinity chromatography can be used to determine rate constants of association and dissociation in a range considered as the domain of the stopped-flow methods. PMID:1061072

  15. Statistical methods for astronomical data with upper limits. I - Univariate distributions

    NASA Technical Reports Server (NTRS)

    Feigelson, E. D.; Nelson, P. I.

    1985-01-01

    The statistical treatment of univariate censored data is discussed. A heuristic derivation of the Kaplan-Meier maximum-likelihood estimator from first principles is presented which results in an expression amenable to analytic error analysis. Methods for comparing two or more censored samples are given along with simple computational examples, stressing the fact that most astronomical problems involve upper limits while the standard mathematical methods require lower limits. The application of univariate survival analysis to six data sets in the recent astrophysical literature is described, and various aspects of the use of survival analysis in astronomy, such as the limitations of various two-sample tests and the role of parametric modelling, are discussed.
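
    A minimal sketch of the Kaplan-Meier (product-limit) estimator follows, together with the usual trick for upper limits: flipping the data about a pivot so that left-censored values become right-censored. The function names and the choice of the sample maximum as pivot are assumptions for illustration and are not taken from the paper.

```python
def kaplan_meier(times, observed):
    """Product-limit estimate of S(t) = P(T > t) for right-censored data.

    observed[i] is True for an exact value and False for a censored one.
    Returns (t, S(t)) pairs at each distinct uncensored value.
    """
    data = sorted(zip(times, observed))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Group ties; censored points tied with an event still count as at risk.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if events:
            surv *= 1.0 - events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def km_upper_limits(values, detected):
    """Estimate P(X < v) when non-detections are upper limits.

    Flipping about the maximum turns upper limits into right-censored
    points, so the standard estimator applies to the flipped values.
    """
    pivot = max(values)
    flipped = [pivot - v for v in values]
    return [(pivot - t, s) for t, s in kaplan_meier(flipped, detected)]
```

    In `km_upper_limits`, each returned pair is a measured value v and the estimated probability that the quantity lies below v.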

  16. Gain degradation and amplitude scintillation due to tropospheric turbulence

    NASA Technical Reports Server (NTRS)

    Theobold, D. M.; Hodge, D. B.

    1978-01-01

    It is shown that a simple physical model is adequate for the prediction of the long term statistics of both the reduced signal levels and increased peak-to-peak fluctuations. The model is based on conventional atmospheric turbulence theory and incorporates both amplitude and angle of arrival fluctuations. This model predicts the average variance of signals observed under clear air conditions at low elevation angles on earth-space paths at 2, 7.3, 20 and 30 GHz. Design curves based on this model for gain degradation, realizable gain, amplitude fluctuation as a function of antenna aperture size, frequency, and either terrestrial path length or earth-space path elevation angle are presented.

  17. [Bayesian statistics in medicine -- part II: main applications and inference].

    PubMed

    Montomoli, C; Nichelatti, M

    2008-01-01

    Bayesian statistics is not only used when one is dealing with 2-way tables, but it can be used for inferential purposes. Using the basic concepts presented in the first part, this paper aims to give a simple overview of Bayesian methods by introducing its foundation (Bayes' theorem) and then applying this rule to a very simple practical example; whenever possible, the elementary processes at the basis of analysis are compared to those of frequentist (classical) statistical analysis. The Bayesian reasoning is naturally connected to medical activity, since it appears to be quite similar to a diagnostic process.

  18. Multi-criterion model ensemble of CMIP5 surface air temperature over China

    NASA Astrophysics Data System (ADS)

    Yang, Tiantian; Tao, Yumeng; Li, Jingjing; Zhu, Qian; Su, Lu; He, Xiaojia; Zhang, Xiaoming

    2018-05-01

    The global circulation models (GCMs) are useful tools for simulating climate change, projecting future temperature changes, and therefore, supporting the preparation of national climate adaptation plans. However, different GCMs are not always in agreement with each other over various regions. The reason is that GCMs' configurations, module characteristics, and dynamic forcings vary from one to another. Model ensemble techniques are extensively used to post-process the outputs from GCMs and improve the variability of model outputs. Root-mean-square error (RMSE), correlation coefficient (CC, or R) and uncertainty are commonly used statistics for evaluating the performances of GCMs. However, many model ensemble techniques cannot guarantee that all of these statistics are satisfactory simultaneously. In this paper, we propose a multi-model ensemble framework, using a state-of-the-art evolutionary multi-objective optimization algorithm (termed MOSPD), to evaluate different characteristics of ensemble candidates and to provide comprehensive trade-off information for different model ensemble solutions. A case study of optimizing the surface air temperature (SAT) ensemble solutions over different geographical regions of China is carried out. The data cover the period from 1900 to 2100, and the projections of SAT are analyzed with regard to three different statistical indices (i.e., RMSE, CC, and uncertainty). Among the derived ensemble solutions, the trade-off information is further analyzed with a robust Pareto front with respect to different statistics. The comparison results over the historical period (1900-2005) show that the optimized solutions are superior to those obtained by a simple model average, as well as to any single GCM output. The improvements in the statistics vary among the different climatic regions of China. Future projection (2006-2100) with the proposed ensemble method identifies that the largest (smallest) temperature changes will happen in South Central China (Inner Mongolia), North Eastern China (South Central China), and North Western China (South Central China), under the RCP 2.6, RCP 4.5, and RCP 8.5 scenarios, respectively.
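
    The trade-off analysis mentioned in this abstract rests on two standard statistics and on the notion of a Pareto front. The sketch below shows those pieces only. It does not implement MOSPD, and the function names are assumptions for illustration.

```python
import math


def rmse(pred, obs):
    """Root-mean-square error between a candidate series and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def pearson_cc(pred, obs):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(obs)
    mean_p, mean_o = sum(pred) / n, sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)


def pareto_front(scores):
    """Indices of non-dominated candidates; every objective is minimised.

    Candidate b dominates a if b is no worse on every objective and
    strictly better on at least one.
    """
    front = []
    for i, a in enumerate(scores):
        dominated = any(
            all(bk <= ak for ak, bk in zip(a, b))
            and any(bk < ak for ak, bk in zip(a, b))
            for j, b in enumerate(scores)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front
```

    To rank ensemble candidates against observations, one could score each candidate as `(rmse(c, obs), 1 - pearson_cc(c, obs))` so that both objectives are minimised, then pass the list of scores to `pareto_front`.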

  19. A Kp-based model of auroral boundaries

    NASA Astrophysics Data System (ADS)

    Carbary, James F.

    2005-10-01

    The auroral oval can serve as both a representation and a prediction of space weather on a global scale, so a competent model of the oval as a function of a geomagnetic index could conveniently appraise space weather itself. A simple model of the auroral boundaries is constructed by binning several months of images from the Polar Ultraviolet Imager by Kp index. The pixel intensities are first averaged into magnetic latitude-magnetic local time (MLT-MLAT) and local time bins, and intensity profiles are then derived for each Kp level at 1 hour intervals of MLT. After background correction, the boundary latitudes of each profile are determined at a threshold of 4 photons cm^-2 s^-1. The peak locations and peak intensities are also found. The boundary and peak locations vary linearly with Kp index, and the coefficients of the linear fits are tabulated for each MLT. As a general rule of thumb, the UV intensity peak shifts 1° in magnetic latitude for each increment in Kp. The fits are surprisingly good for Kp < 6 but begin to deteriorate at high Kp because of auroral boundary irregularities and poor statistics. The statistical model allows calculation of the auroral boundaries at most MLTs as a function of Kp and can serve as an approximation to the shape and extent of the statistical oval.

  20. Statistical Methods for Generalized Linear Models with Covariates Subject to Detection Limits.

    PubMed

    Bernhardt, Paul W; Wang, Huixia J; Zhang, Daowen

    2015-05-01

    Censored observations are a common occurrence in biomedical data sets. Although a large amount of research has been devoted to estimation and inference for data with censored responses, very little research has focused on proper statistical procedures when predictors are censored. In this paper, we consider statistical methods for dealing with multiple predictors subject to detection limits within the context of generalized linear models. We investigate and adapt several conventional methods and develop a new multiple imputation approach for analyzing data sets with predictors censored due to detection limits. We establish the consistency and asymptotic normality of the proposed multiple imputation estimator and suggest a computationally simple and consistent variance estimator. We also demonstrate that the conditional mean imputation method often leads to inconsistent estimates in generalized linear models, while several other methods are either computationally intensive or lead to parameter estimates that are biased or more variable compared to the proposed multiple imputation estimator. In an extensive simulation study, we assess the bias and variability of different approaches within the context of a logistic regression model and compare variance estimation methods for the proposed multiple imputation estimator. Lastly, we apply several methods to analyze the data set from a recently-conducted GenIMS study.

  1. The effect of a graphical interpretation of a statistic trend indicator (Trigg's Tracking Variable) on the detection of simulated changes.

    PubMed

    Kennedy, R R; Merry, A F

    2011-09-01

    Anaesthesia involves processing large amounts of information over time. One task of the anaesthetist is to detect substantive changes in physiological variables promptly and reliably. It has been previously demonstrated that a graphical trend display of historical data leads to more rapid detection of such changes. We examined the effect of a graphical indication of the magnitude of Trigg's Tracking Variable, a simple statistically based trend detection algorithm, on the accuracy and latency of the detection of changes in a micro-simulation. Ten anaesthetists each viewed 20 simulations with four variables displayed as the current value with a simple graphical trend display. Values for these variables were generated by a computer model, and updated every second; after a period of stability a change occurred to a new random value at least 10 units from baseline. In 50% of the simulations an indication of the rate of change was given by a five level graphical representation of the value of Trigg's Tracking Variable. Participants were asked to indicate when they thought a change was occurring. Changes were detected 10.9% faster with the trend indicator present (mean 13.1 [SD 3.1] cycles vs 14.6 [SD 3.4] cycles, 95% confidence interval 0.4 to 2.5 cycles, P = 0.013). There was no difference in accuracy of detection (median with trend detection 97% [interquartile range 95 to 100%], without trend detection 100% [98 to 100%]), P = 0.8. We conclude that simple statistical trend detection may speed detection of changes during routine anaesthesia, even when a graphical trend display is present.
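
    Trigg's Tracking Variable is commonly defined as the ratio of the exponentially smoothed forecast error to the exponentially smoothed absolute error, which keeps it between -1 and 1. The sketch below follows that textbook definition. The smoothing constant `alpha=0.2` and the use of the first sample as the initial forecast are assumptions, because the abstract does not give the study's settings.

```python
def trigg_tracking(values, alpha=0.2):
    """Trigg's tracking signal for a series, one value per sample after the first.

    The forecast is a simple exponential smoother. The signal stays near 0
    while errors are unbiased and moves toward +1 or -1 when the errors
    share one sign, as happens during a sustained change.
    """
    forecast = values[0]   # ASSUMPTION: start the forecast at the first sample
    smoothed_err = 0.0     # exponentially smoothed signed error
    smoothed_abs = 0.0     # exponentially smoothed absolute error
    signals = []
    for x in values[1:]:
        err = x - forecast
        smoothed_err = alpha * err + (1 - alpha) * smoothed_err
        smoothed_abs = alpha * abs(err) + (1 - alpha) * smoothed_abs
        signals.append(smoothed_err / smoothed_abs if smoothed_abs > 0 else 0.0)
        forecast = alpha * x + (1 - alpha) * forecast
    return signals
```

    A display like the five-level indicator in the study could be driven by binning the magnitude of this signal, although the thresholds used there are not stated in the abstract.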

  2. Statistics without Tears: Complex Statistics with Simple Arithmetic

    ERIC Educational Resources Information Center

    Smith, Brian

    2011-01-01

    One of the often overlooked aspects of modern statistics is the analysis of time series data. Modern introductory statistics courses tend to rush to probabilistic applications involving risk and confidence. Rarely does the first level course linger on such useful and fascinating topics as time series decomposition, with its practical applications…

  3. Applied statistics in ecology: common pitfalls and simple solutions

    Treesearch

    E. Ashley Steel; Maureen C. Kennedy; Patrick G. Cunningham; John S. Stanovick

    2013-01-01

    The most common statistical pitfalls in ecological research are those associated with data exploration, the logic of sampling and design, and the interpretation of statistical results. Although one can find published errors in calculations, the majority of statistical pitfalls result from incorrect logic or interpretation despite correct numerical calculations. There...

  4. [Effect of somatostatin-14 in simple mechanical obstruction of the small intestine].

    PubMed

    Jimenez-Garcia, A; Ahmad Araji, O; Balongo Garcia, R; Nogales Munoz, A; Salguero Villadiego, M; Cantillana Martinez, J

    1994-02-01

    In order to investigate the properties of somatostatin-14 we studied an experimental model of simple mechanical and closed loop occlusion. Forty-eight New Zealand rabbits were assigned randomly to three groups of 16: group C (controls) was operated and treated with saline solution (4 cc/Kg/h); group A was operated and initially treated with saline solution and an equal dose of somatostatin-14 (3.5 micrograms/Kg/h); and group B was operated and treated in the same manner as group A, but later, 8 hours after the laparotomy. The animals were sacrificed 24 hours later; intestinal secretion was quantified, blood and intestinal fluid chemistries were performed and specimens of the intestine were prepared for histological examination. Descriptive statistical analysis of the results was performed with the ANOVA, a semi-quantitative test and the covariance test. Somatostatin-14 produced an improvement in the volume of intestinal secretion in the treated groups compared with the control group. The results were statistically significant in group B treated after an 8-hour delay: closed loop (ml): 6.40 +/- 1.12, 2.50 +/- 0.94, 1.85 +/- 0.83 and simple mechanical occlusion (ml): 175 +/- 33.05, 89.50 +/- 9.27, 57.18 +/- 21.23, p < 0.01 for groups C, A and B, respectively. Net secretion of Cl and Na ions was also improved, p < 0.01.(ABSTRACT TRUNCATED AT 250 WORDS)

  5. The Statistics of wood assays for preservative retention

    Treesearch

    Patricia K. Lebow; Scott W. Conklin

    2011-01-01

    This paper covers general statistical concepts that apply to interpreting wood assay retention values. In particular, since wood assays are typically obtained from a single composited sample, the statistical aspects, including advantages and disadvantages, of simple compositing are covered.

  6. On the Use of Ocean Dynamic Temperature for Hurricane Intensity Forecasting

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Balaguru, Karthik; Foltz, Gregory R.; Leung, L. Ruby

    Sea surface temperature (SST) and the Tropical Cyclone Heat Potential (TCHP) are metrics used to incorporate the ocean's influence on hurricane intensification in the National Hurricane Center's Statistical Hurricane Intensity Prediction Scheme (SHIPS). While both SST and TCHP serve as useful measures of the upper-ocean heat content, they do not accurately represent ocean stratification effects. Here we show that replacing SST in the SHIPS framework with a dynamic temperature (Tdy), which accounts for the oceanic negative feedback to the hurricane's intensity arising from storm-induced vertical mixing and sea-surface cooling, improves the model performance. While the model with SST and TCHP explains nearly 41% of the variance in 36-hr intensity changes, replacing SST with Tdy increases the variance explained to nearly 44%. Our results suggest that representation of the oceanic feedback, even through relatively simple formulations such as Tdy, may improve the performance of statistical hurricane intensity prediction models such as SHIPS.

  7. Self-affirmation model for football goal distributions

    NASA Astrophysics Data System (ADS)

    Bittner, E.; Nußbaumer, A.; Janke, W.; Weigel, M.

    2007-06-01

    Analyzing football score data with statistical techniques, we investigate how the highly co-operative nature of the game is reflected in averaged properties such as the distributions of scored goals for the home and away teams. It turns out that in particular the tails of the distributions are not well described by independent Bernoulli trials, but rather well modeled by negative binomial or generalized extreme value distributions. To understand this behavior from first principles, we suggest modifying the Bernoulli random process to include a simple component of self-affirmation, which seems to describe the data surprisingly well and allows the observed deviation from Gaussian statistics to be interpreted. The phenomenological distributions used before can be understood as special cases within this framework. We analyzed historical football score data from many leagues in Europe as well as from international tournaments and found the proposed models to be applicable rather universally. In particular, here we compare men's and women's leagues and the separate German leagues during the cold war times and find some remarkable differences.
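
    The effect of a self-affirmation component on a Bernoulli process can be illustrated by simulation. In the sketch below the scoring probability in each time step grows additively with the goals already scored, p = p0 + kappa * goals. That specific form, the parameter values, and the function names are assumptions for illustration. The abstract only says that a simple self-affirmation component is added.

```python
import random


def simulate_goals(n_matches, steps=90, p0=0.015, kappa=0.0, seed=0):
    """Goals per match for one team under a Bernoulli process with feedback.

    In each of `steps` time steps the team scores with probability
    p0 + kappa * (goals so far). kappa = 0 gives independent trials.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_matches):
        goals = 0
        for _ in range(steps):
            if rng.random() < p0 + kappa * goals:
                goals += 1
        totals.append(goals)
    return totals


def dispersion(counts):
    """Variance-to-mean ratio; values above 1 indicate a heavier tail than Poisson."""
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return var / mean
```

    With `kappa=0` the variance-to-mean ratio stays just below 1, as expected for independent trials. A positive `kappa` pushes it above 1, which is the overdispersion that negative binomial fits capture.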

  8. Moving line model and avalanche statistics of Bingham fluid flow in porous media.

    PubMed

    Chevalier, Thibaud; Talon, Laurent

    2015-07-01

    In this article, we propose a simple model to understand the critical behavior of path opening during flow of a yield stress fluid in porous media as numerically observed by Chevalier and Talon (2015). This model can be mapped to the problem of a contact line moving in a heterogeneous field. Close to the critical point, this line presents an avalanche dynamic where the front advances by a succession of waiting times and large burst events. These burst events are then related to the non-flowing (i.e. unyielded) areas. Remarkably, the statistics of these areas reproduce the same properties as in the direct numerical simulations. Furthermore, even if our exponents seem to be close to the mean field universal exponents, we report an unusual bump in the distribution which depends on the disorder. Finally, we identify a scaling invariance of the cluster spatial shape that is well fit, to first order, by a self-affine parabola.

  9. Indirect Reconstruction of Pore Morphology for Parametric Computational Characterization of Unidirectional Porous Iron.

    PubMed

    Kovačič, Aljaž; Borovinšek, Matej; Vesenjak, Matej; Ren, Zoran

    2018-01-26

    This paper addresses the problem of reconstructing realistic, irregular pore geometries of lotus-type porous iron for computer models that allow for simple porosity and pore size variation in computational characterization of their mechanical properties. The presented methodology uses image-recognition algorithms for the statistical analysis of pore morphology in real material specimens, from which a unique fingerprint of pore morphology at a certain porosity level is derived. The representative morphology parameter is introduced and used for the indirect reconstruction of realistic and statistically representative pore morphologies, which can be used for the generation of computational models with an arbitrary porosity. Such models were subjected to parametric computer simulations to characterize the dependence of engineering elastic modulus on the porosity of lotus-type porous iron. The computational results are in excellent agreement with experimental observations, which confirms the suitability of the presented methodology of indirect pore geometry reconstruction for computational simulations of similar porous materials.

  10. Statistics of gravitational lenses - The uncertainties

    NASA Technical Reports Server (NTRS)

    Mao, Shude

    1991-01-01

    The assumptions in the analysis of gravitational lensing statistics are examined. Special emphasis is given to the uncertainties in the theoretical predictions. It is shown that a simple redshift cutoff model, which may result from galaxy evolution, can significantly reduce the lensing probability and explain the large mean separation of images in observed gravitational lenses. This effect may affect the constraint on the contribution of the cosmological constant to producing a flat universe from the number counts of the observed lenses. For the Omega(0) = 1 (filled beam) model, the lensing probability of early-type galaxies with finite core radii is reduced roughly by a factor of 2 for high-redshift quasars as compared with the corresponding singular isothermal sphere model. The finite core radius effect is about 20 percent for a lambda-dominated flat universe. It is also shown that the most recent galaxy luminosity function gives lensing probabilities that are smaller than previously estimated roughly by a factor of 3.

  11. Predicting survival of Escherichia coli O157:H7 in dry fermented sausage using artificial neural networks.

    PubMed

    Palanichamy, A; Jayas, D S; Holley, R A

    2008-01-01

    The Canadian Food Inspection Agency required the meat industry to ensure Escherichia coli O157:H7 does not survive (experiences ≥ 5 log CFU/g reduction) in dry fermented sausage (salami) during processing after a series of foodborne illness outbreaks resulting from this pathogenic bacterium occurred. The industry is in need of an effective technique like predictive modeling for estimating bacterial viability, because traditional microbiological enumeration is a time-consuming and laborious method. The accuracy and speed of artificial neural networks (ANNs) for this purpose is an attractive alternative (developed from predictive microbiology), especially for on-line processing in industry. Data from a study of interactive effects of different levels of pH, water activity, and the concentrations of allyl isothiocyanate at various times during sausage manufacture in reducing numbers of E. coli O157:H7 were collected. Data were used to develop predictive models using a general regression neural network (GRNN), a form of ANN, and a statistical linear polynomial regression technique. Both models were compared for their predictive error, using various statistical indices. GRNN predictions for training and test data sets had less serious errors when compared with the statistical model predictions. GRNN models were better and slightly better for training and test sets, respectively, than was the statistical model. Also, GRNN accurately predicted the level of allyl isothiocyanate required, ensuring a 5-log reduction, when an appropriate production set was created by interpolation. Because they are simple to generate, fast, and accurate, ANN models may be of value for industrial use in dry fermented sausage manufacture to reduce the hazard associated with E. coli O157:H7 in fresh beef and permit production of consistently safe products from this raw material.

  12. Assessing the fit of site-occupancy models

    USGS Publications Warehouse

    MacKenzie, D.I.; Bailey, L.L.

    2004-01-01

    Few species are likely to be so evident that they will always be detected at a site when present. Recently a model has been developed that enables estimation of the proportion of area occupied, when the target species is not detected with certainty. Here we apply this modeling approach to data collected on terrestrial salamanders in the Plethodon glutinosus complex in the Great Smoky Mountains National Park, USA, and wish to address the question 'how accurately does the fitted model represent the data?' The goodness-of-fit of the model needs to be assessed in order to make accurate inferences. This article presents a method where a simple Pearson chi-square statistic is calculated and a parametric bootstrap procedure is used to determine whether the observed statistic is unusually large. We found evidence that the most global model considered provides a poor fit to the data, hence estimated an overdispersion factor to adjust model selection procedures and inflate standard errors. Two hypothetical datasets with known assumption violations are also analyzed, illustrating that the method may be used to guide researchers to making appropriate inferences. The results of a simulation study are presented to provide a broader view of the method's properties.
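
    The combination of a Pearson chi-square statistic with a parametric bootstrap can be sketched on a deliberately simplified stand-in model. Here each site's detection count over a fixed number of visits is treated as binomial with one shared detection probability, and occupancy is ignored. This is an assumption made to keep the example short and is not the site-occupancy model of the article. The overdispersion factor is taken as the observed statistic divided by the mean bootstrap statistic.

```python
import math
import random


def pearson_chi2(counts, n_visits):
    """Pearson chi-square of per-site detection counts against a fitted binomial."""
    n_sites = len(counts)
    p_hat = sum(counts) / (n_sites * n_visits)  # maximum-likelihood estimate
    stat = 0.0
    for k in range(n_visits + 1):
        expected = (n_sites * math.comb(n_visits, k)
                    * p_hat ** k * (1 - p_hat) ** (n_visits - k))
        observed = sum(1 for c in counts if c == k)
        if expected > 0:  # skip cells the fitted model rules out
            stat += (observed - expected) ** 2 / expected
    return stat, p_hat


def bootstrap_gof(counts, n_visits, n_boot=200, seed=1):
    """Return (observed statistic, bootstrap p-value, overdispersion estimate)."""
    rng = random.Random(seed)
    observed_stat, p_hat = pearson_chi2(counts, n_visits)
    boot_stats = []
    for _ in range(n_boot):
        # Simulate a data set from the fitted model, then refit and rescore it.
        sim = [sum(rng.random() < p_hat for _ in range(n_visits)) for _ in counts]
        boot_stats.append(pearson_chi2(sim, n_visits)[0])
    p_value = sum(s >= observed_stat for s in boot_stats) / n_boot
    c_hat = observed_stat / (sum(boot_stats) / n_boot)
    return observed_stat, p_value, c_hat
```

    A small p-value means the observed statistic is unusually large relative to data generated by the fitted model. An overdispersion estimate well above 1 is the kind of factor the abstract describes using to inflate standard errors.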

  13. Predicting the aquatic toxicity mode of action using logistic regression and linear discriminant analysis.

    PubMed

    Ren, Y Y; Zhou, L C; Yang, L; Liu, P Y; Zhao, B W; Liu, H X

    2016-09-01

    The paper highlights the use of the logistic regression (LR) method in the construction of acceptable, statistically significant, robust and predictive models for the classification of chemicals according to their aquatic toxic modes of action. Essentials accounting for a reliable model were all considered carefully. The model predictors were selected by stepwise forward discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics were assessed using an influence plot, which shows the hat-values, studentized residuals, and Cook's distance statistics of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of compounds are also discussed.

  14. State-space models’ dirty little secrets: even simple linear Gaussian models can have estimation problems

    NASA Astrophysics Data System (ADS)

    Auger-Méthé, Marie; Field, Chris; Albertsen, Christoffer M.; Derocher, Andrew E.; Lewis, Mark A.; Jonsen, Ian D.; Mills Flemming, Joanna

    2016-05-01

    State-space models (SSMs) are increasingly used in ecology to model time-series such as animal movement paths and population dynamics. This type of hierarchical model is often structured to account for two levels of variability: biological stochasticity and measurement error. SSMs are flexible. They can model linear and nonlinear processes using a variety of statistical distributions. Recent ecological SSMs are often complex, with a large number of parameters to estimate. Through a simulation study, we show that even simple linear Gaussian SSMs can suffer from parameter- and state-estimation problems. We demonstrate that these problems occur primarily when measurement error is larger than biological stochasticity, the condition that often drives ecologists to use SSMs. Using an animal movement example, we show how these estimation problems can affect ecological inference. Biased parameter estimates of a SSM describing the movement of polar bears (Ursus maritimus) result in overestimating their energy expenditure. We suggest potential solutions, but show that it often remains difficult to estimate parameters. While SSMs are powerful tools, they can give misleading results and we urge ecologists to assess whether the parameters can be estimated accurately before drawing ecological conclusions from their results.
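
    A linear Gaussian SSM of the kind discussed here can be fitted by maximising the likelihood computed with a Kalman filter. The sketch below uses the local-level model (a random-walk state observed with noise) as an assumed example. The abstract does not specify the model form, and the function name and the diffuse initial variance are likewise assumptions.

```python
import math


def kalman_local_level(ys, q, r, m0=0.0, p0=1e6):
    """Kalman filter for x_t = x_{t-1} + N(0, q), y_t = x_t + N(0, r).

    q is the process variance (biological stochasticity) and r the
    measurement-error variance. Returns the filtered state means and the
    log-likelihood of the observations.
    """
    m, p = m0, p0              # state mean and variance before the first step
    means = []
    loglik = 0.0
    for y in ys:
        p_pred = p + q         # predict: the random walk adds process variance
        s = p_pred + r         # variance of the one-step-ahead prediction error
        innov = y - m
        loglik += -0.5 * (math.log(2 * math.pi * s) + innov * innov / s)
        gain = p_pred / s      # Kalman gain
        m = m + gain * innov   # update the state mean
        p = (1 - gain) * p_pred
        means.append(m)
    return means, loglik
```

    Evaluating the returned log-likelihood over a grid of `q` and `r` values is one way to see whether the two variances can be told apart for a given series, which is the estimation problem the abstract warns about.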

  15. THE DISTRIBUTION OF COOK’S D STATISTIC

    PubMed Central

    Muller, Keith E.; Mok, Mario Chen

    2013-01-01

    Cook (1977) proposed a diagnostic to quantify the impact of deleting an observation on the estimated regression coefficients of a General Linear Univariate Model (GLUM). Simulations of models with Gaussian response and predictors demonstrate that his suggestion of comparing the diagnostic to the median of the F for overall regression captures an erratically varying proportion of the values. We describe the exact distribution of Cook’s statistic for a GLUM with Gaussian predictors and response. We also present computational forms, simple approximations, and asymptotic results. A simulation supports the accuracy of the results. The methods allow accurate evaluation of a single value or the maximum value from a regression analysis. The approximations work well for a single value, but less well for the maximum. In contrast, the cut-point suggested by Cook provides widely varying tail probabilities. As with all diagnostics, the data analyst must use scientific judgment in deciding how to treat highlighted observations. PMID:24363487
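
    For reference, Cook's statistic for observation i in a linear model with p coefficients is D_i = e_i^2 / (p s^2) * h_ii / (1 - h_ii)^2, where e_i is the residual, s^2 the residual mean square and h_ii the leverage. The sketch below computes it for simple linear regression (p = 2), which is an assumption made to keep the example free of matrix code. The exact distribution derived in the paper is not reproduced.

```python
def cooks_distance(xs, ys):
    """Cook's D for each observation in a simple linear regression of ys on xs."""
    n = len(xs)
    p = 2  # intercept and slope
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual mean square
    if s2 == 0:
        return [0.0] * n  # perfect fit: deleting any point changes nothing
    leverage = [1 / n + (x - x_bar) ** 2 / sxx for x in xs]
    return [e * e / (p * s2) * h / (1 - h) ** 2 for e, h in zip(resid, leverage)]
```

    The largest value flags the observation whose deletion would move the fitted coefficients most. How large is too large is the question the paper addresses.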

  16. Optimizing Integrated Terminal Airspace Operations Under Uncertainty

    NASA Technical Reports Server (NTRS)

    Bosson, Christabelle; Xue, Min; Zelinski, Shannon

    2014-01-01

    In the terminal airspace, integrated departures and arrivals have the potential to increase operations efficiency. Recent research has developed genetic-algorithm-based schedulers for integrated arrival and departure operations under uncertainty. This paper presents an alternate method using a machine job-shop scheduling formulation to model the integrated airspace operations. A multistage stochastic programming approach is chosen to formulate the problem and candidate solutions are obtained by solving sample average approximation problems with finite sample size. Because approximate solutions are computed, the proposed algorithm incorporates the computation of statistical bounds to estimate the optimality of the candidate solutions. A proof-of-concept study is conducted on a baseline implementation of a simple problem considering a fleet mix of 14 aircraft evolving in a model of the Los Angeles terminal airspace. A more thorough statistical analysis is also performed to evaluate the impact of the number of scenarios considered in the sampled problem. To handle extensive sampling computations, a multithreading technique is introduced.

  17. Stationarity: Wanted dead or alive?

    USGS Publications Warehouse

    Lins, H.F.; Cohn, T.A.

    2011-01-01

    Aligning engineering practice with natural process behavior would appear, on its face, to be a prudent and reasonable course of action. However, if we do not understand the long-term characteristics of hydroclimatic processes, how does one find the prudent and reasonable course needed for water management? We consider this question in light of three aspects of existing and unresolved issues affecting hydroclimatic variability and statistical inference: Hurst-Kolmogorov phenomena; the complications long-term persistence introduces with respect to statistical understanding; and the dependence of process understanding on arbitrary sampling choices. These problems are not easily addressed. In such circumstances, humility may be more important than physics; a simple model with well-understood flaws may be preferable to a sophisticated model whose correspondence to reality is uncertain. © 2011 American Water Resources Association. This article is a U.S. Government work and is in the public domain in the USA.

  18. Statistical mechanics of an ideal active fluid confined in a channel

    NASA Astrophysics Data System (ADS)

    Wagner, Caleb; Baskaran, Aparna; Hagan, Michael

    The statistical mechanics of ideal active Brownian particles (ABPs) confined in a channel is studied by obtaining the exact solution of the steady-state Smoluchowski equation for the 1-particle distribution function. The solution is derived using results from the theory of two-way diffusion equations, combined with an iterative procedure that is justified by numerical results. Using this solution, we quantify the effects of confinement on the spatial and orientational order of the ensemble. Moreover, we rigorously show that both the bulk density and the fraction of particles on the channel walls obey simple scaling relations as a function of channel width. By considering a constant-flux steady state, an effective diffusivity for ABPs is derived which shows signatures of the persistent motion that characterizes ABP trajectories. Finally, we discuss how our techniques generalize to other active models, including systems whose activity is modeled in terms of an Ornstein-Uhlenbeck process.

  19. Modeling the Development of Audiovisual Cue Integration in Speech Perception

    PubMed Central

    Getz, Laura M.; Nordeen, Elke R.; Vrabic, Sarah C.; Toscano, Joseph C.

    2017-01-01

    Adult speech perception is generally enhanced when information is provided from multiple modalities. In contrast, infants do not appear to benefit from combining auditory and visual speech information early in development. This is true despite the fact that both modalities are important to speech comprehension even at early stages of language acquisition. How then do listeners learn how to process auditory and visual information as part of a unified signal? In the auditory domain, statistical learning processes provide an excellent mechanism for acquiring phonological categories. Is this also true for the more complex problem of acquiring audiovisual correspondences, which require the learner to integrate information from multiple modalities? In this paper, we present simulations using Gaussian mixture models (GMMs) that learn cue weights and combine cues on the basis of their distributional statistics. First, we simulate the developmental process of acquiring phonological categories from auditory and visual cues, asking whether simple statistical learning approaches are sufficient for learning multi-modal representations. Second, we use this time course information to explain audiovisual speech perception in adult perceivers, including cases where auditory and visual input are mismatched. Overall, we find that domain-general statistical learning techniques allow us to model the developmental trajectory of audiovisual cue integration in speech, and in turn, allow us to better understand the mechanisms that give rise to unified percepts based on multiple cues. PMID:28335558

  20. Modeling the Development of Audiovisual Cue Integration in Speech Perception.

    PubMed

    Getz, Laura M; Nordeen, Elke R; Vrabic, Sarah C; Toscano, Joseph C

    2017-03-21

    Adult speech perception is generally enhanced when information is provided from multiple modalities. In contrast, infants do not appear to benefit from combining auditory and visual speech information early in development. This is true despite the fact that both modalities are important to speech comprehension even at early stages of language acquisition. How then do listeners learn how to process auditory and visual information as part of a unified signal? In the auditory domain, statistical learning processes provide an excellent mechanism for acquiring phonological categories. Is this also true for the more complex problem of acquiring audiovisual correspondences, which require the learner to integrate information from multiple modalities? In this paper, we present simulations using Gaussian mixture models (GMMs) that learn cue weights and combine cues on the basis of their distributional statistics. First, we simulate the developmental process of acquiring phonological categories from auditory and visual cues, asking whether simple statistical learning approaches are sufficient for learning multi-modal representations. Second, we use this time course information to explain audiovisual speech perception in adult perceivers, including cases where auditory and visual input are mismatched. Overall, we find that domain-general statistical learning techniques allow us to model the developmental trajectory of audiovisual cue integration in speech, and in turn, allow us to better understand the mechanisms that give rise to unified percepts based on multiple cues.

  1. The impacts of recent smoking control policies on individual smoking choice: the case of Japan

    PubMed Central

    2013-01-01

    This article comprehensively examines the impact of recent smoking control policies in Japan, increases in cigarette taxes and the enforcement of the Health Promotion Law, on individual smoking choice by using multi-year and nationwide individual survey data to overcome the analytical problems of previous Japanese studies. In the econometric analyses, I specify a simple binary choice model based on a random utility model to examine the effects of smoking control policies on individual smoking choice by employing the instrumental variable probit model to control for the endogeneity of cigarette prices. The empirical results show that an increase in cigarette prices statistically significantly reduces the smoking probability of males by 1.0 percent and that of females by 1.4 to 2.0 percent. The enforcement of the Health Promotion Law has a statistically significant effect on reducing the smoking probability of males by 15.2 percent and of females by 11.9 percent. Furthermore, an increase in cigarette prices has a statistically significant negative effect on the smoking probability of office workers, non-workers, male manual workers, and female unemployed people, and the enforcement of the Health Promotion Law has a statistically significant effect on decreasing the smoking probabilities of office workers, female manual workers, and male non-workers. JEL classification: C25, C26, I18. PMID:23497490

  2. A robust clustering algorithm for identifying problematic samples in genome-wide association studies.

    PubMed

    Bellenguez, Céline; Strange, Amy; Freeman, Colin; Donnelly, Peter; Spencer, Chris C A

    2012-01-01

    High-throughput genotyping arrays provide an efficient way to survey single nucleotide polymorphisms (SNPs) across the genome in large numbers of individuals. Downstream analysis of the data, for example in genome-wide association studies (GWAS), often involves statistical models of genotype frequencies across individuals. The complexities of the sample collection process and the potential for errors in the experimental assay can lead to biases and artefacts in an individual's inferred genotypes. Rather than attempting to model these complications, it has become a standard practice to remove individuals whose genome-wide data differ from the sample at large. Here we describe a simple, but robust, statistical algorithm to identify samples with atypical summaries of genome-wide variation. Its use as a semi-automated quality control tool is demonstrated using several summary statistics, selected to identify different potential problems, and it is applied to two different genotyping platforms and sample collections. The algorithm is written in R and is freely available at www.well.ox.ac.uk/chris-spencer. Contact: chris.spencer@well.ox.ac.uk. Supplementary data are available at Bioinformatics online.

  3. A statistical and experimental approach for assessing the preservation of plant lipids in soil

    NASA Astrophysics Data System (ADS)

    Mueller, K. E.; Eissenstat, D. M.; Oleksyn, J.; Freeman, K. H.

    2011-12-01

    Plant-derived lipids contribute to stable soil organic matter, but further interpretations of their abundance in soils are limited because the factors that control lipid preservation are poorly understood. Using data from a long-term field experiment and simple statistical models, we provide novel constraints on several predictors of the concentration of hydrolyzable lipids in forest mineral soils. Focal lipids included common monomers of cutin, suberin, and plant waxes present in tree leaves and roots. Soil lipid concentrations were most strongly influenced by the concentrations of lipids in leaves and roots of the overlying trees, but were also affected by the type of lipid (e.g. alcohols vs. acids), lipid chain length, and whether lipids originated in leaves or roots. Collectively, these factors explained ~80% of the variation in soil lipid concentrations beneath 11 different tree species. In order to use soil lipid analyses to test and improve conceptual models of soil organic matter stabilization, additional studies that provide experimental and quantitative (i.e. statistical) constraints on plant lipid preservation are needed.

  4. Statistical Issues for Uncontrolled Reentry Hazards

    NASA Technical Reports Server (NTRS)

    Matney, Mark

    2008-01-01

    A number of statistical tools have been developed over the years for assessing the risk of reentering objects to human populations. These tools make use of the characteristics (e.g., mass, shape, size) of debris that are predicted by aerothermal models to survive reentry. The statistical tools use this information to compute the probability that one or more of the surviving debris might hit a person on the ground and cause one or more casualties. The statistical portion of the analysis relies on a number of assumptions about how the debris footprint and the human population are distributed in latitude and longitude, and how to use that information to arrive at realistic risk numbers. This inevitably involves assumptions that simplify the problem and make it tractable, but it is often difficult to test the accuracy and applicability of these assumptions. This paper looks at a number of these theoretical assumptions, examining the mathematical basis for the hazard calculations, and outlining the conditions under which the simplifying assumptions hold. In addition, this paper will also outline some new tools for assessing ground hazard risk in useful ways. Also, this study is able to make use of a database of known uncontrolled reentry locations measured by the United States Department of Defense. By using data from objects that were in orbit more than 30 days before reentry, sufficient time is allowed for the orbital parameters to be randomized in the way the models are designed to compute. The predicted ground footprint distributions of these objects are based on the theory that their orbits behave basically like simple Kepler orbits. However, there are a number of factors - including the effects of gravitational harmonics, the effects of the Earth's equatorial bulge on the atmosphere, and the rotation of the Earth and atmosphere - that could cause them to diverge from simple Kepler orbit behavior and change the ground footprints. 
    The measured latitude and longitude distributions of these objects provide data that can be directly compared with the predicted distributions, providing a fundamental empirical test of the model assumptions.

  5. An Application of Epidemiological Modeling to Information Diffusion

    NASA Astrophysics Data System (ADS)

    McCormack, Robert; Salter, William

    Messages often spread within a population through unofficial - particularly web-based - media. Such ideas have been termed "memes." To impede the flow of terrorist messages and to promote counter messages within a population, intelligence analysts must understand how messages spread. We used statistical language processing technologies to operationalize "memes" as latent topics in electronic text and applied epidemiological techniques to describe and analyze patterns of message propagation. We developed our methods and applied them to English-language newspapers and blogs in the Arab world. We found that a relatively simple epidemiological model can reproduce some dynamics of observed empirical relationships.

  6. Interesting examples of supervised continuous variable systems

    NASA Technical Reports Server (NTRS)

    Chase, Christopher; Serrano, Joe; Ramadge, Peter

    1990-01-01

    The authors analyze two simple deterministic flow models for multiple buffer servers which are examples of the supervision of continuous variable systems by a discrete controller. These systems exhibit what may be regarded as the two extremes of complexity of the closed loop behavior: one is eventually periodic, the other is chaotic. The first example exhibits chaotic behavior that could be characterized statistically. The dual system, the switched server system, exhibits very predictable behavior, which is modeled by a finite state automaton. This research has application to multimodal discrete time systems where the controller can choose from a set of transition maps to implement.

  7. On wildfire complexity, simple models and environmental templates for fire size distributions

    NASA Astrophysics Data System (ADS)

    Boer, M. M.; Bradstock, R.; Gill, M.; Sadler, R.

    2012-12-01

    Vegetation fires affect some 370 Mha annually. At global and continental scales, fire activity follows predictable spatiotemporal patterns driven by gradients and seasonal fluctuations of primary productivity and evaporative demand that set constraints for fuel accumulation rates and fuel dryness, two key ingredients of fire. At regional scales, fires are also known to affect some landscapes more than others and within landscapes to occur preferentially in some sectors (e.g. wind-swept ridges) and rarely in others (e.g. wet gullies). Another common observation is that small fires occur relatively frequently yet collectively burn far less country than relatively infrequent large fires. These patterns of fire activity are well known to management agencies and consistent with their (informal) models of how the basic drivers and constraints of fire (i.e. fuels, ignitions, weather) vary in time and space across the landscape. The statistical behaviour of these landscape fire patterns has excited the (academic) research community by showing some consistency with that of complex dynamical systems poised at a phase transition. The common finding that the frequency-size distributions of actual fires follow power laws that resemble those produced by simple cellular models from statistical mechanics has been interpreted as evidence that flammable landscapes operate as self-organising systems with scale invariant fire size distributions emerging 'spontaneously' from simple rules of contagious fire spread and a strong feedback between fires and fuel patterns. In this paper we argue that the resemblance of simulated and actual fire size distributions is an example of equifinality, that is, fires in model landscapes and actual landscapes may show similar statistical behaviour but this is reached by qualitatively different pathways or controlling mechanisms. We support this claim with two key findings regarding simulated fire spread mechanisms and fire-fuel feedbacks.
    Firstly, we demonstrate that the power law behaviour of fire size distributions in the widely used Drossel and Schwabl (1992) Forest Fire Model (FFM) is strictly conditional on simulating fire spread as a cell-to-cell contagion over a fixed distance; the invariant scaling of fire sizes breaks down under the slightest variation in that distance, suggesting that pattern formation in the FFM is irreconcilable with the reality of disparate rates and modes of fire spread observed in the field. Secondly, we review field evidence showing that fuel age effects on the probability of fire spread, a key assumption in simulation models like the FFM, do not generally apply across flammable environments. Finally, we explore alternative explanations for the formation of scale invariant fire sizes in real landscapes. Using observations from southern Australian forest regions we demonstrate that the spatiotemporal patterns of fuel dryness and magnitudes of fire driving weather events set strong environmental templates for regional fire size distributions.

  8. Admixture, Population Structure, and F-Statistics.

    PubMed

    Peter, Benjamin M

    2016-04-01

    Many questions about human genetic history can be addressed by examining the patterns of shared genetic variation between sets of populations. A useful methodological framework for this purpose is F-statistics that measure shared genetic drift between sets of two, three, and four populations and can be used to test simple and complex hypotheses about admixture between populations. This article provides context from phylogenetic and population genetic theory. I review how F-statistics can be interpreted as branch lengths or paths and derive new interpretations, using coalescent theory. I further show that the admixture tests can be interpreted as testing general properties of phylogenies, allowing extension of some ideas applications to arbitrary phylogenetic trees. The new results are used to investigate the behavior of the statistics under different models of population structure and show how population substructure complicates inference. The results lead to simplified estimators in many cases, and I recommend replacing F3 with the average number of pairwise differences for estimating population divergence. Copyright © 2016 by the Genetics Society of America.
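    To make the two-, three-, and four-population statistics in the abstract above concrete, the following is a minimal sketch using the commonly quoted allele-frequency definitions of F2, F3, and F4. The abstract itself gives no formulas, so these definitions, the function names, and the toy frequencies are assumptions for illustration; the estimators are naive plug-in averages with no finite-sample bias correction.

```python
from statistics import fmean

# Each argument is a list of allele frequencies, one entry per locus.
# These are naive plug-in estimates (no finite-sample correction).

def f2(a, b):
    """F2(A, B): mean squared allele-frequency difference."""
    return fmean((x - y) ** 2 for x, y in zip(a, b))

def f3(c, a, b):
    """F3(C; A, B): mean of (c - a) * (c - b); negative values suggest C is admixed."""
    return fmean((z - x) * (z - y) for z, x, y in zip(c, a, b))

def f4(a, b, c, d):
    """F4(A, B; C, D): mean of (a - b) * (c - d)."""
    return fmean((w - x) * (y - z) for w, x, y, z in zip(a, b, c, d))

# Toy example (made-up frequencies): C is an exact 50/50 mixture of A and B.
pop_a = [0.1, 0.5, 0.9]
pop_b = [0.3, 0.2, 0.6]
pop_c = [(x + y) / 2 for x, y in zip(pop_a, pop_b)]
admixture_signal = f3(pop_c, pop_a, pop_b)
```

    In this toy case `admixture_signal` is negative, because for an exact mixture every per-locus term (c - a)(c - b) is non-positive. The sketch also satisfies the algebraic identity F3(C; A, B) = [F2(C, A) + F2(C, B) - F2(A, B)] / 2.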

  9. Structure-guided statistical textural distinctiveness for salient region detection in natural images.

    PubMed

    Scharfenberger, Christian; Wong, Alexander; Clausi, David A

    2015-01-01

    We propose a simple yet effective structure-guided statistical textural distinctiveness approach to salient region detection. Our method uses a multilayer approach to analyze the structural and textural characteristics of natural images as important features for salient region detection from a scale point of view. To represent the structural characteristics, we abstract the image using structured image elements and extract rotational-invariant neighborhood-based textural representations to characterize each element by an individual texture pattern. We then learn a set of representative texture atoms for sparse texture modeling and construct a statistical textural distinctiveness matrix to determine the distinctiveness between all representative texture atom pairs in each layer. Finally, we determine saliency maps for each layer based on the occurrence probability of the texture atoms and their respective statistical textural distinctiveness and fuse them to compute a final saliency map. Experimental results using four public data sets and a variety of performance evaluation metrics show that our approach provides promising results when compared with existing salient region detection approaches.

  10. A simple stochastic rainstorm generator for simulating spatially and temporally varying rainfall

    NASA Astrophysics Data System (ADS)

    Singer, M. B.; Michaelides, K.; Nichols, M.; Nearing, M. A.

    2016-12-01

    In semi-arid to arid drainage basins, rainstorms often control both water supply and flood risk to marginal communities of people. They also govern the availability of water to vegetation and other ecological communities, as well as spatial patterns of sediment, nutrient, and contaminant transport and deposition on local to basin scales. All of these landscape responses are sensitive to changes in climate that are projected to occur throughout western North America. Thus, it is important to improve characterization of rainstorms in a manner that enables statistical assessment of rainfall at spatial scales below that of existing gauging networks and the prediction of plausible manifestations of climate change. Here we present a simple, stochastic rainstorm generator that was created using data from a rich and dense network of rain gauges at the Walnut Gulch Experimental Watershed (WGEW) in SE Arizona, but which is applicable anywhere. We describe our methods for assembling pdfs of relevant rainstorm characteristics including total annual rainfall, storm area, storm center location, and storm duration. We also generate five fitted intensity-duration curves and apply a spatial rainfall gradient to generate precipitation at spatial scales below gauge spacing. The model then runs by Monte Carlo simulation in which a total annual rainfall is selected before we generate rainstorms until the annual precipitation total is reached. The procedure continues for decadal simulations. Thus, we keep track of the hydrologic impact of individual storms and the integral of precipitation over multiple decades. We first test the model using ensemble predictions until we reach statistical similarity to the input data from WGEW. 
    We then employ the model to assess decadal precipitation under simulations of climate change in which we separately vary the distribution of total annual rainfall (trend in moisture) and the intensity-duration curves used for simulation (trends in storminess). We demonstrate the model output through spatial maps of rainfall and through statistical comparisons of relevant parameters and distributions. Finally, we discuss how the model can be used to understand basin-scale hydrology in terms of soil moisture, runoff, and erosion.
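    The Monte Carlo loop described above (select a total annual rainfall, then generate storms until that total is reached, and repeat for a decade) can be sketched roughly as follows. This is not the authors' generator: the Gaussian annual total, the exponential storm depths, and every parameter value are placeholder assumptions, and the spatial components (storm area, storm centre, rainfall gradient) are omitted entirely.

```python
import random

def simulate_year(rng, mean_annual_mm=300.0, sd_annual_mm=80.0, mean_storm_mm=8.0):
    """Draw a target annual total, then add storms until the target is reached.

    All distributions and parameters here are hypothetical stand-ins for the
    fitted pdfs described in the abstract.
    """
    target = max(0.0, rng.gauss(mean_annual_mm, sd_annual_mm))
    storms = []
    total = 0.0
    while total < target:
        depth = rng.expovariate(1.0 / mean_storm_mm)  # placeholder storm depth (mm)
        storms.append(depth)
        total += depth
    return {"target_mm": target, "total_mm": total, "storms_mm": storms}

def simulate_decade(seed=0, years=10):
    """Run the yearly procedure repeatedly, keeping every individual storm."""
    rng = random.Random(seed)
    return [simulate_year(rng) for _ in range(years)]

decade = simulate_decade(seed=1)
decadal_total_mm = sum(year["total_mm"] for year in decade)
```

    Keeping the per-storm depths alongside the yearly totals mirrors the abstract's point that both individual storms and the multi-decade integral of precipitation are tracked.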

  11. Complex Sequencing Rules of Birdsong Can be Explained by Simple Hidden Markov Processes

    PubMed Central

    Katahira, Kentaro; Suzuki, Kenta; Okanoya, Kazuo; Okada, Masato

    2011-01-01

    Complex sequencing rules observed in birdsongs provide an opportunity to investigate the neural mechanism for generating complex sequential behaviors. To relate the findings from studying birdsongs to other sequential behaviors such as human speech and musical performance, it is crucial to characterize the statistical properties of the sequencing rules in birdsongs. However, the properties of the sequencing rules in birdsongs have not yet been fully addressed. In this study, we investigate the statistical properties of the complex birdsong of the Bengalese finch (Lonchura striata var. domestica). Based on manually annotated syllable labels, we first show that there are significant higher-order context dependencies in Bengalese finch songs, that is, which syllable appears next depends on more than one previous syllable. We then analyze acoustic features of the song and show that higher-order context dependencies can be explained using first-order hidden state transition dynamics with redundant hidden states. This model corresponds to hidden Markov models (HMMs), well known statistical models with a large range of application for time series modeling. The song annotation with these models with first-order hidden state dynamics agreed well with manual annotation, the score was comparable to that of a second-order HMM, and surpassed the zeroth-order model (the Gaussian mixture model; GMM), which does not use context information. Our results imply that the hierarchical representation with hidden state dynamics may underlie the neural implementation for generating complex behavioral sequences with higher-order dependencies. PMID:21915345

  12. Are running speeds maximized with simple-spring stance mechanics?

    PubMed

    Clark, Kenneth P; Weyand, Peter G

    2014-09-15

    Are the fastest running speeds achieved using the simple-spring stance mechanics predicted by the classic spring-mass model? We hypothesized that a passive, linear-spring model would not account for the running mechanics that maximize ground force application and speed. We tested this hypothesis by comparing patterns of ground force application across athletic specialization (competitive sprinters vs. athlete nonsprinters, n = 7 each) and running speed (top speeds vs. slower ones). Vertical ground reaction forces at 5.0 and 7.0 m/s, and individual top speeds (n = 797 total footfalls) were acquired while subjects ran on a custom, high-speed force treadmill. The goodness of fit between measured vertical force vs. time waveform patterns and the patterns predicted by the spring-mass model was assessed using the R² statistic (where an R² of 1.00 = perfect fit). As hypothesized, the force application patterns of the competitive sprinters deviated significantly more from the simple-spring pattern than those of the athlete nonsprinters across the three test speeds (R² < 0.85 vs. R² ≥ 0.91, respectively), and deviated most at top speed (R² = 0.78 ± 0.02). Sprinters attained faster top speeds than nonsprinters (10.4 ± 0.3 vs. 8.7 ± 0.3 m/s) by applying greater vertical forces during the first half (2.65 ± 0.05 vs. 2.21 ± 0.05 body wt), but not the second half (1.71 ± 0.04 vs. 1.73 ± 0.04 body wt) of the stance phase. We conclude that a passive, simple-spring model has limited application to sprint running performance because the swiftest runners use an asymmetrical pattern of force application to maximize ground reaction forces and attain faster speeds. Copyright © 2014 the American Physiological Society.
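    The goodness-of-fit comparison described above can be illustrated with the standard coefficient of determination, R² = 1 - SS_res / SS_tot. The sketch below makes two assumptions that are not stated in the abstract: the spring-mass prediction is stood in for by a symmetric half-sine waveform, and the "measured" front-loaded waveform is synthetic. The exact fitting procedure used in the study may differ.

```python
import math

def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

def half_sine(n, peak):
    """Symmetric stance-phase waveform, used here as a stand-in for the spring-mass prediction."""
    return [peak * math.sin(math.pi * i / (n - 1)) for i in range(n)]

def front_loaded(n, peak, skew):
    """Synthetic asymmetric waveform: extra force in the first half of stance, less in the second."""
    return [
        peak * math.sin(math.pi * i / (n - 1)) + skew * math.sin(2 * math.pi * i / (n - 1))
        for i in range(n)
    ]

model = half_sine(50, 2.0)                       # forces in body weights (illustrative)
symmetric_fit = r_squared(half_sine(50, 2.0), model)
asymmetric_fit = r_squared(front_loaded(50, 2.0, 0.5), model)
```

    A waveform that matches the model exactly gives R² = 1, while the front-loaded waveform gives a lower value, which is the direction of the sprinter vs. nonsprinter contrast reported in the abstract.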

  13. Bi-SOC-states in one-dimensional random cellular automaton

    NASA Astrophysics Data System (ADS)

    Czechowski, Zbigniew; Budek, Agnieszka; Białecki, Mariusz

    2017-10-01

    Two statistically stationary states with power-law scaling of avalanches are found in a simple 1D cellular automaton. Features of the fixed points, the spiral saddle and the saddle with index 1, are investigated. The migration of states of the automaton between these two self-organized criticality states is demonstrated during evolution of the system in computer simulations. The automaton, being a slowly driven system, can be applied as a toy model of earthquake supercycles.

  14. A data compression technique for synthetic aperture radar images

    NASA Technical Reports Server (NTRS)

    Frost, V. S.; Minden, G. J.

    1986-01-01

    A data compression technique is developed for synthetic aperture radar (SAR) imagery. The technique is based on an SAR image model and is designed to preserve the local statistics in the image by an adaptive variable rate modification of block truncation coding (BTC). A data rate of approximately 1.6 bit/pixel is achieved with the technique while maintaining the image quality and cultural (pointlike) targets. The algorithm requires no large data storage and is computationally simple.
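    For orientation, the following is a minimal sketch of basic (non-adaptive) block truncation coding, the scheme the abstract says was modified. It encodes a block as its mean, standard deviation, and a one-bit-per-pixel map, and reconstructs two output levels chosen so that the block's mean and variance are preserved. The adaptive variable-rate modification and the SAR image model from the report are not reproduced here; function names and the sample block are illustrative assumptions.

```python
import math

def btc_encode(block):
    """Encode a flat list of pixel values as (mean, std, bitmap)."""
    m = len(block)
    mean = sum(block) / m
    std = math.sqrt(sum((x - mean) ** 2 for x in block) / m)
    bitmap = [x >= mean for x in block]  # one bit per pixel
    return mean, std, bitmap

def btc_decode(mean, std, bitmap):
    """Reconstruct a two-level block with the same mean and variance."""
    m = len(bitmap)
    q = sum(bitmap)  # number of pixels at or above the mean
    if q == 0 or q == m:  # flat block: every pixel equals the mean
        return [mean] * m
    low = mean - std * math.sqrt(q / (m - q))
    high = mean + std * math.sqrt((m - q) / q)
    return [high if bit else low for bit in bitmap]

# Illustrative 4x4 block (flattened) with a few bright, pointlike pixels.
block = [12, 200, 14, 180, 15, 190, 13, 11, 10, 210, 16, 12, 185, 14, 13, 195]
decoded = btc_decode(*btc_encode(block))
```

    Because the two reconstruction levels are solved from the block mean and variance, the local first- and second-order statistics survive compression, which is the property the abstract emphasises.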

  15. Geometric, Statistical, and Topological Modeling of Intrinsic Data Manifolds: Application to 3D Shapes

    DTIC Science & Technology

    2009-01-01

    representation to a simple curve in 3D by using the Whitney embedding theorem. In a very ludic way, we propose to combine phases one and two to...elimination principle which takes advantage of the designed parametrization. To further refine discrimination among objects, we introduce a post...packing numbers and design of principal curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(3):281-297, 2000. [68] M. H. Yang, Face

  16. A mixed-effects model approach for the statistical analysis of vocal fold viscoelastic shear properties.

    PubMed

    Xu, Chet C; Chan, Roger W; Sun, Han; Zhan, Xiaowei

    2017-11-01

    A mixed-effects model approach was introduced in this study for the statistical analysis of rheological data of vocal fold tissues, in order to account for the data correlation caused by multiple measurements of each tissue sample across the test frequency range. Such data correlation has often been overlooked in studies from past decades. The viscoelastic shear properties of the vocal fold lamina propria of two commonly used laryngeal research animal species (i.e. rabbit, porcine) were measured by a linear, controlled-strain simple-shear rheometer. Along with published canine and human rheological data, the vocal fold viscoelastic shear moduli of these animal species were compared to those of human over a frequency range of 1-250 Hz using the mixed-effects models. Our results indicated that tissues of the rabbit, canine and porcine vocal fold lamina propria were significantly stiffer and more viscous than those of human. Mixed-effects models were shown to be able to more accurately analyze rheological data generated from repeated measurements. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. Adaptive design optimization: a mutual information-based approach to model discrimination in cognitive science.

    PubMed

    Cavagnaro, Daniel R; Myung, Jay I; Pitt, Mark A; Kujala, Janne V

    2010-04-01

    Discriminating among competing statistical models is a pressing issue for many experimentalists in the field of cognitive science. Resolving this issue begins with designing maximally informative experiments. To this end, the problem to be solved in adaptive design optimization is identifying experimental designs under which one can infer the underlying model in the fewest possible steps. When the models under consideration are nonlinear, as is often the case in cognitive science, this problem can be impossible to solve analytically without simplifying assumptions. However, as we show in this letter, a full solution can be found numerically with the help of a Bayesian computational trick derived from the statistics literature, which recasts the problem as a probability density simulation in which the optimal design is the mode of the density. We use a utility function based on mutual information and give three intuitive interpretations of the utility function in terms of Bayesian posterior estimates. As a proof of concept, we offer a simple example application to an experiment on memory retention.

  18. A classification procedure for the effective management of changes during the maintenance process

    NASA Technical Reports Server (NTRS)

    Briand, Lionel C.; Basili, Victor R.

    1992-01-01

    During software operation, maintainers are often faced with numerous change requests. Given available resources such as effort and calendar time, changes, if approved, have to be planned to fit within budget and schedule constraints. In this paper, we address the issue of assessing the difficulty of a change based on known or predictable data. This paper should be considered as a first step towards the construction of customized economic models for maintainers. In it, we propose a modeling approach, based on regular statistical techniques, that can be used in a variety of software maintenance environments. The approach can be easily automated, and is simple for people with limited statistical experience to use. Moreover, it deals effectively with the uncertainty usually associated with both model inputs and outputs. The modeling approach is validated on a data set provided by NASA/GSFC which shows it was effective in classifying changes with respect to the effort involved in implementing them. Other advantages of the approach are discussed along with additional steps to improve the results.

  19. A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis.

    PubMed

    Lin, Johnny; Bentler, Peter M

    2012-01-01

    Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and Satorra Bentler's mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby's study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic.

  20. RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics

    PubMed Central

    Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo

    2007-01-01

    Background: The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. Results: Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database, and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the score statistics collected. The t-tests may be used to measure the reliability of reported statistics. When combined with reported P-value for a peptide hit using a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with several other search tools. The executables and data related to RAId_DbS are freely available upon request. PMID:17961253

  1. Learning investment indicators through data extension

    NASA Astrophysics Data System (ADS)

    Dvořák, Marek

    2017-07-01

    Stock prices in the form of time series were analysed using univariate and multivariate statistical methods. After simple data preprocessing in the form of logarithmic differences, we augmented this univariate time series to a multivariate representation. This method makes use of sliding windows to calculate several dozen new variables using simple statistical tools, such as first and second moments, as well as more complicated statistics, such as auto-regression coefficients and residual analysis, followed by an optional quadratic transformation that was further used for data extension. These were used as explanatory variables in a regularized logistic LASSO regression that tried to estimate the Buy-Sell Index (BSI) from real stock market data.
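
    The sliding-window step described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the window width, the choice of three features (mean, variance, AR(1) coefficient), the function names and the price values are all assumptions made for illustration, and the regularized logistic regression stage is not shown.

```python
import math
from statistics import mean, pvariance


def log_differences(prices):
    """Logarithmic differences (log returns) of a price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def ar1_coefficient(window):
    """Least-squares AR(1) coefficient of a window, centred on the window mean.

    Returns 0.0 for a constant window, where the coefficient is undefined.
    """
    m = mean(window)
    centered = [x - m for x in window]
    denom = sum(c * c for c in centered[:-1])
    if denom == 0.0:
        return 0.0
    num = sum(a * b for a, b in zip(centered, centered[1:]))
    return num / denom


def window_features(series, width):
    """One feature row per sliding window: (mean, variance, AR(1) coefficient)."""
    rows = []
    for start in range(len(series) - width + 1):
        w = series[start:start + width]
        rows.append((mean(w), pvariance(w), ar1_coefficient(w)))
    return rows


# Hypothetical prices, used only to exercise the functions.
prices = [100.0, 101.0, 99.5, 102.0, 103.5, 102.5, 104.0, 105.0]
returns = log_differences(prices)
features = window_features(returns, width=4)
```

    Each row of `features` turns one window of the univariate series into several variables, which is the sense in which the series is extended to a multivariate representation.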

  2. Process air quality data

    NASA Technical Reports Server (NTRS)

    Butler, C. M.; Hogge, J. E.

    1978-01-01

    Air quality sampling was conducted. Data for air quality parameters, recorded on written forms, punched cards or magnetic tape, are available for 1972 through 1975. Computer software was developed to (1) calculate several daily statistical measures of location, (2) plot time histories of data or the calculated daily statistics, (3) calculate simple correlation coefficients, and (4) plot scatter diagrams. Computer software was developed for processing air quality data to include time series analysis and goodness of fit tests. Computer software was developed to (1) calculate a larger number of daily statistical measures of location, and a number of daily, monthly and yearly measures of location, dispersion, skewness and kurtosis, (2) decompose the extended time series model, and (3) perform some goodness of fit tests. The computer program is described, documented and illustrated by examples. Recommendations are made for continuation of the development of research on processing air quality data.

  3. Velocity bias in the distribution of dark matter halos

    NASA Astrophysics Data System (ADS)

    Baldauf, Tobias; Desjacques, Vincent; Seljak, Uroš

    2015-12-01

    The standard formalism for the coevolution of halos and dark matter predicts that any initial halo velocity bias rapidly decays to zero. We argue that, when the purpose is to compute statistics like power spectra etc., the coupling in the momentum conservation equation for the biased tracers must be modified. Our new formulation predicts the constancy in time of any statistical halo velocity bias present in the initial conditions, in agreement with peak theory. We test this prediction by studying the evolution of a conserved halo population in N-body simulations. We establish that the initial simulated halo density and velocity statistics show distinct features of the peak model and, thus, deviate from the simple local Lagrangian bias. We demonstrate, for the first time, that the time evolution of their velocity is in tension with the rapid decay expected in the standard approach.

  4. Effects of Nongray Opacity on Radiatively Driven Wolf-Rayet Winds

    NASA Astrophysics Data System (ADS)

    Onifer, A. J.; Gayley, K. G.

    2002-05-01

    Wolf-Rayet winds are characterized by their large momentum fluxes, and simulations of radiation driving have been increasingly successful in modeling these winds. Simple analytic approaches that help understand the most critical processes for copious momentum deposition already exist in the effectively gray approximation, but these have not been extended to more realistic nongray opacities. With this in mind, we have developed a simplified theory for describing the interaction of the stellar flux with nongray wind opacity. We replace the detailed line list with a set of statistical parameters that are sensitive not only to the strength but also the wavelength distribution of lines, incorporating as a free parameter the rate of photon frequency redistribution. We label the resulting flux-weighted opacity the statistical Sobolev-Rosseland (SSR) mean, and explore how changing these various statistical parameters affects the flux/opacity interaction. We wish to acknowledge NSF grant AST-0098155.

  5. A Physics-Inspired Mechanistic Model of Migratory Movement Patterns in Birds.

    PubMed

    Revell, Christopher; Somveille, Marius

    2017-08-29

    In this paper, we introduce a mechanistic model of migratory movement patterns in birds, inspired by ideas and methods from physics. Previous studies have shed light on the factors influencing bird migration but have mainly relied on statistical correlative analysis of tracking data. Our novel method offers a bottom up explanation of population-level migratory movement patterns. It differs from previous mechanistic models of animal migration and enables predictions of pathways and destinations from a given starting location. We define an environmental potential landscape from environmental data and simulate bird movement within this landscape based on simple decision rules drawn from statistical mechanics. We explore the capacity of the model by qualitatively comparing simulation results to the non-breeding migration patterns of a seabird species, the Black-browed Albatross (Thalassarche melanophris). This minimal, two-parameter model was able to capture remarkably well the previously documented migration patterns of the Black-browed Albatross, with the best combination of parameter values conserved across multiple geographically separate populations. Our physics-inspired mechanistic model could be applied to other bird and highly-mobile species, improving our understanding of the relative importance of various factors driving migration and making predictions that could be useful for conservation.

  6. Pre-operative prediction of surgical morbidity in children: comparison of five statistical models.

    PubMed

    Cooper, Jennifer N; Wei, Lai; Fernandez, Soledad A; Minneci, Peter C; Deans, Katherine J

    2015-02-01

    The accurate prediction of surgical risk is important to patients and physicians. Logistic regression (LR) models are typically used to estimate these risks. However, in the fields of data mining and machine learning, many alternative classification and prediction algorithms have been developed. This study aimed to compare the performance of LR to several data mining algorithms for predicting 30-day surgical morbidity in children. We used the 2012 National Surgical Quality Improvement Program-Pediatric dataset to compare the performance of (1) an LR model that assumed linearity and additivity (simple LR model), (2) an LR model incorporating restricted cubic splines and interactions (flexible LR model), (3) a support vector machine, (4) a random forest, and (5) boosted classification trees for predicting surgical morbidity. The ensemble-based methods showed significantly higher accuracy, sensitivity, specificity, PPV, and NPV than the simple LR model. However, none of the models performed better than the flexible LR model in terms of the aforementioned measures or in model calibration or discrimination. Support vector machines, random forests, and boosted classification trees do not show better performance than LR for predicting pediatric surgical morbidity. After further validation, the flexible LR model derived in this study could be used to assist with clinical decision-making based on patient-specific surgical risks. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. Estimating the impact of mineral aerosols on crop yields in food insecure regions using statistical crop models

    NASA Astrophysics Data System (ADS)

    Hoffman, A.; Forest, C. E.; Kemanian, A.

    2016-12-01

    A significant number of food-insecure nations exist in regions of the world where dust plays a large role in the climate system. While the impacts of common climate variables (e.g. temperature, precipitation, ozone, and carbon dioxide) on crop yields are relatively well understood, the impact of mineral aerosols on yields has not yet been thoroughly investigated. This research aims to develop the data and tools to advance our understanding of mineral aerosol impacts on crop yields. Suspended dust affects crop yields by altering the amount and type of radiation reaching the plant and by modifying local temperature and precipitation, while dust events (i.e. dust storms) affect crop yields by depleting the soil of nutrients or by defoliation via particle abrasion. The impact of dust on yields is modeled statistically because we are uncertain which impacts will dominate the response on the national and regional scales considered in this study. Multiple linear regression is used in a number of large-scale statistical crop modeling studies to estimate yield responses to various climate variables. In alignment with previous work, we develop linear crop models, but build upon this simple method of regression with machine-learning techniques (e.g. random forests) to identify important statistical predictors and isolate how dust affects yields on the scales of interest. To perform this analysis, we develop a crop-climate dataset for maize, soybean, groundnut, sorghum, rice, and wheat for the regions of West Africa, East Africa, South Africa, and the Sahel. Random forest regression models consistently model historic crop yields better than the linear models. In several instances, the random forest models accurately capture the temperature and precipitation threshold behavior in crops. Additionally, improving agricultural technology has caused a well-documented positive trend that dominates time series of global and regional yields. This trend is often removed before regression with traditional crop models, but likely at the cost of removing climate information. Our random forest models consistently discover the positive trend without removing any additional data. The application of random forests as a statistical crop model provides insight into understanding the impact of dust on yields in marginal food producing regions.

  8. Distinct polymer physics principles govern chromatin dynamics in mouse and Drosophila topological domains.

    PubMed

    Ea, Vuthy; Sexton, Tom; Gostan, Thierry; Herviou, Laurie; Baudement, Marie-Odile; Zhang, Yunzhe; Berlivet, Soizik; Le Lay-Taha, Marie-Noëlle; Cathala, Guy; Lesne, Annick; Victor, Jean-Marc; Fan, Yuhong; Cavalli, Giacomo; Forné, Thierry

    2015-08-15

    In higher eukaryotes, the genome is partitioned into large "Topologically Associating Domains" (TADs) in which the chromatin displays favoured long-range contacts. While a crumpled/fractal globule organization has received experimental support at higher-order levels, the organization principles that govern chromatin dynamics within these TADs remain unclear. Using simple polymer models, we previously showed that, in mouse liver cells, gene-rich domains tend to adopt a statistical helix shape when no significant locus-specific interaction takes place. Here, we use data from diverse 3C-derived methods to explore chromatin dynamics within mouse and Drosophila TADs. In mouse Embryonic Stem Cells (mESC), which possess large TADs (median size of 840 kb), we show that the statistical helix model, but not globule models, is relevant not only in gene-rich TADs, but also in gene-poor and gene-desert TADs. Interestingly, this statistical helix organization is considerably relaxed in mESC compared to liver cells, indicating that the impact of the constraints responsible for this organization is weaker in pluripotent cells. Finally, depletion of histone H1 in mESC alters local chromatin flexibility but not the statistical helix organization. In Drosophila, which possesses TADs of smaller sizes (median size of 70 kb), we show that, while chromatin compaction and flexibility are finely tuned according to the epigenetic landscape, chromatin dynamics within TADs is generally compatible with an unconstrained polymer configuration. Models issued from polymer physics can accurately describe the organization principles governing chromatin dynamics in both mouse and Drosophila TADs. However, constraints applied to these dynamics within mammalian TADs have a peculiar impact resulting in a statistical helix organization.

  9. Stochastic output error vibration-based damage detection and assessment in structures under earthquake excitation

    NASA Astrophysics Data System (ADS)

    Sakellariou, J. S.; Fassois, S. D.

    2006-11-01

    A stochastic output error (OE) vibration-based methodology for damage detection and assessment (localization and quantification) in structures under earthquake excitation is introduced. The methodology is intended for assessing the state of a structure following potential damage occurrence by exploiting vibration signal measurements produced by low-level earthquake excitations. It is based upon (a) stochastic OE model identification, (b) statistical hypothesis testing procedures for damage detection, and (c) a geometric method (GM) for damage assessment. The methodology's advantages include the effective use of the non-stationary and limited duration earthquake excitation, the handling of stochastic uncertainties, the tackling of the damage localization and quantification subproblems, the use of "small" size, simple and partial (in both the spatial and frequency bandwidth senses) identified OE-type models, and the use of a minimal number of measured vibration signals. Its feasibility and effectiveness are assessed via Monte Carlo experiments employing a simple simulation model of a 6 storey building. It is demonstrated that damage levels of 5% and 20% reduction in a storey's stiffness characteristics may be properly detected and assessed using noise-corrupted vibration signals.

  10. A simple approach to quantitative analysis using three-dimensional spectra based on selected Zernike moments.

    PubMed

    Zhai, Hong Lin; Zhai, Yue Yuan; Li, Pei Zhen; Tian, Yue Li

    2013-01-21

    A very simple approach to quantitative analysis is proposed based on the technology of digital image processing using three-dimensional (3D) spectra obtained by high-performance liquid chromatography coupled with a diode array detector (HPLC-DAD). As the region-based shape features of a grayscale image, Zernike moments, with their inherent invariance property, were employed to establish the linear quantitative models. This approach was applied to the quantitative analysis of three compounds in mixed samples using 3D HPLC-DAD spectra, and three linear models were obtained, respectively. The correlation coefficients (R²) for training and test sets were more than 0.999, and the statistical parameters and strict validation supported the reliability of the established models. The analytical results suggest that the Zernike moments selected by stepwise regression can be used in the quantitative analysis of target compounds. Our study provides a new idea for quantitative analysis using 3D spectra, which can be extended to the analysis of other 3D spectra obtained by different methods or instruments.

  11. No-Reference Video Quality Assessment Based on Statistical Analysis in 3D-DCT Domain.

    PubMed

    Li, Xuelong; Guo, Qun; Lu, Xiaoqiang

    2016-05-13

    It is an important task to design models for universal no-reference video quality assessment (NR-VQA) in multiple video processing and computer vision applications. However, most existing NR-VQA metrics are designed for specific distortion types, which are often not known in practical applications. A further deficiency is that the spatial and temporal information of videos is hardly considered simultaneously. In this paper, we propose a new NR-VQA metric based on the spatiotemporal natural video statistics (NVS) in the 3D discrete cosine transform (3D-DCT) domain. In the proposed method, a set of features is first extracted based on the statistical analysis of 3D-DCT coefficients to characterize the spatiotemporal statistics of videos in different views. These features are then used to predict the perceived video quality via an efficient linear support vector regression (SVR) model. The contributions of this paper are: 1) we explore the spatiotemporal statistics of videos in the 3D-DCT domain, which has an inherent spatiotemporal encoding advantage over other widely used 2D transformations; 2) we extract a small set of simple but effective statistical features for video visual quality prediction; 3) the proposed method is universal for multiple types of distortions and robust to different databases. The proposed method is tested on four widely used video databases. Extensive experimental results demonstrate that the proposed method is competitive with the state-of-the-art NR-VQA metrics and the top-performing FR-VQA and RR-VQA metrics.

  12. Statistical power and optimal design in experiments in which samples of participants respond to samples of stimuli.

    PubMed

    Westfall, Jacob; Kenny, David A; Judd, Charles M

    2014-10-01

    Researchers designing experiments in which a sample of participants responds to a sample of stimuli are faced with difficult questions about optimal study design. The conventional procedures of statistical power analysis fail to provide appropriate answers to these questions because they are based on statistical models in which stimuli are not assumed to be a source of random variation in the data, models that are inappropriate for experiments involving crossed random factors of participants and stimuli. In this article, we present new methods of power analysis for designs with crossed random factors, and we give detailed, practical guidance to psychology researchers planning experiments in which a sample of participants responds to a sample of stimuli. We extensively examine 5 commonly used experimental designs, describe how to estimate statistical power in each, and provide power analysis results based on a reasonable set of default parameter values. We then develop general conclusions and formulate rules of thumb concerning the optimal design of experiments in which a sample of participants responds to a sample of stimuli. We show that in crossed designs, statistical power typically does not approach unity as the number of participants goes to infinity but instead approaches a maximum attainable power value that is possibly small, depending on the stimulus sample. We also consider the statistical merits of designs involving multiple stimulus blocks. Finally, we provide a simple and flexible Web-based power application to aid researchers in planning studies with samples of stimuli.

  13. Statistical testing of association between menstruation and migraine.

    PubMed

    Barra, Mathias; Dahl, Fredrik A; Vetvik, Kjersti G

    2015-02-01

    To repair and refine a previously proposed method for statistical analysis of association between migraine and menstruation. Menstrually related migraine (MRM) affects about 20% of female migraineurs in the general population. The exact pathophysiological link from menstruation to migraine is hypothesized to be through fluctuations in female reproductive hormones, but the exact mechanisms remain unknown. Therefore, the main diagnostic criterion today is concurrency of migraine attacks with menstruation. Methods aiming to exclude spurious associations are wanted, so that further research into these mechanisms can be performed on a population with a true association. The statistical method is based on a simple two-parameter null model of MRM (which allows for simulation modeling), and Fisher's exact test (with mid-p correction) applied to standard 2 × 2 contingency tables derived from the patients' headache diaries. Our method is a corrected version of a previously published flawed framework. To the best of our knowledge, no other published methods for establishing a menstruation-migraine association by statistical means exist today. The probabilistic methodology shows good performance when subjected to receiver operating characteristic curve analysis. Quick reference cutoff values for the clinical setting were tabulated for assessing association given a patient's headache history. In this paper, we correct a proposed method for establishing association between menstruation and migraine by statistical methods. We conclude that the proposed standard of 3-cycle observations prior to setting an MRM diagnosis should be extended with at least one perimenstrual window to obtain sufficient information for statistical processing. © 2014 American Headache Society.
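
    As a rough illustration of the test named in this abstract, the sketch below computes a one-sided Fisher exact p-value with mid-p correction for a 2 × 2 table. It covers only the generic textbook test: the authors' two-parameter null model and their rules for building tables from headache diaries are not reproduced, and the one-sided alternative, the function names and the example counts are assumptions.

```python
from math import comb


def hypergeom_pmf(k, row1, col1, total):
    """P(top-left cell = k) when both margins of the 2 x 2 table are fixed."""
    return comb(col1, k) * comb(total - col1, row1 - k) / comb(total, row1)


def fisher_mid_p_greater(a, b, c, d):
    """One-sided Fisher exact test (positive association) with mid-p correction.

    Table layout: [[a, b], [c, d]]. The mid-p value counts tables more
    extreme than the observed one fully and the observed table by half.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    k_max = min(row1, col1)
    upper_tail = sum(
        hypergeom_pmf(k, row1, col1, total) for k in range(a + 1, k_max + 1)
    )
    return upper_tail + 0.5 * hypergeom_pmf(a, row1, col1, total)


# Hypothetical counts: rows = perimenstrual / other days,
# columns = migraine / no migraine.
p_mid = fisher_mid_p_greater(3, 1, 1, 3)
```

    For the example table the conventional one-sided p-value is 17/70, while the mid-p value is 9/70, which shows how the correction makes the discrete test less conservative.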

  14. Round Robin Study: Molecular Simulation of Thermodynamic Properties from Models with Internal Degrees of Freedom.

    PubMed

    Schappals, Michael; Mecklenfeld, Andreas; Kröger, Leif; Botan, Vitalie; Köster, Andreas; Stephan, Simon; García, Edder J; Rutkai, Gabor; Raabe, Gabriele; Klein, Peter; Leonhard, Kai; Glass, Colin W; Lenhard, Johannes; Vrabec, Jadran; Hasse, Hans

    2017-09-12

    Thermodynamic properties are often modeled by classical force fields which describe the interactions on the atomistic scale. Molecular simulations are used for retrieving thermodynamic data from such models, and many simulation techniques and computer codes are available for that purpose. In the present round robin study, the following fundamental question is addressed: Will different user groups working with different simulation codes obtain coinciding results within the statistical uncertainty of their data? A set of 24 simple simulation tasks is defined and solved by five user groups working with eight molecular simulation codes: DL_POLY, GROMACS, IMC, LAMMPS, ms2, NAMD, Tinker, and TOWHEE. Each task consists of the definition of (1) a pure fluid that is described by a force field and (2) the conditions under which that property is to be determined. The fluids are four simple alkanes: ethane, propane, n-butane, and iso-butane. All force fields consider internal degrees of freedom: OPLS, TraPPE, and a modified OPLS version with bond stretching vibrations. Density and potential energy are determined as a function of temperature and pressure on a grid which is specified such that all states are liquid. The user groups worked independently and reported their results to a central instance. The full set of results was disclosed to all user groups only at the end of the study. During the study, the central instance gave only qualitative feedback. The results reveal the challenges of carrying out molecular simulations. Several iterations were needed to eliminate gross errors. For most simulation tasks, the remaining deviations between the results of the different groups are acceptable from a practical standpoint, but they are often outside of the statistical errors of the individual simulation data. However, there are also cases where the deviations are unacceptable. This study highlights similarities between computer experiments and laboratory experiments, which are both subject not only to statistical error but also to systematic error.

  15. Distinguishing Positive Selection From Neutral Evolution: Boosting the Performance of Summary Statistics

    PubMed Central

    Lin, Kao; Li, Haipeng; Schlötterer, Christian; Futschik, Andreas

    2011-01-01

    Summary statistics are widely used in population genetics, but they suffer from the drawback that no simple sufficient summary statistic exists, which captures all information required to distinguish different evolutionary hypotheses. Here, we apply boosting, a recent statistical method that combines simple classification rules to maximize their joint predictive performance. We show that our implementation of boosting has a high power to detect selective sweeps. Demographic events, such as bottlenecks, do not result in a large excess of false positives. A comparison to other neutrality tests shows that our boosting implementation performs well compared to other neutrality tests. Furthermore, we evaluated the relative contribution of different summary statistics to the identification of selection and found that for recent sweeps integrated haplotype homozygosity is very informative whereas older sweeps are better detected by Tajima's π. Overall, Watterson's θ was found to contribute the most information for distinguishing between bottlenecks and selection. PMID:21041556

  16. α-induced reactions on 115In: Cross section measurements and statistical model analysis

    NASA Astrophysics Data System (ADS)

    Kiss, G. G.; Szücs, T.; Mohr, P.; Török, Zs.; Huszánk, R.; Gyürky, Gy.; Fülöp, Zs.

    2018-05-01

    Background: α-nucleus optical potentials are basic ingredients of statistical model calculations used in nucleosynthesis simulations. While the nucleon+nucleus optical potential is fairly well known, for the α+nucleus optical potential several different parameter sets exist and large deviations, reaching sometimes even an order of magnitude, are found between the cross section predictions calculated using different parameter sets. Purpose: A measurement of the radiative α-capture and the α-induced reaction cross sections on the nucleus 115In at low energies allows a stringent test of statistical model predictions. Since experimental data are scarce in this mass region, this measurement can be an important input to test the global applicability of α+nucleus optical model potentials and further ingredients of the statistical model. Methods: The reaction cross sections were measured by means of the activation method. The produced activities were determined by off-line detection of the γ rays and characteristic x rays emitted during the electron capture decay of the produced Sb isotopes. The 115In(α,γ)119Sb and 115In(α,n)118mSb reaction cross sections were measured between Ec.m. = 8.83 and 15.58 MeV, and the 115In(α,n)118gSb reaction was studied between Ec.m. = 11.10 and 15.58 MeV. The theoretical analysis was performed within the statistical model. Results: The simultaneous measurement of the (α,γ) and (α,n) cross sections allowed us to determine a best-fit combination of all parameters for the statistical model. The α+nucleus optical potential is identified as the most important input for the statistical model. The best fit is obtained for the new Atomki-V1 potential, and good reproduction of the experimental data is also achieved for the first version of the Demetriou potentials and the simple McFadden-Satchler potential. The nucleon optical potential, the γ-ray strength function, and the level density parametrization are also constrained by the data although there is no unique best-fit combination. Conclusions: The best-fit calculations allow us to extrapolate the low-energy (α,γ) cross section of 115In to the astrophysical Gamow window with reasonable uncertainties. However, still further improvements of the α-nucleus potential are required for a global description of elastic (α,α) scattering and α-induced reactions in a wide range of masses and energies.

  17. Determination of errors in derived magnetic field directions in geosynchronous orbit: results from a statistical approach

    NASA Astrophysics Data System (ADS)

    Chen, Yue; Cunningham, Gregory; Henderson, Michael

    2016-09-01

    This study aims to statistically estimate the errors in local magnetic field directions that are derived from electron directional distributions measured by Los Alamos National Laboratory geosynchronous (LANL GEO) satellites. First, by comparing derived and measured magnetic field directions along the GEO orbit to those calculated from three selected empirical global magnetic field models (including a static Olson and Pfitzer 1977 quiet magnetic field model, a simple dynamic Tsyganenko 1989 model, and a sophisticated dynamic Tsyganenko 2001 storm model), it is shown that the errors in both derived and modeled directions are at least comparable. Second, using a newly developed proxy method as well as comparing results from empirical models, we are able to provide for the first time circumstantial evidence showing that derived magnetic field directions should statistically match the real magnetic directions better, with averaged errors < ~2°, than those from the three empirical models with averaged errors > ~5°. In addition, our results suggest that the errors in derived magnetic field directions do not depend much on magnetospheric activity, in contrast to the empirical field models. Finally, as applications of the above conclusions, we show examples of electron pitch angle distributions observed by LANL GEO and also take the derived magnetic field directions as the real ones so as to test the performance of empirical field models along the GEO orbits, with results suggesting dependence on solar cycles as well as satellite locations. This study demonstrates the validity and value of the method that infers local magnetic field directions from particle spin-resolved distributions.

  18. Determination of errors in derived magnetic field directions in geosynchronous orbit: results from a statistical approach

    DOE PAGES

    Chen, Yue; Cunningham, Gregory; Henderson, Michael

    2016-09-21

    Our study aims to statistically estimate the errors in local magnetic field directions that are derived from electron directional distributions measured by Los Alamos National Laboratory geosynchronous (LANL GEO) satellites. First, by comparing derived and measured magnetic field directions along the GEO orbit to those calculated from three selected empirical global magnetic field models (including a static Olson and Pfitzer 1977 quiet magnetic field model, a simple dynamic Tsyganenko 1989 model, and a sophisticated dynamic Tsyganenko 2001 storm model), it is shown that the errors in both derived and modeled directions are at least comparable. Furthermore, using a newly developed proxy method as well as comparing results from empirical models, we are able to provide for the first time circumstantial evidence showing that derived magnetic field directions should statistically match the real magnetic directions better, with averaged errors < ~2°, than those from the three empirical models with averaged errors > ~5°. In addition, our results suggest that the errors in derived magnetic field directions do not depend much on magnetospheric activity, in contrast to the empirical field models. Finally, as applications of the above conclusions, we show examples of electron pitch angle distributions observed by LANL GEO and also take the derived magnetic field directions as the real ones so as to test the performance of empirical field models along the GEO orbits, with results suggesting dependence on solar cycles as well as satellite locations. Finally, this study demonstrates the validity and value of the method that infers local magnetic field directions from particle spin-resolved distributions.

  19. Weak lensing shear and aperture mass from linear to non-linear scales

    NASA Astrophysics Data System (ADS)

    Munshi, Dipak; Valageas, Patrick; Barber, Andrew J.

    2004-05-01

    We describe the predictions for the smoothed weak lensing shear, γs, and aperture mass, Map, of two simple analytical models of the density field: the minimal tree model and the stellar model. Both models give identical results for the statistics of the three-dimensional density contrast smoothed over spherical cells and only differ by the detailed angular dependence of the many-body density correlations. We have shown in previous work that they also yield almost identical results for the probability distribution function (PDF) of the smoothed convergence, κs. We find that the two models give rather close results for both the shear and the positive tail of the aperture mass. However, we note that at small angular scales (θs ≲ 2 arcmin) the tail of the PDF for negative Map shows a strong variation between the two models, and the stellar model actually breaks down for θs ≲ 0.4 arcmin and Map < 0. This shows that the statistics of the aperture mass provides a very precise probe of the detailed structure of the density field, as it is sensitive to both the amplitude and the detailed angular behaviour of the many-body correlations. On the other hand, the minimal tree model shows good agreement with numerical simulations over all the scales and redshifts of interest, while both models provide a good description of the PDF of the smoothed shear components. Therefore, the shear and the aperture mass provide robust and complementary tools to measure the cosmological parameters as well as the detailed statistical properties of the density field.

  20. Determination of errors in derived magnetic field directions in geosynchronous orbit: results from a statistical approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Yue; Cunningham, Gregory; Henderson, Michael

    Our study aims to statistically estimate the errors in local magnetic field directions that are derived from electron directional distributions measured by Los Alamos National Laboratory geosynchronous (LANL GEO) satellites. First, by comparing derived and measured magnetic field directions along the GEO orbit to those calculated from three selected empirical global magnetic field models (including a static Olson and Pfitzer 1977 quiet magnetic field model, a simple dynamic Tsyganenko 1989 model, and a sophisticated dynamic Tsyganenko 2001 storm model), it is shown that the errors in both derived and modeled directions are at least comparable. Furthermore, using a newly developed proxy method as well as comparing results from empirical models, we are able to provide for the first time circumstantial evidence showing that derived magnetic field directions should statistically match the real magnetic directions better, with averaged errors < ~2°, than those from the three empirical models with averaged errors > ~5°. In addition, our results suggest that the errors in derived magnetic field directions do not depend much on magnetospheric activity, in contrast to the empirical field models. Finally, as applications of the above conclusions, we show examples of electron pitch angle distributions observed by LANL GEO and also take the derived magnetic field directions as the real ones so as to test the performance of empirical field models along the GEO orbits, with results suggesting dependence on solar cycles as well as satellite locations. Finally, this study demonstrates the validity and value of the method that infers local magnetic field directions from particle spin-resolved distributions.

  1. Selecting long-term care facilities with high use of acute hospitalisations: issues and options

    PubMed Central

    2014-01-01

    Background This paper considers approaches to the question “Which long-term care facilities have residents with high use of acute hospitalisations?” It compares four methods of identifying long-term care facilities with high use of acute hospitalisations by demonstrating four selection methods, identifies key factors to be resolved when deciding which methods to employ, and discusses their appropriateness for different research questions. Methods OPAL was a census-type survey of aged care facilities and residents in Auckland, New Zealand, in 2008. It collected information about facility management and resident demographics, needs and care. Survey records (149 aged care facilities, 6271 residents) were linked to hospital and mortality records routinely assembled by health authorities. The main ranking endpoint was acute hospitalisations for diagnoses that were classified as potentially avoidable. Facilities were ranked using 1) simple event counts per person, 2) event rates per year of resident follow-up, 3) statistical model of rates using four predictors, and 4) change in ranks between methods 2) and 3). A generalized mixed model was used for Method 3 to handle the clustered nature of the data. Results 3048 potentially avoidable hospitalisations were observed during 22 months’ follow-up. The same “top ten” facilities were selected by Methods 1 and 2. The statistical model (Method 3), predicting rates from resident and facility characteristics, ranked facilities differently than these two simple methods. The change-in-ranks method identified a very different set of “top ten” facilities. All methods showed a continuum of use, with no clear distinction between facilities with higher use. Conclusion Choice of selection method should depend upon the purpose of selection. To monitor performance during a period of change, a recent simple rate, count per resident, or even count per bed, may suffice. To find high-use facilities regardless of resident needs, recent history of admissions is highly predictive. To target a few high-use facilities that have high rates after considering facility and resident characteristics, model residuals or a large increase in rank may be preferable. PMID:25052433
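
    The two simple ranking methods in this record (event counts per person and event rates per year of follow-up) can be illustrated with a minimal sketch. The facility names and numbers below are invented for illustration and are not taken from the OPAL data; the `Facility` class and `rank` helper are likewise hypothetical. The sketch only shows that the two denominators can order the same facilities differently when follow-up time varies.

```python
from dataclasses import dataclass


@dataclass
class Facility:
    name: str
    residents: int          # number of residents surveyed
    events: int             # potentially avoidable hospitalisations
    follow_up_years: float  # total resident-years of follow-up


def count_per_person(f: Facility) -> float:
    """Method 1: simple event count per resident."""
    return f.events / f.residents


def rate_per_year(f: Facility) -> float:
    """Method 2: event rate per resident-year of follow-up."""
    return f.events / f.follow_up_years


def rank(facilities, key):
    """Return facility names ordered from highest to lowest value of `key`."""
    return [f.name for f in sorted(facilities, key=key, reverse=True)]


# Hypothetical data, chosen so that the two methods disagree.
facilities = [
    Facility("A", residents=50, events=30, follow_up_years=30.0),
    Facility("B", residents=40, events=28, follow_up_years=40.0),
    Facility("C", residents=60, events=24, follow_up_years=100.0),
]

by_count = rank(facilities, count_per_person)
by_rate = rank(facilities, rate_per_year)
print(by_count, by_rate)
```

    Facility A has short follow-up, so it ranks first by rate even though B has more events per resident. The model-based ranking (Method 3) would need a mixed-effects model and is not sketched here.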

  2. Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species

    PubMed Central

    Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha

    2011-01-01

    Computational genomics is one of the important tools to understand the distribution of closely related genomes including simple sequence repeats (SSRs) in an organism, which gives valuable information regarding genetic variations. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human Brucella species which are involved in a range of pathological disorders. Computational analysis of the SSRs in the Brucella indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetranucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that over expressed tri-nucleotide SSRs in genomic and coding regions might be responsible in the generation of functional variation of proteins expressed which in turn may lead to different pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins of Brucella genomes. Abbreviations SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309

  3. People adopt optimal policies in simple decision-making, after practice and guidance.

    PubMed

    Evans, Nathan J; Brown, Scott D

    2017-04-01

    Organisms making repeated simple decisions are faced with a tradeoff between urgent and cautious strategies. While animals can adopt a statistically optimal policy for this tradeoff, findings about human decision-makers have been mixed. Some studies have shown that people can optimize this "speed-accuracy tradeoff", while others have identified a systematic bias towards excessive caution. These issues have driven theoretical development and spurred debate about the nature of human decision-making. We investigated a potential resolution to the debate, based on two factors that routinely differ between human and animal studies of decision-making: the effects of practice, and of longer-term feedback. Our study replicated the finding that most people, by default, are overly cautious. When given both practice and detailed feedback, people moved rapidly towards the optimal policy, with many participants reaching optimality with less than 1 h of practice. Our findings have theoretical implications for cognitive and neural models of simple decision-making, as well as methodological implications.

  4. An unexpected way forward: towards a more accurate and rigorous protein-protein binding affinity scoring function by eliminating terms from an already simple scoring function.

    PubMed

    Swanson, Jon; Audie, Joseph

    2018-01-01

    A fundamental and unsolved problem in biophysical chemistry is the development of a computationally simple, physically intuitive, and generally applicable method for accurately predicting and physically explaining protein-protein binding affinities from protein-protein interaction (PPI) complex coordinates. Here, we propose that the simplification of a previously described six-term PPI scoring function to a four term function results in a simple expression of all physically and statistically meaningful terms that can be used to accurately predict and explain binding affinities for a well-defined subset of PPIs that are characterized by (1) crystallographic coordinates, (2) rigid-body association, (3) normal interface size, and hydrophobicity and hydrophilicity, and (4) high quality experimental binding affinity measurements. We further propose that the four-term scoring function could be regarded as a core expression for future development into a more general PPI scoring function. Our work has clear implications for PPI modeling and structure-based drug design.

  5. Spatial-temporal modeling of malware propagation in networks.

    PubMed

    Chen, Zesheng; Ji, Chuanyi

    2005-09-01

    Network security is an important task of network management. One threat to network security is malware (malicious software) propagation. One type of malware is called topological scanning that spreads based on topology information. The focus of this work is on modeling the spread of topological malwares, which is important for understanding their potential damages, and for developing countermeasures to protect the network infrastructure. Our model is motivated by probabilistic graphs, which have been widely investigated in machine learning. We first use a graphical representation to abstract the propagation of malwares that employ different scanning methods. We then use a spatial-temporal random process to describe the statistical dependence of malware propagation in arbitrary topologies. As the spatial dependence is particularly difficult to characterize, the problem becomes how to use simple (i.e., biased) models to approximate the spatially dependent process. In particular, we propose the independent model and the Markov model as simple approximations. We conduct both theoretical analysis and extensive simulations on large networks using both real measurements and synthesized topologies to test the performance of the proposed models. Our results show that the independent model can capture temporal dependence and detailed topology information and, thus, outperforms the previous models, whereas the Markov model incorporates a certain spatial dependence and, thus, achieves a greater accuracy in characterizing both transient and equilibrium behaviors of malware propagation.

  6. A BRDF statistical model applying to space target materials modeling

    NASA Astrophysics Data System (ADS)

    Liu, Chenghao; Li, Zhi; Xu, Can; Tian, Qichen

    2017-10-01

    To address the poor performance of the five-parameter semi-empirical model in fitting high-density measured BRDF data, a refined statistical BRDF model suitable for modeling multiple classes of space target materials is proposed. The refined model improves on the Torrance-Sparrow model while retaining the modeling advantages of the five-parameter model. Compared with the existing empirical model, it contains six simple parameters, which can approximate the roughness distribution of the material surface, the intensity of the Fresnel reflectance phenomenon, and the attenuation of the reflected light's brightness as the azimuth angle changes. The model achieves fast parameter inversion with no extra loss of accuracy. A genetic algorithm was used to invert the parameters of 11 different samples of materials commonly used in space targets, and the fitting errors of all materials were below 6%, much lower than those of the five-parameter model. The refined model is verified by comparing the fitting results of three samples at different incident zenith angles at 0° azimuth angle. Finally, three-dimensional visualizations of these samples in the upper hemisphere are given, in which the strength of the optical scattering of different materials is clearly shown, demonstrating the refined model's ability to characterize materials.

  7. Goodness-of-fit tests and model diagnostics for negative binomial regression of RNA sequencing data.

    PubMed

    Mi, Gu; Di, Yanming; Schafer, Daniel W

    2015-01-01

    This work is about assessing model adequacy for negative binomial (NB) regression, particularly (1) assessing the adequacy of the NB assumption, and (2) assessing the appropriateness of models for NB dispersion parameters. Tools for the first are appropriate for NB regression generally; those for the second are primarily intended for RNA sequencing (RNA-Seq) data analysis. The typically small number of biological samples and large number of genes in RNA-Seq analysis motivate us to address the trade-offs between robustness and statistical power using NB regression models. One widely-used power-saving strategy, for example, is to assume some commonalities of NB dispersion parameters across genes via simple models relating them to mean expression rates, and many such models have been proposed. As RNA-Seq analysis is becoming ever more popular, it is appropriate to make more thorough investigations into power and robustness of the resulting methods, and into practical tools for model assessment. In this article, we propose simulation-based statistical tests and diagnostic graphics to address model adequacy. We provide simulated and real data examples to illustrate that our proposed methods are effective for detecting the misspecification of the NB mean-variance relationship as well as judging the adequacy of fit of several NB dispersion models.

  8. Aerosol Complexity and Implications for Predictability and Short-Term Forecasting

    NASA Technical Reports Server (NTRS)

    Colarco, Peter

    2016-01-01

    There are clear NWP and climate impacts from including aerosol radiative and cloud interactions. Changes in dynamics and cloud fields affect aerosol lifecycle, plume height, long-range transport, overall forcing of the climate system, etc. Inclusion of aerosols in NWP systems has benefit to surface field biases (e.g., T2m, U10m). Including aerosol effects has impact on analysis increments and can have statistically significant impacts on, e.g., tropical cyclogenesis. Above points are made especially with respect to aerosol radiative interactions, but aerosol-cloud interaction is a bigger signal on the global system. Many of these impacts are realized even in models with relatively simple (bulk) aerosol schemes (approx. 10-20 tracers). Simple schemes though imply simple representation of aerosol absorption and, importantly for aerosol-cloud interaction, particle-size distribution. Even so, more complex schemes exhibit a lot of diversity between different models, with issues such as size selection both for emitted particles and for modes. Prospects for complex sectional schemes to tune modal (and even bulk) schemes toward better selection of size representation. I think this is a ripe topic for more research - Systematic documentation of benefits of no vs. climatological vs. interactive (direct and then direct+indirect) aerosols. Document aerosol impact on analysis increments, inclusion in NWP data assimilation operator - Further refinement of baseline assumptions in model design (e.g., absorption, particle size distribution). Did not get into model resolution and interplay of other physical processes with aerosols (e.g., moist physics, obviously important), chemistry

  9. Simple, empirical approach to predict neutron capture cross sections from nuclear masses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Couture, Aaron Joseph; Casten, Richard F.; Cakirli, R. B.

    Here, neutron capture cross sections are essential to understanding the astrophysical s and r processes, the modeling of nuclear reactor design and performance, and for a wide variety of nuclear forensics applications. Often, cross sections are needed for nuclei where experimental measurements are difficult. Enormous effort, over many decades, has gone into attempting to develop sophisticated statistical reaction models to predict these cross sections. Such work has met with some success but is often unable to reproduce measured cross sections to better than 40%, and has limited predictive power, with predictions from different models rapidly differing by an order of magnitude a few nucleons from the last measurement.

  10. Simple, empirical approach to predict neutron capture cross sections from nuclear masses

    DOE PAGES

    Couture, Aaron Joseph; Casten, Richard F.; Cakirli, R. B.

    2017-12-20

    Here, neutron capture cross sections are essential to understanding the astrophysical s and r processes, the modeling of nuclear reactor design and performance, and for a wide variety of nuclear forensics applications. Often, cross sections are needed for nuclei where experimental measurements are difficult. Enormous effort, over many decades, has gone into attempting to develop sophisticated statistical reaction models to predict these cross sections. Such work has met with some success but is often unable to reproduce measured cross sections to better than 40%, and has limited predictive power, with predictions from different models rapidly differing by an order of magnitude a few nucleons from the last measurement.

  11. Capillarity theory for the fly-casting mechanism

    PubMed Central

    Trizac, Emmanuel; Levy, Yaakov; Wolynes, Peter G.

    2010-01-01

    Biomolecular folding and function are often coupled. During molecular recognition events, one of the binding partners may transiently or partially unfold, allowing more rapid access to a binding site. We describe a simple model for this fly-casting mechanism based on the capillarity approximation and polymer chain statistics. The model shows that fly casting is most effective when the protein unfolding barrier is small and the part of the chain which extends toward the target is relatively rigid. These features are often seen in known examples of fly casting in protein–DNA binding. Simulations of protein–DNA binding based on well-funneled native-topology models with electrostatic forces confirm the trends of the analytical theory. PMID:20133683

  12. Current Status and Challenges of Atmospheric Data Assimilation

    NASA Astrophysics Data System (ADS)

    Atlas, R. M.; Gelaro, R.

    2016-12-01

    The issues of modern atmospheric data assimilation are fairly simple to comprehend but difficult to address, involving the combination of literally billions of model variables and tens of millions of observations daily. In addition to traditional meteorological variables such as wind, temperature pressure and humidity, model state vectors are being expanded to include explicit representation of precipitation, clouds, aerosols and atmospheric trace gases. At the same time, model resolutions are approaching single-kilometer scales globally and new observation types have error characteristics that are increasingly non-Gaussian. This talk describes the current status and challenges of atmospheric data assimilation, including an overview of current methodologies, the difficulty of estimating error statistics, and progress toward coupled earth system analyses.

  13. Approximate Model Checking of PCTL Involving Unbounded Path Properties

    NASA Astrophysics Data System (ADS)

    Basu, Samik; Ghosh, Arka P.; He, Ru

    We study the problem of applying statistical methods for approximate model checking of probabilistic systems against properties encoded as PCTL formulas. Such approximate methods have been proposed primarily to deal with state-space explosion that makes the exact model checking by numerical methods practically infeasible for large systems. However, the existing statistical methods either consider a restricted subset of PCTL, specifically, the subset that can only express bounded until properties; or rely on user-specified finite bound on the sample path length. We propose a new method that does not have such restrictions and can be effectively used to reason about unbounded until properties. We approximate probabilistic characteristics of an unbounded until property by that of a bounded until property for a suitably chosen value of the bound. In essence, our method is a two-phase process: (a) the first phase is concerned with identifying the bound k0; (b) the second phase computes the probability of satisfying the k0-bounded until property as an estimate for the probability of satisfying the corresponding unbounded until property. In both phases, it is sufficient to verify bounded until properties which can be effectively done using existing statistical techniques. We prove the correctness of our technique and present its prototype implementations. We empirically show the practical applicability of our method by considering different case studies including a simple infinite-state model, and large finite-state models such as IPv4 zeroconf protocol and dining philosopher protocol modeled as Discrete Time Markov chains.
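
    The second phase described in this record (estimating the probability of a bounded until property by sampling paths) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three-state chain, the function names, and the bound k = 20 are all hypothetical, and the sketch takes the bound as given rather than reproducing the paper's procedure for choosing it.

```python
import random

# Hypothetical discrete-time Markov chain (an assumption for illustration):
# state 0 = "trying", state 1 = "goal" (absorbing), state 2 = "fail" (absorbing).
TRANSITIONS = {
    0: [(0, 0.5), (1, 0.3), (2, 0.2)],
    1: [(1, 1.0)],
    2: [(2, 1.0)],
}


def step(state, rng):
    """Sample the successor of `state` from its transition distribution."""
    u = rng.random()
    acc = 0.0
    for nxt, p in TRANSITIONS[state]:
        acc += p
        if u < acc:
            return nxt
    return TRANSITIONS[state][-1][0]  # guard against floating-point round-off


def holds_bounded_until(k, rng, start=0, safe=frozenset({0}), goal=frozenset({1})):
    """Check 'safe U<=k goal' on one sampled path (goal reached within k steps)."""
    state = start
    for _ in range(k + 1):
        if state in goal:
            return True
        if state not in safe:
            return False
        state = step(state, rng)
    return False


def estimate(k, n_samples, seed=0):
    """Monte Carlo estimate of the probability of the k-bounded until property."""
    rng = random.Random(seed)
    hits = sum(holds_bounded_until(k, rng) for _ in range(n_samples))
    return hits / n_samples


k = 20
p_hat = estimate(k, n_samples=20000)
p_bounded_exact = 0.6 * (1 - 0.5 ** k)  # closed form for this toy chain
print(p_hat, p_bounded_exact)
```

    For this toy chain the unbounded until probability is 0.3 / (1 - 0.5) = 0.6, and the k-bounded probability differs from it by 0.6 × 0.5^k, which is why a moderate bound already gives a close approximation here.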

  14. Functional constraints on tooth morphology in carnivorous mammals

    PubMed Central

    2012-01-01

    Background The range of potential morphologies resulting from evolution is limited by complex interacting processes, ranging from development to function. Quantifying these interactions is important for understanding adaptation and convergent evolution. Using three-dimensional reconstructions of carnivoran and dasyuromorph tooth rows, we compared statistical models of the relationship between tooth row shape and the opposing tooth row, a static feature, as well as measures of mandibular motion during chewing (occlusion), which are kinetic features. This is a new approach to quantifying functional integration because we use measures of movement and displacement, such as the amount the mandible translates laterally during occlusion, as opposed to conventional morphological measures, such as mandible length and geometric landmarks. By sampling two distantly related groups of ecologically similar mammals, we study carnivorous mammals in general rather than a specific group of mammals. Results Statistical model comparisons demonstrate that the best performing models always include some measure of mandibular motion, indicating that functional and statistical models of tooth shape as purely a function of the opposing tooth row are too simple and that increased model complexity provides a better understanding of tooth form. The predictors of the best performing models always included the opposing tooth row shape and a relative linear measure of mandibular motion. Conclusions Our results provide quantitative support of long-standing hypotheses of tooth row shape as being influenced by mandibular motion in addition to the opposing tooth row. Additionally, this study illustrates the utility and necessity of including kinetic features in analyses of morphological integration. PMID:22899809

  15. Framework for adaptive multiscale analysis of nonhomogeneous point processes.

    PubMed

    Helgason, Hannes; Bartroff, Jay; Abry, Patrice

    2011-01-01

    We develop the methodology for hypothesis testing and model selection in nonhomogeneous Poisson processes, with an eye toward the application of modeling and variability detection in heart beat data. Modeling the process' non-constant rate function using templates of simple basis functions, we develop the generalized likelihood ratio statistic for a given template and a multiple testing scheme to model-select from a family of templates. A dynamic programming algorithm inspired by network flows is used to compute the maximum likelihood template in a multiscale manner. In a numerical example, the proposed procedure is nearly as powerful as the super-optimal procedures that know the true template size and true partition, respectively. Extensions to general history-dependent point processes are discussed.
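
    As a rough illustration of a generalized likelihood ratio statistic for a Poisson process, the sketch below compares a constant rate against a piecewise-constant rate on a fixed partition. This is a simplifying assumption: the record describes templates of basis functions, a multiple testing scheme, and a dynamic programming search, none of which are reproduced here. The function names and example counts are hypothetical.

```python
import math


def max_loglik(counts, durations):
    """Maximised Poisson-process log-likelihood for a piecewise-constant rate.

    On a segment with n events over duration t, the MLE of the rate is n / t,
    which contributes n * log(n / t) - n. Empty segments contribute 0.
    """
    total = 0.0
    for n, t in zip(counts, durations):
        if n > 0:
            total += n * math.log(n / t) - n
    return total


def glr_statistic(counts, durations):
    """2 * (log-lik under piecewise-constant rate - log-lik under constant rate)."""
    null = max_loglik([sum(counts)], [sum(durations)])
    alt = max_loglik(counts, durations)
    return 2.0 * (alt - null)


# Hypothetical event counts in two segments of equal duration.
flat = glr_statistic([10, 10], [1.0, 1.0])
stepped = glr_statistic([5, 15], [1.0, 1.0])
print(flat, stepped)
```

    A statistic near zero indicates that the partition explains no more than a constant rate does, while larger values favour the non-constant rate. Calibrating a threshold across many candidate partitions is the multiple testing problem the record addresses.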

  16. PROM7: 1D modeler of solar filaments or prominences

    NASA Astrophysics Data System (ADS)

    Gouttebroze, P.

    2018-05-01

    PROM7 is an update of PROM4 (ascl:1306.004) and computes simple models of solar prominences and filaments using Partial Radiative Distribution (PRD). The models consist of plane-parallel slabs standing vertically above the solar surface. Each model is defined by 5 parameters: temperature, density, geometrical thickness, microturbulent velocity and height above the solar surface. It solves the equations of radiative transfer, statistical equilibrium, ionization and pressure equilibria, and computes electron and hydrogen level populations and hydrogen line profiles. Moreover, the code treats the calcium atom, which is reduced to 3 ionization states (Ca I, Ca II, Ca III). The Ca II ion has 5 levels, which are useful for computing the 2 resonance lines (H and K) and the infrared triplet (to 8500 Å).

  17. Theory of Financial Risk and Derivative Pricing

    NASA Astrophysics Data System (ADS)

    Bouchaud, Jean-Philippe; Potters, Marc

    2009-01-01

    Foreword; Preface; 1. Probability theory: basic notions; 2. Maximum and addition of random variables; 3. Continuous time limit, Ito calculus and path integrals; 4. Analysis of empirical data; 5. Financial products and financial markets; 6. Statistics of real prices: basic results; 7. Non-linear correlations and volatility fluctuations; 8. Skewness and price-volatility correlations; 9. Cross-correlations; 10. Risk measures; 11. Extreme correlations and variety; 12. Optimal portfolios; 13. Futures and options: fundamental concepts; 14. Options: hedging and residual risk; 15. Options: the role of drift and correlations; 16. Options: the Black and Scholes model; 17. Options: some more specific problems; 18. Options: minimum variance Monte-Carlo; 19. The yield curve; 20. Simple mechanisms for anomalous price statistics; Index of most important symbols; Index.

  18. Theory of Financial Risk and Derivative Pricing - 2nd Edition

    NASA Astrophysics Data System (ADS)

    Bouchaud, Jean-Philippe; Potters, Marc

    2003-12-01

    Foreword; Preface; 1. Probability theory: basic notions; 2. Maximum and addition of random variables; 3. Continuous time limit, Ito calculus and path integrals; 4. Analysis of empirical data; 5. Financial products and financial markets; 6. Statistics of real prices: basic results; 7. Non-linear correlations and volatility fluctuations; 8. Skewness and price-volatility correlations; 9. Cross-correlations; 10. Risk measures; 11. Extreme correlations and variety; 12. Optimal portfolios; 13. Futures and options: fundamental concepts; 14. Options: hedging and residual risk; 15. Options: the role of drift and correlations; 16. Options: the Black and Scholes model; 17. Options: some more specific problems; 18. Options: minimum variance Monte-Carlo; 19. The yield curve; 20. Simple mechanisms for anomalous price statistics; Index of most important symbols; Index.

  19. Comparison of four modeling tools for the prediction of potential distribution for non-indigenous weeds in the United States

    USGS Publications Warehouse

    Magarey, Roger; Newton, Leslie; Hong, Seung C.; Takeuchi, Yu; Christie, Dave; Jarnevich, Catherine S.; Kohl, Lisa; Damus, Martin; Higgins, Steven I.; Miller, Leah; Castro, Karen; West, Amanda; Hastings, John; Cook, Gericke; Kartesz, John; Koop, Anthony

    2018-01-01

    This study compares four models for predicting the potential distribution of non-indigenous weed species in the conterminous U.S. The comparison focused on evaluating modeling tools and protocols as currently used for weed risk assessment or for predicting the potential distribution of invasive weeds. We used six weed species (three highly invasive and three less invasive non-indigenous species) that have been established in the U.S. for more than 75 years. The experiment involved providing non-U.S. location data to users familiar with one of the four evaluated techniques, who then developed predictive models that were applied to the United States without knowing the identity of the species or its U.S. distribution. We compared a simple GIS climate matching technique known as Proto3, a simple climate matching tool CLIMEX Match Climates, the correlative model MaxEnt, and a process model known as the Thornley Transport Resistance (TTR) model. Two experienced users ran each modeling tool except TTR, which had one user. Models were trained with global species distribution data excluding any U.S. data, and then were evaluated using the current known U.S. distribution. The influence of weed species identity and modeling tool on prevalence and sensitivity effects was compared using a generalized linear mixed model. Each modeling tool itself had a low statistical significance, while weed species alone accounted for 69.1 and 48.5% of the variance for prevalence and sensitivity, respectively. These results suggest that simple modeling tools might perform as well as complex ones in the case of predicting potential distribution for a weed not yet present in the United States. Considerations of model accuracy should also be balanced with those of reproducibility and ease of use. More important than the choice of modeling tool is the construction of robust protocols and testing both new and experienced users under blind test conditions that approximate operational conditions.

  20. Order Selection for General Expression of Nonlinear Autoregressive Model Based on Multivariate Stepwise Regression

    NASA Astrophysics Data System (ADS)

    Shi, Jinfei; Zhu, Songqing; Chen, Ruwen

    2017-12-01

    An order selection method based on multivariate stepwise regression is proposed for the General Expression of Nonlinear Autoregressive (GNAR) model, which converts the model order problem into variable selection for a multiple linear regression equation. The partial autocorrelation function is adopted to define the linear terms in the GNAR model. The result is set as the initial model, and then the nonlinear terms are introduced gradually. Statistics are chosen to assess how both the newly introduced and the existing variables improve the model, and these determine which variables to retain or eliminate. The optimal model is then obtained through a goodness-of-fit measure or a significance test. Simulation and experiments on classic time-series data show that the proposed method is simple and reliable and can be applied in practical engineering.
